soxoj / maigret

🕵️‍♂️ Collect a dossier on a person by username from thousands of sites
https://t.me/osint_maigret_bot
MIT License
10.26k stars 794 forks

Create a solution for SPAs #579

Open fen0s opened 2 years ago

fen0s commented 2 years ago

Currently Maigret may struggle with single-page applications (SPAs) because they rely on JavaScript to render their content. A possible solution: use browser automation tools (Selenium, Puppeteer) to get pages with their JS properly loaded.

pros:

  1. a proper browser experience, which can increase the accuracy of username search on certain websites
  2. could also solve another problem: bypassing Cloudflare

cons:

  1. heavy dependencies with a large install size, since a webdriver is required; perhaps make this dependency optional?
  2. painfully slow operation, so it can't be used for all sites, only for the top ones
  3. no asynchronous operation; could threading be used instead?
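
A rough sketch of how cons 1 and 3 might be handled together, assuming Selenium 4 with headless Chrome. This is not Maigret's actual code: `render_with_selenium` and `render_many` are hypothetical names, and the worker count is an arbitrary choice. Selenium is imported optionally, and the blocking browser call is pushed into a thread pool so that it can be awaited from asyncio code.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable

# Selenium is treated as an optional extra: everything else still works
# if it is not installed.
try:
    from selenium import webdriver
except ImportError:
    webdriver = None


def render_with_selenium(url: str) -> str:
    """Hypothetical blocking helper: open the page in headless Chrome and
    return the DOM as it is once driver.get() returns. Content that an SPA
    renders later may need explicit waits, which are omitted here."""
    if webdriver is None:
        raise RuntimeError("Selenium is not installed; browser rendering is unavailable")
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


async def render_many(
    urls: Iterable[str],
    render: Callable[[str], str] = render_with_selenium,
    max_workers: int = 2,
) -> Dict[str, str]:
    """Run the blocking render function in worker threads and await the results."""
    url_list = list(urls)
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [loop.run_in_executor(pool, render, url) for url in url_list]
        pages = await asyncio.gather(*futures)
    return dict(zip(url_list, pages))
```

Because the render function is a parameter, the same wrapper could be pointed at a different browser backend, or at a stub in tests. It does nothing about con 2: each page still costs a full browser start, so it would probably only make sense for a short list of top sites.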

soxoj commented 2 years ago

Can we explore existing solutions to that problem? E.g. https://github.com/qeeqbox/social-analyzer

fen0s commented 2 years ago

social-analyzer uses Selenium for slow scans too, as you can see here: https://github.com/qeeqbox/social-analyzer/blob/main/modules/slow-scan.js

The implementation there is pretty good, but it still has all the same pros and cons as any Selenium-based approach.