projectdiscovery / katana

A next-generation crawling and spidering framework.
MIT License

Katana Performs Poorly on Nuxt JS websites #579

Open mitchgreen opened 1 year ago

mitchgreen commented 1 year ago

katana version:

Katana Version 1.0.3

Current Behavior:

When Katana is run against a website built with the Nuxt JS framework, in either headless mode (-hl) or standard mode (without -hl), it performs poorly: it misses obviously linked pages and returns few or no meaningful results for the site.

Expected Behavior:

Katana should discover the linked pages on these sites. Examples of websites where I would expect it to perform better than it does, likely because they use Nuxt JS:

  1. gitlab.com
  2. cdnjs.com
  3. upwork.com
  4. openai.com

Steps To Reproduce:

katana.exe -u 'https://cdnjs.com' -hl

Anything else:

Nothing else to add at this time, will update with more information as it is available.

Mzack9999 commented 10 months ago

I think this is mostly related to the way nuxt.js renders the page (ref. https://stackoverflow.com/questions/48577766/nuxt-sites-not-getting-crawled). Apparently the DOM is not fully generated up front; parts of it are only generated after UI interactions (such as scroll, click, etc.), which katana does not currently support.
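
To illustrate the effect described above, here is a minimal sketch in Python (standard library only). It does not model katana's or Nuxt's internals. Both HTML snippets are hypothetical, made up for illustration, and extract_links is a hypothetical helper. The point is only that a crawler reading the DOM before any interaction happens sees far fewer links than one reading it afterwards.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags found in an HTML string."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html):
    """Return all anchor hrefs in document order (hypothetical helper)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical DOM before any UI interaction: an almost empty app shell.
INITIAL_HTML = """
<html><body>
  <div id="app"></div>
  <script src="/app.js"></script>
</body></html>
"""

# Hypothetical DOM after scripts ran and the page was scrolled or clicked.
RENDERED_HTML = """
<html><body>
  <div id="app">
    <a href="/libraries">Libraries</a>
    <a href="/about">About</a>
    <a href="/api">API</a>
  </div>
</body></html>
"""

if __name__ == "__main__":
    print("before interaction:", extract_links(INITIAL_HTML))
    print("after interaction: ", extract_links(RENDERED_HTML))
```

If this is what happens on the sites listed above, a crawler would need to trigger scroll or click events and re-read the DOM afterwards to find the missing links.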