Closed filipnyquist closed 9 months ago
@filipnyquist, the issue you're experiencing is likely due to the server struggling with multiple concurrent requests. When you set concurrency to 1, the server handles one request at a time, which is more manageable. To resolve this, you can increase the timeout value to give the server more time to respond to each request. This can be done by adjusting the -timeout flag; -timeout 15 worked for me. Remember, the optimal configuration depends on the specific server you're crawling, and you might need to experiment with different settings to find what works best. Let us know if this works for you or if you have any further questions.
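As a rough sketch of that suggestion, the reproduction command from the report could be re-run with the longer timeout. The URL and the -hl and -timeout flags come from this thread; the RUN_CRAWL guard variable is hypothetical and only there so the script prints the command without crawling unless asked to.

```shell
#!/bin/sh
# Sketch: re-run the reproduction with a longer per-request timeout.
# Assumptions: katana is on PATH; 15 seconds is the value reported to work above.
TARGET="https://ginandjuice.shop/catalog"
TIMEOUT=15

# Build the command once so it can be printed and, optionally, executed.
CMD="katana -u $TARGET -hl -timeout $TIMEOUT"
echo "$CMD"

# RUN_CRAWL is a hypothetical opt-in switch, not a katana feature.
# The crawl only starts when it is set to 1 and katana is installed.
if [ "${RUN_CRAWL:-0}" = "1" ] && command -v katana >/dev/null 2>&1; then
  $CMD
fi
```

Concurrency is left at its default here; if URLs are still missed, lowering it with -c as in the report is another setting to experiment with.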
Closing this. Feel free to reopen if the issue persists.
katana version:
[INF] Current katana version v1.0.2 (latest)
Current Behavior:
While scanning sites in headless mode with the default concurrency of 10, some URLs are not picked up from pages because the page isn't in "view"; React DOM, for example, only seems to render that content when the page is "viewed".
Expected Behavior:
In headless mode, the page should be fully viewed so that all links are rendered, and its functions should be allowed to run before moving on to the next URL.
Steps To Reproduce:
katana -u https://ginandjuice.shop/catalog -hl
(an example page which renders this with ReactDOM; see the attached code in "Anything else")
katana -u https://ginandjuice.shop/catalog -hl -c 1
Anything else:
The specific part of the code on the '/catalog' page that does not get picked up: