unclecode / crawl4ai

🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scrapper
Apache License 2.0

js_code execution doesn't happen before taking screenshot #193

Closed RohithReddy20 closed 4 weeks ago

RohithReddy20 commented 1 month ago

Hi UncleCode,

I noticed that in crawl4ai, the js_code execution doesn't seem to happen before the screenshot is taken. It looks like the screenshot is captured without waiting for the JavaScript code to run, which might affect the page state when the screenshot is taken.

Could you please check if there's a way to ensure the js_code runs first before the screenshot is captured?

Thanks!

unclecode commented 4 weeks ago

Hi @RohithReddy20, one thing to note: the JavaScript execution definitely starts before the screenshot is taken. But if the JavaScript is asynchronous, it may not have finished by then, so you need to pass a condition for the crawler to wait on. The parameter for that is wait_for.

You can, for example, pass a CSS selector (e.g. css:article#main), and the crawler waits until the matching element becomes visible. Or you can pass a JavaScript function that returns true or false (e.g. js:() => { return true or false }). Or you can use a hook, for example on_execution_started. In the hook you can inspect the page and wait for specific events, or simply add a delay of, say, five seconds, and then the screenshot will capture the whole thing.
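As a rough sketch of the three options above (not an official sample): the css: and js: prefixes, wait_for, js_code and the on_execution_started hook name come from this thread. Everything else is an assumption on my side: the helper names (wait_for_css, wait_for_js, build_crawl_kwargs, delay_hook, capture) are hypothetical, hooks are assumed to be registered through crawler.crawler_strategy.set_hook, the hook is assumed to receive a Playwright page, and result.screenshot is assumed to be a base64-encoded string. Check these against the version you have installed.

```python
import asyncio
import base64


def wait_for_css(selector: str) -> str:
    """Build a wait_for value for a CSS selector, using the css: prefix."""
    return f"css:{selector}"


def wait_for_js(expression: str) -> str:
    """Build a wait_for value from a JS expression that evaluates to true or false."""
    return f"js:() => {{ return {expression}; }}"


def build_crawl_kwargs(js_code: str, wait_for: str) -> dict:
    """Hypothetical helper: keyword arguments for a crawl that also takes a screenshot."""
    return {"js_code": js_code, "wait_for": wait_for, "screenshot": True}


async def delay_hook(page, *args, **kwargs):
    """Hook body: pause five seconds. Assumes `page` is a Playwright page object."""
    await page.wait_for_timeout(5000)
    return page


async def capture(url, js_code, wait_for, out_path="page.png", hook=None):
    # Imported lazily so the helpers above work without crawl4ai installed.
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        if hook is not None:
            # Assumption: hooks are registered on the crawler strategy object.
            crawler.crawler_strategy.set_hook("on_execution_started", hook)
        result = await crawler.arun(url=url, **build_crawl_kwargs(js_code, wait_for))

    # Assumption: the screenshot comes back as a base64-encoded image string.
    if result.screenshot:
        with open(out_path, "wb") as f:
            f.write(base64.b64decode(result.screenshot))
    return result


# Example call (needs crawl4ai and network access, so it is left commented out):
# asyncio.run(capture("https://example.com", "window.scrollTo(0, 1000);",
#                     wait_for_css("article#main")))
```

If the selector never appears, the wait will presumably time out, so the js: form or the delay hook (pass hook=delay_hook) may be the safer choice for pages without a stable element to wait on.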

Anyway, if you run into any problem, you can share the URL with me and I will create a code sample and share it back with you.