unclecode / crawl4ai

🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scrapper
Apache License 2.0
16.38k stars 1.2k forks

Error processing element: 'NoneType' object has no attribute 'name' #268

Closed Olliejp closed 2 days ago

Olliejp commented 1 week ago

I often see this error. See this example with the following site:

[LOG] 🚀 Crawling done for https://nbispa.com/, success: True, time taken: 5.12 seconds
Error processing element: 'NoneType' object has no attribute 'name'
[ERROR] 🚫 arun(): Failed to crawl https://nbispa.com/, error: Process HTML, Failed to extract content from the website: https://nbispa.com/, error: 'NoneType' object has no attribute 'find_all'

I have been trying to play around with my JS code and wait_for strategy but can't seem to find a clear reason for these errors. Do you have any ideas? Thanks, and thanks for releasing this!

from crawl4ai import AsyncWebCrawler


async def scrape_landing_page(url: str) -> str:
    async with AsyncWebCrawler(verbose=False, always_by_pass_cache=True) as crawler:
        result = await crawler.arun(
            url=url,
            magic=True,
            remove_overlay_elements=True,
            page_timeout=60000,
            delay_before_return_html=2.0,
            # Wait until the document has finished loading and <body> exists.
            wait_for="js:() => document.readyState === 'complete' && document.querySelector('body')",
            js_code=[
                # Dismiss the cookie banner, if present.
                """
                const cookieButton = document.querySelector('.cookie-accept');
                if (cookieButton) cookieButton.click();
                """,
                # Scroll to the bottom to trigger lazy-loaded content.
                "window.scrollTo(0, document.body.scrollHeight);",
                # Click "load more", if present.
                """
                var loadMoreButton = document.querySelector('.load-more');
                if (loadMoreButton) {
                    loadMoreButton.click();
                }
                """
            ]
        )
        return result

unclecode commented 2 days ago

@Olliejp Please refer to https://github.com/unclecode/crawl4ai/issues/266
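
For anyone landing here with the same message: `'NoneType' object has no attribute 'name'` is the generic Python `AttributeError` you get when code reads `.name` from a value that is `None`. The sketch below is not crawl4ai's code. `Node`, `collect_names_unsafe`, and `collect_names_safe` are hypothetical names made up for illustration. It only shows how this class of error can arise when a parsed tree contains a missing element, and how a `None` check avoids it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parsed HTML element (not a crawl4ai type)."""
    name: str
    children: List[Optional["Node"]] = field(default_factory=list)


def collect_names_unsafe(node: Node) -> List[str]:
    # Reads .name on every child without checking for None first.
    return [child.name for child in node.children]


def collect_names_safe(node: Optional[Node]) -> List[str]:
    # Skip missing nodes instead of dereferencing them.
    if node is None:
        return []
    return [child.name for child in node.children if child is not None]


# A tree with one missing child, as an assumed example of malformed input.
body = Node("body", children=[Node("div"), None, Node("p")])

try:
    collect_names_unsafe(body)
except AttributeError as exc:
    error_message = str(exc)  # same wording as the log line in this issue

safe_names = collect_names_safe(body)
```

Whether this is what happens inside the library for https://nbispa.com/ is not established in this thread; see the linked issue for the maintainer's answer.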