ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI
https://scrapegraphai.com
MIT License
14.5k stars · 1.18k forks

ValueError: No HTML body content found, please try setting the 'headless' flag to False in the graph configuration. (Urgent help wanted) #308

Closed MalakW closed 3 months ago

MalakW commented 3 months ago

Initially, it worked and provided output, but it has stopped working. I have been trying to resolve this error for three days. Despite using a VPN and adding money to OpenAI, the error persists.

[Screenshot: error traceback, 2024-05-27]
PeriniM commented 3 months ago

Hey @MalakW, the headless flag should not be inside the "browser" key in the graph configuration. Let me know if that fixes it.
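The suggestion above can be sketched as a before/after comparison. This is a minimal illustration, not the full configuration from the screenshot; the API key and model name are placeholders.

```python
# Minimal sketch of the suggested fix: "headless" belongs at the top
# level of the graph configuration, not nested under a "browser" key.
# The API key and model name below are placeholders.

# Configuration that reportedly triggers the error:
broken_config = {
    "llm": {"api_key": "<Your API KEY>", "model": "gpt-3.5-turbo"},
    "browser": {"headless": False},  # flag nested one level too deep
}

# Configuration with the flag where the library expects it:
fixed_config = {
    "llm": {"api_key": "<Your API KEY>", "model": "gpt-3.5-turbo"},
    "headless": False,  # top-level key
}

print("headless" in fixed_config)  # → True
```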

MalakW commented 3 months ago

the headless flag should not be inside the "browser" key in the graph configuration.

Like this, you mean?

[Screenshot: updated graph configuration]
MalakW commented 3 months ago

Hello, any help regarding this issue?

wangdongpeng1 commented 3 months ago

Take a look at this ^_^, hope it helps:

graph_config = {
    "llm": {
        "api_key": "<Your API KEY>",
        "model": "oneapi/qwen-turbo",
        "base_url": "http://127.0.0.1:13000/v1",
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": "http://127.0.0.1:11434",
    },
    "headless": False,  # top-level, not under a "browser" key
}
VinciGit00 commented 3 months ago

Hi @MalakW, the reason is that you have not installed Playwright. Take a look at this Colab to see how it is implemented: link
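If a missing Playwright install is the root cause, the standard setup (not shown in the thread; these are the usual Playwright commands) is:

```shell
# Install the Playwright Python package, then download the browser
# binaries it drives. Without the second step, headless scraping can
# fail because no browser executable is available.
pip install playwright
playwright install
```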

Pankti99 commented 3 months ago

Hey @MalakW You can try this.

import asyncio
import sys

from playwright.async_api import async_playwright
from scrapegraphai.graphs import SmartScraperGraph  # missing import in the original snippet

graph_config = {
    "llm": {
        "model_instance": llm_model_instance  # your LLM instance, defined elsewhere
    },
    "embeddings": {
        "model_instance": embedder_model_instance  # your embedder instance, defined elsewhere
    },
    "browser": {
        "headless": False
    }
}
def scrape_website(prompt, source):
    print(prompt, source)
    # Ensure the event loop policy is set correctly for Windows
    if sys.platform == "win32":
        asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())

    # Create the SmartScraperGraph instance
    smart_scraper_graph = SmartScraperGraph(
        prompt=prompt,
        source=source,
        config=graph_config
    )

    result = smart_scraper_graph.run()
    return result