ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI
https://scrapegraphai.com
MIT License

[Still present in latest version] AttributeError: 'FetchNode' object has no attribute 'update_state' #801

Open aleenprd opened 5 days ago

aleenprd commented 5 days ago

Describe the bug
I can't even run this example: https://github.com/ScrapeGraphAI/Scrapegraph-ai/blob/main/examples/openai/scrape_plain_text_openai.py

To Reproduce
Steps to reproduce the behavior:

  1. Clone the repo and try running the example or any text.

Levyathanus commented 4 days ago

Hello, I tried to reproduce your problem but couldn't. The issue was fixed by this commit: "add html source support for source". Please git pull and update to the latest version.

VinciGit00 commented 3 days ago

Thank you, @Levyathanus.

FayzulSaimun commented 3 days ago

I tried this commit and got the same error: AttributeError: 'FetchNode' object has no attribute 'update_state'

My code:

import os

from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "api_key": os.getenv("OPENAI_APIKEY"),
        "model": "openai/gpt-4o",
    },
    "html_mode": True,
    "verbose": True,
    "headless": False,
}

# Read the local HTML file and pass its content directly as the source
local_html_path = "test.html"
with open(local_html_path, 'r') as file:
    local_html_content = file.read()

smart_scraper_graph = SmartScraperGraph(
    prompt="""Extract the school information as FullName, Short Name, Public address, description, Zip, State, Street, Phone, Email, Website, and extract all available programs urls.
        Program urls should be full urls.""",
    source=local_html_content,
    config=graph_config
)

# Running the graph (not shown in the original snippet; presumably where the error surfaces)
result = smart_scraper_graph.run()

OS: Windows 10

matheus-rossi commented 1 day ago

I'm getting the same error here after upgrading from 1.20.x to 1.30.x.

Did you manage to solve it, @FayzulSaimun?

FayzulSaimun commented 1 day ago

No, it's still not working. I tried the beta version as well. @matheus-rossi

Levyathanus commented 1 day ago

Hello, please check out the exact commit of the fix and retry first, to see whether this is an update problem. You can do that by running: git checkout 5100fbb01746379395a3500eae7eeeb4870be373, or by using the latest version on the pre/beta branch. If it still doesn't work, could you please post the content of the test.html file used in your code?
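
To tell an update problem apart from a missing fix, it may help to look at what is actually installed. The following is only a rough sketch: the distribution name "scrapegraphai", the module path "scrapegraphai.nodes", and the assumption that update_state is defined as a method on the class are guesses based on this thread, and describe_install is a hypothetical helper, not part of the library.

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version


def describe_install(dist_name, module_path, class_name, attr_name):
    """Return (installed_version, has_attr); either value is None if its lookup fails."""
    try:
        installed = version(dist_name)
    except PackageNotFoundError:
        installed = None  # distribution is not installed in this environment
    try:
        cls = getattr(import_module(module_path), class_name)
    except (ImportError, AttributeError):
        return installed, None  # module or class could not be found
    return installed, hasattr(cls, attr_name)


if __name__ == "__main__":
    # Assumed names, taken from the error message in this thread
    print(describe_install("scrapegraphai", "scrapegraphai.nodes", "FetchNode", "update_state"))
```

If the second value comes back False, the environment is most likely still running a build without the fix, whatever the version number says.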

matheus-rossi commented 1 day ago

I've tried every release in the v1.29.x and v1.30.x series, and none of them includes this fix.

So I took the code from that specific commit and manually merged it into my local environment, which fixed the problem.

Basically, the latest released versions do not contain this fix, which causes some confusion.
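
For anyone who wants to confirm this on their own clone, one possible check is to ask git whether a release tag contains the fix commit. This is a sketch only: it relies on git merge-base --is-ancestor (exit code 0 means the commit is contained, 1 means it is not), the two helper functions are hypothetical, and any tag name passed in, such as "v1.30.0", is an assumption about how releases are tagged.

```python
import subprocess


def ancestor_check_command(commit, ref):
    """Build the git command that tests whether `commit` is an ancestor of `ref`."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]


def release_contains_commit(repo_dir, commit, ref):
    """Return True if `ref` contains `commit`, False if not; raise on other git errors."""
    result = subprocess.run(
        ancestor_check_command(commit, ref),
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    if result.returncode in (0, 1):
        return result.returncode == 0
    # Any other exit code means git itself failed (unknown ref, not a repository, ...)
    raise RuntimeError(result.stderr.strip())
```

Calling release_contains_commit(".", "5100fbb01746379395a3500eae7eeeb4870be373", "v1.30.0") from inside a clone would then return False for any release that was cut without the fix.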

VinciGit00 commented 1 day ago

Hi @matheus-rossi, can you please add it and open a PR?