ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI
https://scrapegraphai.com
MIT License

AttributeError: 'FetchNode' object has no attribute 'update_state' #762

Open MahdiSepiRashidi opened 1 month ago

MahdiSepiRashidi commented 1 month ago

Describe the bug

Neither the FetchNode class in "scrapegraphai\nodes\fetch_node.py" nor its parent class BaseNode defines a method named update_state(), yet it is called in both FetchNode.handle_local_source() and FetchNode.handle_web_source().

Full error:

File "scrapegraphai\nodes\fetch_node.py", line 233, in handle_local_source
    return self.update_state(state, compressed_document)
           ^^^^^^^^^^^^^^^^^
AttributeError: 'FetchNode' object has no attribute 'update_state'
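One way to confirm that a method is missing from a class and from all of its parents is to walk the method resolution order. The sketch below uses hypothetical stand-in classes (not the real scrapegraphai ones) that mirror the situation described above; the helper name defining_classes is also an assumption made for illustration.

```python
class BaseNode:
    """Stand-in for the parent class; it defines no update_state method."""

    def __init__(self, output):
        self.output = output


class FetchNode(BaseNode):
    """Stand-in for the child class that calls the missing method."""

    def handle_local_source(self, state, compressed_document):
        # Fails: update_state is not defined anywhere in the hierarchy.
        return self.update_state(state, compressed_document)


def defining_classes(cls, name):
    """Return the names of the classes in cls's MRO that define `name`."""
    return [c.__name__ for c in cls.__mro__ if name in vars(c)]


# Lists which classes (if any) define each method.
print(defining_classes(FetchNode, "handle_local_source"))
print(defining_classes(FetchNode, "update_state"))

# Calling the method reproduces the same kind of AttributeError.
try:
    FetchNode(output=["doc"]).handle_local_source({}, ["some text"])
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

An empty list for update_state means the call can only fail at runtime, which matches the traceback above.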


VinciGit00 commented 1 month ago

Please share me the code

MahdiSepiRashidi commented 1 month ago

Please share me the code

from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "model": "ollama/llama3.1",
        "temperature": 0,
        "format": "json",  # Ollama needs the format to be specified explicitly
        "base_url": "http://localhost:11434",  # set Ollama URL
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": "http://localhost:11434",  # set Ollama URL
    },
    "verbose": True,
}

string = "any str"

user_prompt = "get all products in this"

# Create a SmartScraperGraph object
smart_scraper_graph = SmartScraperGraph(
    prompt=user_prompt,
    source=string,
    config=graph_config,
)

result = smart_scraper_graph.run()
print(result)

VinciGit00 commented 1 month ago

ok you should try with the new scripts. It is outdated

jrc commented 1 month ago

This is the relevant example code: https://github.com/ScrapeGraphAI/Scrapegraph-ai/blob/main/examples/local_models/smart_scraper_ollama.py

MahdiSepiRashidi commented 1 month ago

ok you should try with the new scripts. It is outdated

Thanks for the answer. But I do not understand what you mean by "new scripts", since the latest version on GitHub (https://github.com/ScrapeGraphAI/Scrapegraph-ai/blob/main/scrapegraphai/nodes/fetch_node.py) also does not seem to have any method named FetchNode.update_state(). I managed to fix the error by slightly changing the method FetchNode.handle_local_source() in https://github.com/ScrapeGraphAI/Scrapegraph-ai/blob/main/scrapegraphai/nodes/fetch_node.py. I assumed that handle_web_source works properly and mimicked its way of updating the "state". So instead of return self.update_state(state, compressed_document) I used state.update({self.output[0]: compressed_document,}) and the problem was fixed.
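The workaround described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's actual code: FetchNodeSketch is a hypothetical stand-in for FetchNode, the state is assumed to be a plain dict, and the explicit return of the state is an assumption about what the caller expects.

```python
class FetchNodeSketch:
    """Hypothetical stand-in, not the real scrapegraphai FetchNode."""

    def __init__(self, output):
        # Names of the state keys this node writes to.
        self.output = output

    def handle_local_source(self, state, compressed_document):
        # Workaround: write into the state dict directly instead of
        # calling the undefined self.update_state(...).
        state.update({self.output[0]: compressed_document})
        # Assumption: the caller expects the updated state back.
        return state


node = FetchNodeSketch(output=["doc"])
state = node.handle_local_source({"user_prompt": "get all products"}, ["any str"])
print(state)
```

Note that dict.update() returns None, so a method that replaces the original return statement with state.update(...) alone would return None unless it also returns the state, as the sketch does.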

VinciGit00 commented 1 month ago

OK, can you please make the PR?

MahdiSepiRashidi commented 1 month ago

OK, can you please make the PR?

My pleasure. I have the PR ready and only need permission to publish it to the repository.

YohannBlack commented 4 weeks ago

Hi, has any solution been found for this issue? I am having the same problem when trying to scrape certain websites.

VinciGit00 commented 3 weeks ago

@MahdiSepiRashidi you have to open the pull request against the ScrapeGraphAI org repository; I have not seen it.

matheus-rossi commented 1 week ago

scrapegraphai = "v1.30.0-beta.4"

from scrapegraphai.graphs import SmartScraperGraph

config = {
    "llm": {
        "api_key": settings.ANTHROPIC_API_KEY,
        "model": "anthropic/claude-3-5-haiku-20241022",
        "timeout": 60,
        "temperature": 0,
        "max_tokens": 8192,
    },
    "verbose": verbose,
}
response = SmartScraperGraph(
    prompt="MY_PROMPT",
    source=cleaned_html,  # my HTML goes here
    config=config,
).run()

This gives me the following error:

ERROR | src.utils.aiutils:_scrape_content:284 - 'FetchNode' object has no attribute 'update_state'