ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI
https://scrapegraphai.com
MIT License
14.41k stars 1.17k forks

I am getting the error below while running with an Ollama model #616

Closed MSR2201 closed 5 days ago

MSR2201 commented 2 weeks ago

Traceback (most recent call last):
  File "D:\Five minutes\firecrawl\scrapegraph\app1.py", line 1, in <module>
    from scrapegraphai.graphs import SmartScraperGraph
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\graphs\__init__.py", line 5, in <module>
    from .abstract_graph import AbstractGraph
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\graphs\abstract_graph.py", line 16, in <module>
    from ..utils.logging import set_verbosity_warning, set_verbosity_info
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\utils\__init__.py", line 13, in <module>
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\utils\convert_to_md.py", line 5, in <module>
    from html2text import HTML2Text
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\html2text\__init__.py", line 11, in <module>
    from . import config
ImportError: cannot import name 'config' from partially initialized module 'html2text' (most likely due to a circular import) (D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\html2text\__init__.py)

Ignore the name of the folder. The error is persistent. What is happening?

VinciGit00 commented 2 weeks ago

OK, can you show the code, please?

MSR2201 commented 2 weeks ago

from scrapegraphai.graphs import SmartScraperGraph
from scrapegraphai.utils import prettify_exec_info

graph_config = {
    "llm": {
        "model": "ollama/gemma2:2b",
        "temperature": 1,
        "format": "json",  # Ollama needs the format to be specified explicitly
        "model_tokens": 100,  # depending on the model set context length
        "base_url": "http://localhost:11434",  # set ollama URL of the local host (YOU CAN CHANGE IT, if you have a different endpoint)
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "temperature": 0,
        "base_url": "http://localhost:11434",  # set ollama URL
    }
}

smart_scraper_graph = SmartScraperGraph(
    prompt="List me all the projects with their description.",
    # also accepts a string with the already downloaded HTML code
    source="https://perinim.github.io/projects",
    config=graph_config
)

result = smart_scraper_graph.run()
print(result)

This is the code I used.

VinciGit00 commented 2 weeks ago

Look at the new examples

MSR2201 commented 2 weeks ago

Getting the same error:

Traceback (most recent call last):
  File "D:\Five minutes\firecrawl\scrapegraph\app.py", line 4, in <module>
    from scrapegraphai.graphs import SmartScraperGraph
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\graphs\__init__.py", line 5, in <module>
    from .abstract_graph import AbstractGraph
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\graphs\abstract_graph.py", line 16, in <module>
    from ..utils.logging import set_verbosity_warning, set_verbosity_info
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\utils\__init__.py", line 13, in <module>
    from .convert_to_md import convert_to_md
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\scrapegraphai\utils\convert_to_md.py", line 5, in <module>
    import html2text
  File "D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\html2text\__init__.py", line 11, in <module>
    from . import config
ImportError: cannot import name 'config' from partially initialized module 'html2text' (most likely due to a circular import) (D:\Five minutes\firecrawl\scrapegraph\myenv\lib\site-packages\html2text\__init__.py)

**when i run the basic code without changing the code**

import json
from typing import List
from langchain_core.pydantic_v1 import BaseModel, Field
from scrapegraphai.graphs import SmartScraperGraph
from scrapegraphai.utils import prettify_exec_info

class Project(BaseModel):
    title: str = Field(description="The title of the project")
    description: str = Field(description="The description of the project")

class Projects(BaseModel):
    projects: List[Project]

graph_config = {
    "llm": {
        "model": "ollama/gemma2:2b",
        "temperature": 0,
        "format": "json",  # Ollama needs the format to be specified explicitly
        # "base_url": "http://localhost:11434",  # set ollama URL arbitrarily
    },
    "verbose": True,
    "headless": False
}

smart_scraper_graph = SmartScraperGraph(
    prompt="List me all the projects with their description",
    source="https://perinim.github.io/projects/",
    schema=Projects,
    config=graph_config
)

result = smart_scraper_graph.run()
print(json.dumps(result, indent=4))

VinciGit00 commented 2 weeks ago

OK, what is your config? Can you try using llama3?

MSR2201 commented 1 week ago

My laptop is CPU-based, but that should not be a problem with gemma. llama is taking too much space.

VinciGit00 commented 1 week ago

ok please update
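
Both tracebacks in this thread stop inside html2text's own `__init__.py`, on `from . import config`, before any Ollama setting is read. The thread does not establish the cause, so the following is only a minimal diagnostic sketch, not a fix proposed by the maintainers. It finds where a package resolves from without importing it and checks whether a given submodule file sits next to it. The helper name `inspect_package` is hypothetical, and the demo call uses the standard-library `json` package so it runs anywhere.

```python
import importlib.util
from pathlib import Path


def inspect_package(package_name, expected_submodule):
    """Locate a top-level package without importing it and look for one submodule file.

    Returns a dict with the spec origin, the resolved package directory
    (None if the name is missing or is not a package), and whether
    `<expected_submodule>.py` exists inside that directory.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        # Either nothing is installed under this name, or the name resolves
        # to a single-file module rather than a package directory.
        origin = spec.origin if spec is not None else None
        return {"origin": origin, "package_dir": None, "has_submodule": False}

    package_dir = Path(next(iter(spec.submodule_search_locations)))
    submodule_file = package_dir / f"{expected_submodule}.py"
    return {
        "origin": spec.origin,
        "package_dir": str(package_dir),
        "has_submodule": submodule_file.is_file(),
    }


if __name__ == "__main__":
    # For the case in this thread, the call would be
    # inspect_package("html2text", "config"), run inside the same virtualenv.
    print(inspect_package("json", "decoder"))
```

If `package_dir` comes back as `None`, or `has_submodule` is `False`, the name is not resolving to a complete package in that environment. That would be consistent with the `ImportError` above, though it does not by itself prove the cause.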