ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI
https://scrapegraphai.com
MIT License
15.79k stars 1.29k forks

looking for sources of each item #803

Closed: silgon closed this issue 1 day ago

silgon commented 1 day ago

I was testing the search with a schema example, and I wanted to know the source for each one of the results. I figured it might be as simple as changing the Dish class to:

class Dish(BaseModel):
    name: str = Field(description="The name of the dish")
    description: str = Field(description="The description of the dish")
    source: str = Field(description="the url of the source")

However, this does not work: the source field sometimes gives me NA and sometimes something only partially informative. Is there any way to get the source for each individual item in the results, instead of only at the end of the result variable?

VinciGit00 commented 1 day ago

Can you show me the code?

silgon commented 1 day ago

Sure. I slightly modified the example, but it's just the env variables, as you will see.

import os
from typing import List
from pydantic import BaseModel, Field
from scrapegraphai.graphs import SearchGraph

class Dish(BaseModel):
    name: str = Field(description="The name of the dish")
    description: str = Field(description="The description of the dish")
    source: str = Field(description="the url of the source")

class Dishes(BaseModel):
    dishes: List[Dish]

# ************************************************
# Define the configuration for the graph
# ************************************************

openai_key = os.getenv("OPENAI_API_KEY")

graph_config = {
    "llm": {
        "api_key": openai_key,
        "model": "openai/gpt-4o-mini"
    },
    "max_results": 5,
    "verbose": True,
}

# ************************************************
# Create the SearchGraph instance and run it
# ************************************************

search_graph = SearchGraph(
    prompt="List me Chioggia's famous dishes",
    config=graph_config,
    schema=Dishes
)

result = search_graph.run()
print(result)

This is an example of the result variable: [image]. BTW, I'm using version 1.27.0.
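A quick way to see which entries came back without a usable source is to filter the result on the client side. The following is a minimal sketch, not part of ScrapeGraphAI: `has_http_url` and `entries_without_source` are hypothetical helpers, and the code assumes `result` is a dict with a `dishes` list shaped like the `Dishes` schema above. The sample data is hand-written for illustration, not real output.

```python
from urllib.parse import urlparse


def has_http_url(value):
    """Return True if value looks like an absolute http(s) URL."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def entries_without_source(result):
    """Hypothetical helper: names of dishes whose 'source' is not a URL."""
    return [
        dish.get("name")
        for dish in result.get("dishes", [])
        if not has_http_url(dish.get("source"))
    ]


# Hand-written sample shaped like the Dishes schema (not real output).
sample = {
    "dishes": [
        {"name": "Dish A", "description": "...", "source": "NA"},
        {"name": "Dish B", "description": "...", "source": "https://example.com/b"},
    ]
}
print(entries_without_source(sample))
```

Calling `entries_without_source(result)` right after `search_graph.run()` would list the dishes whose source needs a second look.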

madguy02 commented 1 day ago

@silgon i tried your code and got this response: {'dishes': [{'name': 'Sardines in Saòre', 'description': "Sardines fried and left in a 'carpione' marinade using chioggian white onions.", 'source': 'https://www.visitchioggia.com/en/taste/chioggian-cuisine/sardines-in-saor/'}, {'name': 'Bigoli in salsa', 'description': 'A traditional pasta dish served with a sauce made from anchovies and onions.', 'source': 'https://www.visitchioggia.com/en/taste/chioggian-cuisine/bigoi-in-salsa/'}, {'name': 'Stewed cuttlefish', 'description': 'Cuttlefish cooked in ink or stewed, a popular seafood dish.', 'source': 'https://www.visitchioggia.com/en/taste/chioggian-cuisine/stewed-cuttlefish/'}, {'name': 'Moeche frite', 'description': 'Crispy molting crabs, a delicacy in Chioggian cuisine.', 'source': 'https://www.visitchioggia.com/en/taste/chioggian-cuisine/moeche-frite/'}, {'name': 'Seafood risotto', 'description': 'A creamy risotto made with fresh seafood, depending on the catch of the day.', 'source': 'https://www.visitchioggia.com/en/taste/chioggian-cuisine/seafood-risotto/'}, {'name': 'Cicchetti', 'description': 'Small snacks or side dishes typically served in bars in Venice and the surrounding areas, enjoyed over a casual lunch.', 'source': 'https://www.mykindofitaly.com'}, {'name': 'Gran Saòr', 'description': 'A classic Veneto preparation for seafood that includes sole, mantis shrimp, scallop, squid, and sardine, served with grilled white polenta.', 'source': 'https://www.mykindofitaly.com'}, {'name': 'Tortelloni', 'description': 'Pasta stuffed with various shrimps, scallops, and local radicchio.', 'source': 'https://www.mykindofitaly.com'}, {'name': 'Mixed Grilled Seafood', 'description': 'A dish featuring a variety of local crustaceans, grilled to perfection.', 'source': 'https://www.mykindofitaly.com'}, {'name': 'Cannolicchi', 'description': 'A type of clam that is often served fresh and can be quite lively when served.', 'source': 'https://www.mykindofitaly.com'}, {'name': 
'Fritto misto', 'description': 'A mixed fried seafood dish, popular in coastal regions.', 'source': 'https://www.tasteatlas.com/fritto-misto'}, {'name': 'Baccalà mantecato', 'description': 'A creamy spread made from salted cod, typical of Venetian cuisine.', 'source': 'https://www.tasteatlas.com/baccala-mantecato'}, {'name': 'Fritto misto di pesce', 'description': 'A variation of fritto misto, specifically made with fish.', 'source': 'https://www.tasteatlas.com/fritto-misto-di-pesce'}, {'name': "Linguine all'astice", 'description': 'Linguine pasta served with lobster, a luxurious seafood dish.', 'source': 'https://www.tasteatlas.com/linguine-allastice'}, {'name': 'Spaghetti alle vongole', 'description': 'Spaghetti served with clams, a classic Italian seafood pasta dish.', 'source': 'https://www.tasteatlas.com/spaghetti-alle-vongole'}, {'name': 'Boboli de vida', 'description': 'Snails with olive oil and parsley.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Granseole', 'description': 'Boiled crab, seasoned with olive oil, lemon and spices.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Sardele salae', 'description': 'Raw sardines or anchovies preserved in layers of salt and served with olive oil.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Bibarasse in cassopipa', 'description': 'Clams cooked with onions.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Baked scallops', 'description': 'Cooked with garlic and parsley in the shell with the addition of brandy.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Broeto', 'description': 'Slices of different kind of cooked fish or shellfish in a sauce of oil, onion and vinegar. 
Served with croutons.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Risoto de sepe', 'description': 'Rice with fried or boiled cuttlefish, with the addition of oil and parsley.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Spagheti co le bibarasse', 'description': 'Spaghetti served with clams.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Risoto a la pescatora', 'description': 'Rice with pieces of fish cooked in their sauce.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Risoto a la ciosota', 'description': 'Rice cooked in a sauce of various specialties of fried and boiled fish with garlic, Parmesan and white wine.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Bisato in tecia', 'description': 'Eel in tomato sauce and white wine.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Sepe nere', 'description': 'Cuttlefish, boiled in a mixture of onion and garlic, with the addition of white wine, tomatoes and spices.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Sardele in saore', 'description': 'Fried sardines or anchovies preserved in a sauce of onions and vinegar.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Moleche frite', 'description': 'Crabs fried in abundant oil.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Risi e vuovi', 'description': 'Cuttlefish eggs boiled and seasoned with olive oil, vinegar and spices.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Sepe in umido', 'description': 'Cuttlefish in tomato sauce and spices.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Pesse rosto incovercià', 'description': 'Many specialty of fish roasted over coals and heated in a covered pan with olive oil, vinegar, wine and garlic.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, 
{'name': 'Radicchio rosso', 'description': 'Can be served with oil and salt; barbecued, or fried.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Papini', 'description': 'Thin and hard donuts typical of the Easter.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Sugoli', 'description': 'Cream made of black grapes and flour.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Smegiassa', 'description': 'Cake made with a mixture of black honey, flour, pumpkin, raisins, pine nuts and sugar.', 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}, {'name': 'Bossolà', 'description': "Typical speciality recognized as 'bread of Chioggia', shaped as a ring, fragrant and crisp, easy to maintain.", 'source': 'http://www.sottomarina.net/gastronomia_uk.htm'}], 'sources': ['https://www.visitchioggia.com/en/taste/chioggian-cuisine/', 'https://www.visitchioggia.com/en/taste/', 'https://www.mykindofitaly.com/post/chioggia-authentic-italian-experience', 'https://www.tasteatlas.com/chioggia', 'https://www.sottomarina.net/gastronomia_uk.htm']}
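With a per-item `source` field populated as in the response above, the dishes can be grouped by the page they came from. This is a small sketch under the assumption that the result is a plain dict with a `dishes` list; `group_by_source` is a hypothetical helper, and the sample reuses three entries from the response.

```python
from collections import defaultdict


def group_by_source(result):
    """Hypothetical helper: map each source URL to the dish names found there."""
    grouped = defaultdict(list)
    for dish in result.get("dishes", []):
        # Entries with no 'source' key are collected under "unknown".
        grouped[dish.get("source", "unknown")].append(dish["name"])
    return dict(grouped)


# Three entries copied from the response above (descriptions omitted).
sample = {
    "dishes": [
        {"name": "Fritto misto", "source": "https://www.tasteatlas.com/fritto-misto"},
        {"name": "Boboli de vida", "source": "http://www.sottomarina.net/gastronomia_uk.htm"},
        {"name": "Granseole", "source": "http://www.sottomarina.net/gastronomia_uk.htm"},
    ]
}

for url, names in group_by_source(sample).items():
    print(url, names)
```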

The problem you are seeing is because you are using version 1.27.0. Please use the latest version from here: https://github.com/ScrapeGraphAI/Scrapegraph-ai/releases, for example: pip install scrapegraphai==1.30.0-beta.4
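To confirm which version is actually installed before re-running the example, something like the following could be used. It is a sketch using only the standard library; `release_tuple` and `installed_version` are hypothetical helpers. The 1.30.0 threshold is an assumption taken from this thread, which only says that 1.27.0 showed the problem and 1.30.0-beta.4 did not. The comparison ignores pre-release suffixes such as `-beta.4`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Parse the leading numeric part, e.g. '1.30.0-beta.4' -> (1, 30, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


current = installed_version("scrapegraphai")
if current is None:
    print("scrapegraphai is not installed")
elif release_tuple(current) < release_tuple("1.30.0"):
    # Threshold assumed from the thread, not from the release notes.
    print(f"scrapegraphai {current} is older than 1.30.0; consider upgrading")
else:
    print(f"scrapegraphai {current}")
```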

silgon commented 1 day ago

@madguy02 Thanks! You are completely correct. 🎉