ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI
https://scrapegraphai.com
MIT License
13.25k stars, 1.01k forks

Adding message parameter support for OpenAI models #382

Open Hkllopp opened 1 month ago

Hkllopp commented 1 month ago

OpenAI models use the messages parameter for the prompt. ScrapeGraphAI also uses this parameter, filling it from the prompt argument passed when the scraper is invoked. However, with OpenAI models we sometimes need to send several messages (for example, a system prompt plus a user prompt) to better guide the response (like in this article and this documentation).

Would it be possible to replace the standard ScrapeGraphAI prompt when a messages argument is provided in graph_config?

Example:

```python
import os
from scrapegraphai.graphs import SmartScraperGraph

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

system_prompt = "Provide output in valid JSON."
user_prompt = "List me all the news article with a brief description for each one."

graph_config = {
    "llm": {
        "api_key": OPENAI_API_KEY,
        "model": "gpt-3.5-turbo",
        "response_format": {"type": "json_object"},
        "seed": 0,
        "temperature": 0,
        # Custom messages passed alongside the usual prompt argument
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    },
    "verbose": True,
}

smart_scraper_graph = SmartScraperGraph(
    prompt=user_prompt,
    # also accepts a string with the already downloaded HTML code
    source="https://perinim.github.io/projects",
    config=graph_config,
)

# Running the graph raises:
# TypeError: openai.resources.chat.completions.Completions.create() got multiple values for keyword argument 'messages'
result = smart_scraper_graph.run()
```

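The error text suggests that `messages` reaches the completion call twice: once from the `llm` config and once from the library's own prompt. The sketch below reproduces that collision in plain Python and shows one possible way to merge the two. It does not use the ScrapeGraphAI or OpenAI code; `create` is a stand-in function and `merge_messages` is a hypothetical helper, so treat this as an illustration of the mechanism rather than the library's actual behavior.

```python
def create(*, model, messages, **kwargs):
    """Stand-in for a chat-completion call that takes `messages` as a keyword."""
    return {"model": model, "messages": messages, **kwargs}


# Hypothetical user config that already carries its own `messages` entry.
llm_config = {
    "model": "gpt-3.5-turbo",
    "temperature": 0,
    "messages": [{"role": "system", "content": "Provide output in valid JSON."}],
}

# What a library might build internally from its `prompt` argument (assumed).
library_messages = [{"role": "user", "content": "List all the news articles."}]

# Passing `messages` explicitly AND through **llm_config triggers the TypeError.
try:
    create(messages=library_messages, **llm_config)
    error_text = ""
except TypeError as exc:
    error_text = str(exc)


def merge_messages(config, extra_messages):
    """Hypothetical fix: pull `messages` out of the config and keep only its
    system entries, placed before the library-built messages."""
    remaining = dict(config)  # copy so the caller's config is left untouched
    user_supplied = remaining.pop("messages", [])
    system_part = [m for m in user_supplied if m["role"] == "system"]
    return remaining, system_part + extra_messages


clean_config, merged = merge_messages(llm_config, library_messages)
response = create(messages=merged, **clean_config)
print(error_text)
print([m["role"] for m in response["messages"]])
```

Keeping only the system entries is a design choice made for this sketch; a real implementation might instead let the user-supplied messages fully replace the default prompt, which is what the request above asks for.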
VinciGit00 commented 1 month ago

OK, it could be an idea, but please give me a use case for the system prompt. The output is already in JSON format.

ehecatl commented 1 month ago

Well, another example: users who want to use Groq with llama3-70b need to mention the word JSON in the system prompt message; it's mandatory.

VinciGit00 commented 1 month ago

Even if the output format is already JSON?

ehecatl commented 4 weeks ago

Yes, according to the Groq playground, the system message should contain the word JSON.
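As a rough illustration of that requirement, the sketch below makes sure a chat message list carries a system message mentioning JSON before it is sent. The helper name `with_json_hint` and the hint wording are assumptions made for this example; nothing here calls Groq or ScrapeGraphAI.

```python
def with_json_hint(messages, hint="Respond in valid JSON."):
    """Return a copy of `messages` whose system message mentions JSON.

    Hypothetical helper: if the first system message lacks the word, the hint
    is appended to it; if there is no system message, one is prepended.
    """
    copied = [dict(m) for m in messages]  # do not mutate the caller's list
    for message in copied:
        if message["role"] == "system":
            if "json" not in message["content"].lower():
                message["content"] = f"{message['content']} {hint}".strip()
            return copied
    return [{"role": "system", "content": hint}] + copied


user_only = [{"role": "user", "content": "List all the news articles."}]
with_system = [
    {"role": "system", "content": "You are a web scraper."},
    {"role": "user", "content": "List all the news articles."},
]

prepared_a = with_json_hint(user_only)
prepared_b = with_json_hint(with_system)
print(prepared_a[0]["content"])
print(prepared_b[0]["content"])
```

The check is case-insensitive and only looks at the first system message, which is enough for the single-system-prompt case discussed in this thread.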

VinciGit00 commented 4 weeks ago

Give me an example, please.