ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI
https://scrapegraphai.com
MIT License

Support for LocalAI #202

Closed. mudler closed this issue 6 months ago.

mudler commented 6 months ago

Hey :wave: LocalAI (https://github.com/mudler/LocalAI) author here - nice project!

Is your feature request related to a problem? Please describe.
I'd like to run this locally with LocalAI - only Ollama seems to be supported.

Describe the solution you'd like
LocalAI provides a drop-in API compatible with OpenAI - the only requirement is being able to specify a base API URL for the client to hit. If Scrapegraph could let the user specify a base URL for OpenAI, that would be enough to make it compatible with LocalAI.

Describe alternatives you've considered
N/A

Additional context
n/a
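
To make the drop-in compatibility concrete, here is a minimal sketch (not part of the original report) that points the official openai Python client at a LocalAI instance; the localhost:8080 address, the model name, and the placeholder key are assumptions:

from openai import OpenAI

# Assumption: LocalAI is running locally and serving its OpenAI-compatible API on port 8080.
client = OpenAI(
    base_url="http://localhost:8080/v1",  # LocalAI endpoint instead of api.openai.com
    api_key="sk-not-needed",              # placeholder; LocalAI ignores it unless keys are configured
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # whatever model name is configured in LocalAI
    messages=[{"role": "user", "content": "Hello from ScrapeGraphAI"}],
)
print(response.choices[0].message.content)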

VinciGit00 commented 6 months ago

Try changing the endpoint of the local model to the port LocalAI is listening on.

PeriniM commented 6 months ago

Hey @mudler, really cool project! I saw in the OpenAI class definition that the openai_api_base param is available.

[screenshot: the OpenAI class definition showing the openai_api_base parameter]

You could try adding it to the graph config like:

graph_config = {
    "llm": {
        "api_key": openai_key,
        "model": "gpt-3.5-turbo",
        "openai_api_base": "local-ai:8080",
    },
    "verbose": True,
}
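
For context, here is a minimal sketch of what that parameter does in the underlying LangChain client (assuming the langchain-openai package and a LocalAI instance on localhost:8080; whether the /v1 suffix is needed depends on the client version):

from langchain_openai import ChatOpenAI  # assumption: langchain-openai is installed

openai_key = "sk-not-needed"  # placeholder; LocalAI ignores it unless keys are configured

llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    openai_api_key=openai_key,                   # still required by the constructor, not actually used
    openai_api_base="http://localhost:8080/v1",  # LocalAI endpoint instead of api.openai.com
)
print(llm.invoke("ping").content)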

Another option is creating the standard custom LLM interface compatible with LangChain (Reddit - CustomLLM).
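
For reference, a bare-bones sketch of that custom-LLM route (assuming langchain-core plus the requests library; the LocalAICompletions class name and the /v1/completions endpoint are illustrative, not taken from either project):

from typing import Any, List, Optional

import requests
from langchain_core.language_models.llms import LLM


class LocalAICompletions(LLM):
    """Illustrative custom LLM that forwards prompts to a LocalAI completion endpoint."""

    base_url: str = "http://localhost:8080"
    model: str = "gpt-3.5-turbo"

    @property
    def _llm_type(self) -> str:
        return "localai"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Assumption: LocalAI exposes the OpenAI-style /v1/completions route.
        resp = requests.post(
            f"{self.base_url}/v1/completions",
            json={"model": self.model, "prompt": prompt, "stop": stop},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["text"]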

If you want to collaborate send a DM!

PeriniM commented 6 months ago

@mudler Ok, I got it working by setting "openai_api_base": "http://localhost:8080". The only problem is that it still requires a valid OpenAI API key to be passed to the constructor, even though it is not used. Checking the LocalAI endpoints, I noticed they are the same as Ollama's, so I guess you could try to use:

graph_config = {
    "llm": {
        "model": "ollama/gpt-4",
        "temperature": 0,
        # "format": "json",  # Ollama needs the format to be specified explicitly
        "base_url": "http://localhost:8080", # set ollama URL arbitrarily
    },
    "embeddings": {
        "model": "ollama/text-embedding-ada-002",
        "temperature": 0,
        "base_url": "http://localhost:8080",  # set ollama URL arbitrarily
    },
    "verbose": True,
}
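
As a quick sanity check before wiring this into the graph, you can confirm that LocalAI is answering on that base_url (this assumes it exposes the OpenAI-compatible /v1/models route and is reachable on localhost:8080):

import requests

# Assumption: LocalAI is reachable on localhost:8080.
resp = requests.get("http://localhost:8080/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])  # model names the server will accept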

Right now you can't specify a GPT name inside ollama/, so if you can rename the model to another name it would be better.

mudler commented 6 months ago

That's great, thanks for giving it a shot! Actually, it's fine to specify a "fake" token - LocalAI ignores it unless API keys are explicitly configured, so that should just work. Do you want me to open up a PR to update the docs?

PeriniM commented 6 months ago

Yes, thanks! The config to also use your embeddings would be:

graph_config = {
    "llm": {
        "api_key": openai_key,
        "model": "gpt-4",
        "temperature": 0,
        "base_url": "http://localhost:8080",
    },
    "embeddings": {
        "model": "ollama/text-embedding-ada-002",
        "temperature": 0,
        "base_url": "http://localhost:8080",
    },
    "verbose": True,
}
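
For completeness, a minimal usage sketch with that config (the prompt and source URL are placeholders; this assumes the project's usual SmartScraperGraph entry point):

from scrapegraphai.graphs import SmartScraperGraph

# Placeholder prompt and URL; graph_config is the dict defined above.
smart_scraper_graph = SmartScraperGraph(
    prompt="List me all the articles on the page",
    source="https://example.com",
    config=graph_config,
)

result = smart_scraper_graph.run()
print(result)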