spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/index.html
Apache License 2.0
3.27k stars, 837 forks

Huggingface /generate in path to model is not expected #1727

Open nastyabakhshieva opened 1 day ago

nastyabakhshieva commented 1 day ago

Bug description

I was experimenting with Spring AI and looking for a free model to use with the chat API. I found that I could use the Hugging Face microsoft/Phi-3-mini-4k-instruct model, so I added the following dependency to my project:

// POM version is 1.0.0-M3
implementation 'org.springframework.ai:spring-ai-huggingface-spring-boot-starter'

Then I added the following to the application YAML:

spring:
  ai:
    huggingface:
      chat:
        api-key: hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        url: https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct

With that in place, I called HuggingfaceChatModel.

Unfortunately, I received the following exception:

org.springframework.web.client.HttpClientErrorException$NotFound: 404 Not Found: "{"error":"Model microsoft/Phi-3-mini-4k-instruct/generate does not exist"}"

I found that the /generate suffix is hardcoded and appended to the configured URL in org.springframework.ai.huggingface.api.TextGenerationInferenceApi#generateWithHttpInfo, line 83. With Postman, by contrast, calling the model URL directly works as expected, as described in the Hugging Face documentation - https://huggingface.co/docs/api-inference/tasks/text-generation?code=curl
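To make the failure concrete, here is a minimal sketch of what appending a fixed path to the configured URL produces. The class and method names are hypothetical stand-ins, not the actual Spring AI code; only the configured URL and the /generate suffix come from the report above.

```java
import java.net.URI;

class GeneratePathSketch {

    // Hypothetical stand-in for the reported behavior: a fixed "/generate"
    // path is appended to whatever base URL was configured.
    static URI withHardcodedPath(String baseUrl) {
        return URI.create(baseUrl + "/generate");
    }

    public static void main(String[] args) {
        String configured =
            "https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct";
        // The resulting path ends in ".../Phi-3-mini-4k-instruct/generate",
        // which the Inference API reads as a model name that does not exist.
        System.out.println(withHardcodedPath(configured));
    }
}
```

This matches the model name in the 404 message: the service sees microsoft/Phi-3-mini-4k-instruct/generate rather than microsoft/Phi-3-mini-4k-instruct.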

Environment

Java version - 21
Build tool - Gradle 8.3
Spring AI version - 1.0.0-M3

Steps to reproduce

Set up a simple ChatClient backed by the Hugging Face model, using an access key generated in a Hugging Face account.

Expected behavior

/generate should not be hardcoded, so that whatever URL is needed can be configured.
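One way the expected behavior could look is a path suffix that is configurable and may be empty. The sketch below is an illustration only: `EndpointResolverSketch`, `resolve`, and the `pathSuffix` parameter are hypothetical and do not correspond to any existing Spring AI property or class.

```java
import java.net.URI;

class EndpointResolverSketch {

    // Hypothetical helper: joins a base URL with an optional path suffix.
    // An empty or null suffix leaves the configured URL untouched.
    static URI resolve(String baseUrl, String pathSuffix) {
        String base = baseUrl.endsWith("/")
            ? baseUrl.substring(0, baseUrl.length() - 1)
            : baseUrl;
        if (pathSuffix == null || pathSuffix.isBlank()) {
            return URI.create(base);
        }
        String suffix = pathSuffix.startsWith("/") ? pathSuffix : "/" + pathSuffix;
        return URI.create(base + suffix);
    }

    public static void main(String[] args) {
        // Hosted Inference API: the model URL is used as-is.
        System.out.println(resolve(
            "https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct", ""));
        // A self-hosted endpoint that does expose /generate (placeholder host).
        System.out.println(resolve("http://localhost:8080/", "generate"));
    }
}
```

With a design along these lines, servers that serve a /generate route would keep working, while the hosted model URL could be called without any suffix.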

Minimal Complete Reproducible example

I can't share my access key here, but all you need to do is:

  1. Create a simple project
  2. Add the spring-ai-huggingface-spring-boot-starter dependency
  3. Define the Hugging Face URL and access key in the application properties file
  4. Call the ChatClient

jitokim commented 1 day ago

Because of this issue, I removed the path from the client file generated from openapi.json and ran it that way.

It seems like I might need to get the latest version of the OpenAPI file from Hugging Face. (I haven’t fully analyzed it yet.)

As a temporary workaround, it works if you remove the path in the URI component builder.
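A different stopgap, separate from patching the generated client, is to call the model URL directly, which is essentially what the Postman request in the report does. The sketch below uses only the JDK's java.net.http API and assumes the Inference API accepts a JSON body of the form {"inputs": "..."} with a Bearer token, as in the linked Hugging Face documentation. The class name, method names, and token value are placeholders, and the JSON escaping is deliberately naive.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class DirectInferenceRequestSketch {

    // Naive JSON body builder (assumption: escaping backslashes and quotes
    // is enough for a simple prompt; use a real JSON library otherwise).
    static String toJsonBody(String prompt) {
        String escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"inputs\":\"" + escaped + "\"}";
    }

    // Builds a POST to the model URL exactly as configured, with no extra path.
    static HttpRequest buildRequest(String modelUrl, String apiKey, String prompt) {
        return HttpRequest.newBuilder(URI.create(modelUrl))
            .header("Authorization", "Bearer " + apiKey)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(toJsonBody(prompt)))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
            "https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct",
            "hf_your_token_here", // placeholder, not a real key
            "Hello");
        // Sending is left out here; it would be done with
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).
        System.out.println(request.method() + " " + request.uri());
    }
}
```

This bypasses HuggingfaceChatModel entirely, so it is only useful for confirming that the URL and key are valid until the hardcoded path is addressed.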