Closed petric3 closed 7 months ago
Hi @petric3, thanks for the issue! I think the problem you're seeing is caused by a couple of things.
The first is that meta-llama/Meta-Llama-3-8B is unfortunately not available for free (serverless) access via the Inference API. From the model page:

> Model is too large to load in Inference API (serverless). To try the model, launch it on Inference Endpoints (dedicated) instead.
The second issue is that the `llama` models all seem to be set up for text-generation rather than text-classification. You might try one of the other text-classification models hosted on HF: https://huggingface.co/models?pipeline_tag=text-classification&sort=downloads
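For reference, hitting a text-classification model from Go boils down to a single authenticated POST. This is a plain `net/http` sketch, not hfapigo's own code; the helper name `newClassificationRequest` is made up for illustration, and `distilbert-base-uncased-finetuned-sst-2-english` is just one commonly used sentiment model (any text-classification model ID works the same way):

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
)

// newClassificationRequest builds (but does not send) a serverless
// Inference API request for a text-classification model.
func newClassificationRequest(model, token, input string) (*http.Request, error) {
	url := "https://api-inference.huggingface.co/models/" + model
	// %q quotes the input as a JSON-compatible string literal.
	body := bytes.NewBufferString(fmt.Sprintf(`{"inputs": %q}`, input))
	req, err := http.NewRequest(http.MethodPost, url, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newClassificationRequest(
		"distilbert-base-uncased-finetuned-sst-2-english",
		os.Getenv("HUGGING_FACE_TOKEN"),
		"I love this library!",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

Sending the request with `http.DefaultClient.Do(req)` and decoding the JSON body gives you the label/score pairs.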
Thanks for letting me know that the example hangs indefinitely when given a `llama` model. I've opened #34 to track that issue. Unfortunately, it may be a while before I'm able to get to it. Life is busy right now with my day job and an upcoming house move. :)
As for why `llama` models don't work with the text-generation example: they seem to accept only single inputs, rather than lists of inputs.
```sh
# 1) Does not work
curl https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct \
  -X POST \
  -d '{"inputs": ["The quick brown fox", "jumps over the lazy dog"]}' \
  -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
  -H 'Content-Type: application/json'

# 2) Works
curl https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct \
  -X POST \
  -d '{"inputs": "The quick brown fox"}' \
  -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
  -H 'Content-Type: application/json'

# 3) Works
curl https://api-inference.huggingface.co/models/gpt2 \
  -X POST \
  -d '{"inputs": ["The quick brown fox", "jumps over the lazy dog"]}' \
  -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
  -H 'Content-Type: application/json'
```
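The difference between the failing and working requests is just the shape of the JSON payload: a bare string versus a list of strings under `"inputs"`. A minimal Go sketch of building either form (the `buildPayload` helper is hypothetical, not part of hfapigo):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildPayload marshals a single input as {"inputs": "..."} and
// multiple inputs as {"inputs": [...]}. The llama models on the
// serverless API appear to accept only the single-input form.
func buildPayload(inputs []string) (string, error) {
	var body map[string]interface{}
	if len(inputs) == 1 {
		body = map[string]interface{}{"inputs": inputs[0]}
	} else {
		body = map[string]interface{}{"inputs": inputs}
	}
	b, err := json.Marshal(body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	single, _ := buildPayload([]string{"The quick brown fox"})
	fmt.Println(single) // {"inputs":"The quick brown fox"}

	multi, _ := buildPayload([]string{"The quick brown fox", "jumps over the lazy dog"})
	fmt.Println(multi) // {"inputs":["The quick brown fox","jumps over the lazy dog"]}
}
```

A client that needs to batch against such a model would have to fall back to one request per input rather than a single list request.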
Currently, `hfapigo` treats all inputs as lists because the docs specify them as being supported:

> Return value is either a dict or a list of dicts if you sent a list of inputs
Clearly that isn't the case for all models though. I've opened #35 to track that issue.
@Kardbord Thank you for the explanation, it makes sense. Also thanks for the links to the dedicated endpoints, they may come in handy. Best wishes with your house move 🌞
Thank you for your program.
I was trying to do some sentiment analysis. I took your example and tried to switch models, simply from `hfapigo.RecommendedTextClassificationModel` to `meta-llama/Meta-Llama-3-8B`, but the response is never returned (it waits indefinitely). I also tried to make the `llama` models work with the other examples, but got no response. Could you add an example of how to use the `llama` models via the Hugging Face API/interface?