radareorg / r2ai

local language model for radare2
https://www.radare.org
MIT License

Add support for HuggingFace 🤗 inference API #65

Closed brainstorm closed 4 days ago

brainstorm commented 4 days ago

Description

I have been playing with pseudo-C (pdc) decompilation for a STM8 codebase on the HuggingFace web-based Chat, for free:

[Screenshot (2024-10-02): HuggingFace Chat decompiling the STM8 pseudo-C]

Unfortunately, when I hit the API endpoint with the same model (or different ones), I get this:

[0x0000807f]> s 0x0000833c
[0x0000833c]> decai -d
{"error":"Model requires a Pro subscription; check out hf.co/pricing to learn more. Make sure to include your HF token in your query."}

For the API request I'm using the same Bearer token I use for the chat, so I'm not entirely sure what's going on. Free-tier limits are a bit vague anyway, according to this forum thread: https://discuss.huggingface.co/t/api-limits-on-free-inference-api/57711/5
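For reference, a minimal sketch of what such a request looks like against the Serverless Inference API. This is not decai's actual code; the model name and token below are placeholders, and only the documented `Authorization: Bearer <token>` header and `{"inputs": ...}` payload are assumed:

```python
import json
import urllib.request

# Hypothetical sketch of a Serverless Inference API request; the model
# and token are placeholders, not values used by r2ai/decai itself.
HF_API_URL = "https://api-inference.huggingface.co/models/{model}"

def build_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an inference request with a Bearer token."""
    data = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        HF_API_URL.format(model=model),
        data=data,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("meta-llama/Llama-3.2-1B-Instruct", "decompile ...", "hf_xxx")
```

If the token is valid but the model is gated behind a Pro plan, the endpoint returns the 4xx error body shown above regardless of the header being present.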

I'm also getting strange (capacity?) errors for smaller models:

[0x0000833c]> decai -d
{"error":"Model meta-llama/Llama-3.2-1B-Instruct is currently loading","estimated_time":98.86515045166016}

In any case, I hope this addition helps folks who do pay for the service.

brainstorm commented 4 days ago

Or perhaps there's a way to hit the HuggingFace Chat API (https://huggingface.co/docs/text-generation-inference ?) instead of the arguably more official Serverless Inference API? 🤔

trufae commented 4 days ago

Looks good to me! The "huggingface" API name is a bit long, so I would suggest using "hf" as an alias. And yeah, I guess it's possible to hit this endpoint without any API key... but not sure if we want to play dirty with them :D