terryyz / ice-score

[EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code
https://arxiv.org/abs/2304.14317
MIT License

Evaluate open source models with PEFT adapters #2

Open gvijqb opened 1 year ago

gvijqb commented 1 year ago

I need a way to evaluate a model like this: https://huggingface.co/qblocks/falcon-7b-python-code-instructions-18k-alpaca

This is a model fine-tuned for code generation on top of the open-source base model falcon-7b. The output is an adapter file produced with LoRA. How can I do this with your tool?
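For reference, loading a base model plus a PEFT/LoRA adapter typically looks like the sketch below. The model IDs are the ones mentioned above; the Alpaca-style prompt template is an assumption based on the dataset name, so check the adapter's model card before relying on it.

```python
BASE_ID = "tiiuae/falcon-7b"
ADAPTER_ID = "qblocks/falcon-7b-python-code-instructions-18k-alpaca"


def build_prompt(instruction: str) -> str:
    # Assumed Alpaca-style template (inferred from the dataset name,
    # not confirmed by the model card).
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )


def load_adapter_model():
    # Imports kept local so the prompt helper works without these deps.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Load the frozen base weights, then attach the LoRA adapter on top.
    base = AutoModelForCausalLM.from_pretrained(BASE_ID, trust_remote_code=True)
    return PeftModel.from_pretrained(base, ADAPTER_ID)


if __name__ == "__main__":
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    model = load_adapter_model()
    prompt = build_prompt("Write a Python function that reverses a string.")
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

This only covers inference with the adapter attached; plugging the generations into ICE-Score's evaluation would still depend on the dev-branch open-model support.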

terryyz commented 1 year ago

Hi,

I'm working (slowly) on a new version on the dev branch, which supports open-model inference. So far I've only tested it with LLaMA and haven't had time to compute the results thoroughly.

Cheers, Terry

gvijqb commented 1 year ago

Hi @terryyz

Thanks for the update. I am looking forward to it. Please keep me posted once you have an update.

At MonsterAPI we have developed a no-code LLM fine-tuner and are exploring different ways to do quick evaluation of fine-tuned adapters.

Thanks, Gaurav

terryyz commented 1 year ago

Hi @gvijqb,

No problem!

Please let me know if you'd like to collaborate on this project and beyond :)

Cheers, Terry

gvijqb commented 1 year ago

Sure, I'd love to explore that.

Could you share how we might collaborate?

terryyz commented 1 year ago

I'm not sure whether MonsterAPI could provide some computational resources. I'm a bit short of good GPUs these days 😞