alielfilali01 closed this 3 months ago
Do you want us to wait for Alghafa 2 to merge this?
Yes please @clefourrier, I will take some time before Saturday to add the new version of the benchmark.
No rush, take your time!
Hello @clefourrier, I believe this PR is ready to be merged.
LGTM, but you need to homogenize your naming:
- Prompt names such as `boolq_function` will be unclear long term. For such functions, you could use either `boolq_prompt_arabic` or just `boolq_arabic`. (You need to specify the language since there is already a `boolq` prompt function by default.)
- You also need to homogenize `Alghafa`, which exists with several different casings, and fit it to Python style casing. For the prompt function, I'd keep it as `alghafa_prompt` or `alghafa`; for the class, `CustomAlGhafaTask`; and here, for the task name, I'd keep it lower case:

```python
[CustomAlGhafaTask(name=f"alghafa:{subset}", hf_subset=subset) for subset in ALGHAFA_SUBSETS]
```
Done ✅
@clefourrier I hope this answers your comments, please feel free to ping me if I missed anything (I have a tendency to forget 😅). Again, thanks a lot for the efforts 🤗
Looks better thank you! Do you have some reference models and scores against which I could check the implementation? Or did you check it, and against which models? :)
Yes @clefourrier, I tested gpt2 using `--max_samples=1` and everything was fine. I believe Hamza is testing on bigger models and will push the results to the Hub for further inspection. I'll update you as soon as I hear back from Hamza.
Sounds good, feel free to ping me whenever :)
AlGhafa eval dataset is no longer available on Huggingface, any alternatives ?
Hi there, can you please provide more context? I have checked the eval code and it seems to work fine.
Hi, yesterday the datasets disappeared from the OALL Hugging Face account; now I can see them, thanks.
OOH I see, I had to make the datasets private for about 20 minutes yesterday because I was testing something. What a coincidence that you checked at the same time 😅 sorry for the inconvenience 🤗
The AlGhafa benchmarking suite consists of 11 datasets presented in this paper and hosted in this repo on the Hub.