
flowaicom / flow-judge

Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafted for accuracy, speed, and customization.

Apache License 2.0

50 stars 7 forks

dataset #23

Open darkacorn opened 2 days ago

darkacorn commented 2 days ago

As much as I love Phi 3.5, and I am indeed very grateful that you guys released the excerpt datasets,

would there be a way to get the full dataset under an academic license?

I'm trying to train an 8B model, based on either Llama 3.1, Llama 3.2 3B, or the new 8B from Mistral.

If not, the synthetic data pipeline scripts would already help heaps.

alexandreteles commented 2 days ago

Having the dataset available would be good; I've been wondering the same thing for the last two weeks.

Phi 3.5 was likely picked for its supposedly strong reasoning capabilities, but it is a very underwhelming model. I would love to see a Llama 3.2 3B fine-tune based on the Flow Judge dataset, and I've also thought about asking for either the dataset or the synthetic data pipeline scripts.

Thank you!