magpie-align / magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
https://magpie-align.github.io/
MIT License

KeyError: 'uuid' during filtering #24

Closed (rickyang1114 closed this issue 1 month ago)

rickyang1114 commented 1 month ago

Dear authors,

Thanks for your excellent work!

While attempting to execute filtered_dataset = dataset['train'].filter(high_quality_filter), I encountered a KeyError: 'uuid'. Upon inspecting the code in exp/gen_dis.py, it appears that the uuid value is not assigned. Could this be an oversight, or am I misunderstanding something?
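
For readers who hit the same error, here is a minimal sketch of how this kind of KeyError can arise and one defensive workaround. The row fields, the body of high_quality_filter, and the ensure_uuid helper are hypothetical illustrations, not the repository's actual code, and plain dicts stand in for dataset rows.

```python
import uuid

# Hypothetical rows: the second one lacks the 'uuid' key, as in the report.
rows = [
    {"uuid": "a1", "instruction": "Explain recursion.", "input_quality": "good"},
    {"instruction": "Sort a list.", "input_quality": "excellent"},
]

def high_quality_filter(example):
    # Illustrative stand-in for the real filter: indexing example["uuid"]
    # raises KeyError when the key was never assigned.
    _ = example["uuid"]
    return example["input_quality"] in ("good", "excellent")

def ensure_uuid(example):
    # Assumption: any unique string is acceptable as an identifier.
    if "uuid" not in example:
        example = {**example, "uuid": str(uuid.uuid4())}
    return example

# Reproduce the failure on the unpatched rows.
try:
    [r for r in rows if high_quality_filter(r)]
    raised = False
except KeyError as err:
    raised = True
    missing_key = err.args[0]

# Backfill the missing key, then filter again.
patched = [ensure_uuid(r) for r in rows]
filtered = [r for r in patched if high_quality_filter(r)]
```

With the Hugging Face datasets library, the same idea would presumably be applied by calling dataset['train'].map(ensure_uuid) before .filter(high_quality_filter), though that is untested here.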

fly-dust commented 1 month ago

Thank you for pointing it out! My bad!

When fixing #20, I forgot to change the uuid key too. It's now fixed!