BAAI-DCAI / Bunny

A family of lightweight multimodal models.
Apache License 2.0

Data processing script #18

Closed: zapqqqwe closed this issue 5 months ago

zapqqqwe commented 6 months ago

Have the scripts for the multi-stage filtering used to select high-quality data been open-sourced?

Isaachhh commented 6 months ago

https://github.com/BAAI-DCAI/Dataset-Pruning/tree/main/LAION

zapqqqwe commented 5 months ago

OK, thanks.