modelscope / data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Apache License 2.0
2.63k stars 166 forks

Add turbo mode #402

Closed drcege closed 4 weeks ago

drcege commented 1 month ago

Add a turbo mode that disables fault tolerance to maximize processing speed. Future batch-processing optimizations should aim to close the performance gap, until enabling the option no longer makes a measurable difference.
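To illustrate the trade-off, here is a minimal sketch that assumes fault tolerance means wrapping each per-sample call in a `try`/`except`. The names `fault_tolerant`, `run_op`, and the `turbo` parameter are hypothetical and are not taken from this PR's diff.

```python
from typing import Any, Callable, Dict, Iterable, List

Sample = Dict[str, Any]


def fault_tolerant(fn: Callable[[Sample], Any]) -> Callable[[Sample], Any]:
    """Wrap fn so a failing sample yields None instead of aborting the run."""
    def wrapper(sample: Sample) -> Any:
        try:
            return fn(sample)
        except Exception:
            # Hypothetical policy: swallow the error and mark the sample as failed.
            return None
    return wrapper


def run_op(samples: Iterable[Sample],
           fn: Callable[[Sample], Any],
           turbo: bool = False) -> List[Any]:
    # In turbo mode the op is called directly: no per-sample try/except,
    # so any exception propagates and stops processing.
    op = fn if turbo else fault_tolerant(fn)
    results = [op(s) for s in samples]
    # Drop samples that failed under the fault-tolerant path.
    return [r for r in results if r is not None]


def text_length(sample: Sample) -> int:
    return len(sample["text"])


good = [{"text": "juicy"}, {"text": "data"}]
mixed = good + [{"no_text": 1}]  # the last sample raises KeyError in text_length
```

With these assumptions, `run_op(mixed, text_length)` skips the malformed sample, while `run_op(mixed, text_length, turbo=True)` raises on it. The turbo path gives up that safety net in exchange for less per-sample overhead.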