NVIDIA / NeMo-Curator

Scalable data preprocessing and curation toolkit for LLMs
Apache License 2.0
478 stars, 57 forks

Fix lang id example #37

Closed (ryantwolf closed 4 months ago)

ryantwolf commented 5 months ago

Addresses #33

ryantwolf commented 5 months ago

To be clear, I verified that the unit tests I added, which emulate the language identification filter, fail without this fix.