NVIDIA / NeMo-Curator

Scalable data preprocessing and curation toolkit for LLMs
Apache License 2.0
477 stars, 57 forks

Hardcode labels for domain and quality classifiers #95

Closed: sarahyurick closed this 3 months ago

sarahyurick commented 4 months ago

Closes https://github.com/NVIDIA/NeMo-Curator/issues/71