NVIDIA / NeMo-Curator

Scalable toolkit for data curation
Apache License 2.0
327 stars 32 forks

fuzzy dedup in cpu #101

Open simplew2011 opened 3 weeks ago

simplew2011 commented 3 weeks ago

Could you release a CPU version of fuzzy dedup?