OpenThaiGPT / openthaigpt-pretraining

Apache License 2.0

fix(data): data dedup and decontaminate not working #237

Closed new5558 closed 1 year ago

new5558 commented 1 year ago

Why this PR

Why do we need this PR?

Changes

Related Issues

Close #

Checklist