OpenThaiGPT / openthaigpt-pretraining

Apache License 2.0

[SIIT] Clean 3G pantip Data #177

Open ArthurMinovsky opened 1 year ago

ArthurMinovsky commented 1 year ago

Description

Analyze Pantip 3G dataset and see if there are any

Ref Near Duplicate:
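
The issue does not say which near-duplicate method is intended, so the following is only a minimal sketch of one common approach: comparing posts by the Jaccard similarity of their character n-gram sets. Character n-grams are used because Thai text has no whitespace word boundaries. The function names, the n-gram size of 5, and the 0.8 threshold are all illustrative assumptions, not values taken from this project.

```python
from __future__ import annotations

import re
from itertools import combinations


def shingles(text: str, n: int = 5) -> set[str]:
    """Return the set of character n-grams of the whitespace-normalized text."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    if len(normalized) <= n:
        # Very short texts are treated as a single shingle.
        return {normalized}
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicate_pairs(
    docs: list[str], threshold: float = 0.8
) -> list[tuple[int, int, float]]:
    """Return (i, j, score) for every pair of docs with score >= threshold.

    This compares all pairs, so it is quadratic in the number of documents
    and only suitable for small samples (assumed threshold, not from the issue).
    """
    sets = [shingles(d) for d in docs]
    pairs = []
    for i, j in combinations(range(len(docs)), 2):
        score = jaccard(sets[i], sets[j])
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs
```

Because the all-pairs comparison is quadratic, a sketch like this is only useful for inspecting a sample; a dataset of several gigabytes would likely need an approximate scheme such as MinHash with locality-sensitive hashing instead.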

Outcomes: