OpenThaiGPT / openthaigpt-pretraining

Apache License 2.0
21 stars 10 forks

Add pantip clean website to readable data #250

Open phasinA1learn opened 1 year ago

phasinA1learn commented 1 year ago

Why this PR

Why do we need this PR? It cleans the Pantip data by removing HTML tags, website links, and error messages so that the text is readable.
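The PR's code is not shown in this thread, so the following is only a minimal sketch of what this kind of cleaning could look like. The function name `clean_pantip_text`, the regular expressions, and the entries in `ERROR_MESSAGES` are all hypothetical placeholders, not taken from the repository.

```python
import html
import re

# Hypothetical patterns: the PR's actual rules are not shown in this thread.
HTML_TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
WHITESPACE_RE = re.compile(r"\s+")

# Assumed examples of scraper error text; replace with strings seen in the real data.
ERROR_MESSAGES = ("404 Not Found", "Error loading page")


def clean_pantip_text(text: str) -> str:
    """Return text with HTML tags, URLs, and known error strings removed."""
    text = HTML_TAG_RE.sub(" ", text)   # drop tags such as <p> or <br/>
    text = html.unescape(text)          # decode entities such as &amp;
    text = URL_RE.sub(" ", text)        # drop website links
    for message in ERROR_MESSAGES:
        text = text.replace(message, " ")
    # Collapse the gaps left behind by the removals.
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    raw = "<p>hello <b>world</b></p> see https://pantip.com/topic/1 404 Not Found"
    print(clean_pantip_text(raw))  # hello world see
```

Regex-based tag stripping is a deliberate simplification here; a real pipeline might prefer an HTML parser for malformed markup.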

Changes

Related Issues

Close #

Checklist

codecov[bot] commented 1 year ago

Codecov Report

Patch and project coverage have no change.

Comparison is base (c441682) 94.95% compared to head (6b95346) 94.95%.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #250   +/-   ##
=======================================
  Coverage   94.95%   94.95%
=======================================
  Files          12       12
  Lines         337      337
=======================================
  Hits          320      320
  Misses         17       17
```

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `94.95% <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=OpenThaiGPT#carryforward-flags-in-the-pull-request-comment) to find out more.
