EveripediaNetwork / issues

Preprocess the ICP content #2681

Closed Softdev1 closed 1 month ago

Softdev1 commented 1 month ago

Description

The raw ICP content is filled with unnecessary tags and keywords. These increase the number of tokens in the model input, which is not cost-efficient. The titles are also nonsensical and need to be fixed before indexing.
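A minimal sketch of what such a preprocessing step could look like, assuming the "tags" are HTML-like markup and the "keywords" are fixed boilerplate phrases. The issue does not specify either, so the `NOISE_KEYWORDS` list and the function names `clean_text` and `clean_title` are hypothetical placeholders, not the project's actual code.

```python
import html
import re

# Matches HTML-like tags such as <p>, </div>, <a href="...">.
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")

# Hypothetical noise phrases; the issue does not list the real ones.
NOISE_KEYWORDS = ("Read more", "Share this post")


def clean_text(raw: str) -> str:
    """Strip tags, decode entities, drop noise phrases, collapse whitespace."""
    text = TAG_RE.sub(" ", raw)          # replace each tag with a space
    text = html.unescape(text)           # e.g. "&amp;" becomes "&"
    for keyword in NOISE_KEYWORDS:
        text = text.replace(keyword, " ")
    return WHITESPACE_RE.sub(" ", text).strip()


def clean_title(raw_title: str) -> str:
    """Apply the same cleanup to a title before it is indexed."""
    return clean_text(raw_title)


if __name__ == "__main__":
    print(clean_title("<h1>  ICP   Overview </h1>"))
    print(clean_text("<p>Tom &amp; <b>Jerry</b></p> Read more"))
```

A regex-based tag stripper is used here only to keep the example dependency-free; for heavily nested or malformed markup, a real HTML parser would likely be more robust.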