dice-group / WHALE

Clean and materialize domain specific dataset using sed #11

Open sshivam95 opened 1 week ago

sshivam95 commented 1 week ago

Next step after #9.

sshivam95 commented 1 week ago

Update: materializing alone is enough for this; the dice-embeddings library already takes care of all the preprocessing.

sshivam95 commented 1 week ago

Datasets materialized: