centre-for-humanities-computing / danish-foundation-models

A project for training foundational Danish language models
https://foundationmodels.dk
MIT License

Gopher filter fails when encountering empty docs #209

Closed TTTTao725 closed 8 months ago

TTTTao725 commented 10 months ago

The Gopher tagger fails at the following line when processing documents with no words: https://github.com/allenai/dolma/blob/41ec1efb580457716fd7209de70974d66ba6f9fb/python/dolma/taggers/gopher.py#L103
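For illustration only, here is a minimal sketch of this kind of failure and one way to guard against it. It is not dolma's code: `gopher_style_stats` and its returned keys are hypothetical names, and the thread does not state which operation fails at the linked line. The sketch assumes only that Gopher-style statistics are computed per word, so an empty document leads to a division by zero or a `max()` over an empty sequence unless it is handled first.

```python
from collections import Counter


def gopher_style_stats(text: str) -> dict:
    """Hypothetical helper: a few per-word statistics with an empty-input guard."""
    words = text.split()
    word_count = len(words)

    if word_count == 0:
        # No words: return neutral values instead of dividing by zero
        # or calling max() on an empty sequence.
        return {"word_count": 0, "mean_word_length": 0.0, "top_word_fraction": 0.0}

    counts = Counter(words)
    return {
        "word_count": word_count,
        "mean_word_length": sum(len(w) for w in words) / word_count,
        "top_word_fraction": max(counts.values()) / word_count,
    }


# Empty and whitespace-only documents no longer raise.
print(gopher_style_stats(""))
print(gopher_style_stats("   \n"))
print(gopher_style_stats("a a b"))
```

Returning early with neutral values keeps the output shape the same for every document, so a downstream filter can decide separately whether empty documents should be dropped.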

github-actions[bot] commented 9 months ago

This issue is stale because it has been open for 14 days with no activity. Feel free to either 1) remove the stale label or 2) comment. If nothing happens, this will be closed in 7 days.

KennethEnevoldsen commented 9 months ago

We are awaiting a response here: https://github.com/allenai/dolma/pull/98

peterbjorgensen commented 9 months ago

Your pull request has been merged, so this issue is closed by https://github.com/centre-for-humanities-computing/danish-foundation-models/pull/234