bigscience-workshop / metadata

Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining.
Apache License 2.0
30 stars 12 forks

add new configs for entity_paragraph #157

Closed. manandey closed this 2 years ago

manandey commented 2 years ago

Hi @SaulLu, I have added the configs for entity_paragraph in this PR and made a few other minor updates.

cccntu commented 2 years ago

Hi @manandey, thanks for the PR. I think we can leave this open until we decide which config we want.