wietsedv / bertje

BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s layers? A closer look at the NLP pipeline in monolingual and multilingual models"
https://aclanthology.org/2020.findings-emnlp.389/
Apache License 2.0
133 stars, 10 forks

A question about the training data: did you use DBNL? #37

Open vera-pro opened 1 year ago

vera-pro commented 1 year ago

Hi! Could you please tell me whether the line "Books: a collection of contemporary and historical fiction novels (4.4GB)" in the paper (Section 2.1: Data) refers to DBNL or to some other dataset? Thank you very much! :)