cldf-datasets / doreco

CLDF dataset derived from DoReCo's core corpus
https://doreco.info/

Cite individual corpora #20

Closed xrotwang closed 1 year ago

xrotwang commented 1 year ago

closes #18

FredericBlum commented 12 months ago

I am confused about this. We merged this, but the README on the GitHub page is still unchanged. Since this is where most people will probably look, shouldn't it appear there as well? I thought I had actually seen it there, but it does not seem to be featured right now.

xrotwang commented 12 months ago

I haven't added/updated any data-related stuff yet, because we worked with the set including the NC-licensed corpora most of the time. For the doreco CLDF release we'd exclude these, so the README (and data) would change to reflect that.

So, after you extract the exact dataset for the study, we could recreate the CLDF data with the smaller set of corpora and then release it with the README reflecting this reduced set.

xrotwang commented 12 months ago

If you want the citations of all corpora for the study, you could just run cldfbench readme cldfbench_doreco.py locally and then inspect the README.

It might be worth adding this to the usage notes to make sure people who use the NC corpora as well will also not forget to cite them.
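
For the usage notes, the local step could be sketched roughly like this. It is only a sketch: the function name regenerate_readme is made up for illustration, and it assumes that cldfbench is installed, that it is run from a checkout of this repository, and that the command writes README.md in the repository root.

```shell
# Hypothetical helper (not part of the repository): regenerate the README
# locally so that the citations of all corpora can be inspected.
regenerate_readme() {
    # Assumption: cldfbench is installed, e.g. via `pip install cldfbench`.
    if ! command -v cldfbench >/dev/null 2>&1; then
        echo "cldfbench not found; try: pip install cldfbench" >&2
        return 1
    fi
    # The command quoted in this thread; run it from the repository root.
    cldfbench readme cldfbench_doreco.py || return 1
    # Assumption: the regenerated file is README.md in the current directory.
    echo "README.md regenerated; open it to review the corpus citations."
}
```

Calling regenerate_readme from the repository root and then opening README.md should show the citations for whichever set of corpora is present locally, including the NC-licensed ones if they are part of the local data.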