cldf-datasets / uratyp

Creative Commons Attribution 4.0 International

add sources #6

Closed xrotwang closed 2 years ago

xrotwang commented 2 years ago

from https://github.com/bedlan/uratyp/blob/master/raw/UT_sources.bib

JakeJing commented 2 years ago

@xrotwang I just added UT_sources.bib to the repo.

xrotwang commented 2 years ago

Are these sources somehow related to the coded values? I don't see any reference to them in the language tables. Maybe we can at least note which languages are treated in which source? Otherwise it wouldn't provide more service than the Zotero group library, right?

JakeJing commented 2 years ago

@MiinaNo I am not sure whether you could add the corresponding languages to each reference by creating an additional field called "language" in the bib file. @xrotwang Is this what you need?

xrotwang commented 2 years ago

Ideally, individual datapoints would reference specific sources. Second best would be what you describe, but the other way round: adding a "sources" column to https://github.com/cldf-datasets/uratyp/blob/main/raw/Languages.csv that contains comma-separated BibTeX keys.
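A minimal sketch of how such a column could be read back, for illustration only. The column names `Name` and `Sources`, the language names, and the keys below are all hypothetical and not taken from the actual Languages.csv. Note that a cell holding several comma-separated keys has to be quoted in the CSV file.

```python
import csv
import io

# Hypothetical excerpt of a languages table; column names, language names
# and BibTeX keys are placeholders, not the real contents of Languages.csv.
LANGUAGES_CSV = """Name,Sources
LanguageA,key_a
LanguageB,"key_a, key_b"
LanguageC,
"""


def read_sources(text):
    """Map each language name to the list of BibTeX keys in its Sources cell."""
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Split on commas and drop empty pieces, so an empty cell gives [].
        keys = [key.strip() for key in row["Sources"].split(",") if key.strip()]
        result[row["Name"]] = keys
    return result


sources_by_language = read_sources(LANGUAGES_CSV)
```

An empty cell yields an empty list, which makes languages without any source easy to spot.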

JakeJing commented 2 years ago

OK, I see. @MiinaNo I recall that this is not the complete list of references, right? Do you have any other bib files or tables that link sources to individual feature values?

MiinaNo commented 2 years ago

@JakeJing, @xrotwang I do not have any other file. I have only noted down for myself which sources were used when filling out the table for a particular language. I would be fine with creating an additional field called "language" in the bib file or doing whatever else is needed; I will let you decide what would be best. But if you could give me one example of how you want things to be formatted in the table, that would be great. And yes, I need to add some sources at some point. There are a few tables that are somewhat incomplete in terms of examples (e.g. Kamas, Selkup).

JakeJing commented 2 years ago

@MiinaNo I added two bib examples here by creating another field ("langref") in the bib file. Please use the language names from our language table. If one entry is used for multiple languages, separate them with commas, e.g. Finnish, Estonian.

@book{winkler_udmurt_2001,
  address = {M{\"u}nchen; Newcastle},
  author = {Winkler, Eberhard},
  date-added = {2021-10-21 22:57:52 +0300},
  date-modified = {2021-10-23 22:01:15 +0200},
  langref = {Udmurt},
  number = {212},
  publisher = {Lincom Europa},
  series = {Languages of the {World}/{Materials}},
  title = {Udmurt},
  year = {2001}
}

@book{nikolaeva_grammar_2014,
  address = {Berlin, Boston},
  author = {Nikolaeva, Irina},
  date-modified = {2021-10-23 19:35:33 +0200},
  doi = {10.1515/9783110320640},
  isbn = {978-3-11-037329-5},
  langref = {Tundra_Nenets},
  publisher = {De Gruyter Mouton},
  title = {A {Grammar} of {Tundra} {Nenets}},
  url = {https://www.degruyter.com/view/product/208261},
  urldate = {2018-10-30},
  year = {2014},
  bdsk-url-1 = {https://www.degruyter.com/view/product/208261},
  bdsk-url-2 = {https://doi.org/10.1515/9783110320640}
}

BTW, some years are missing in the bib file. If you can add them, I can produce the table that Robert asked for. Thanks!
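As a rough sketch of how the "langref" field could be turned into a language-to-sources mapping: the snippet below uses only the standard library and a simple regular-expression scan rather than a full BibTeX parser, so it assumes one non-nested `langref = {...}` value per entry. The first two entries are trimmed copies of the examples above; the third entry and its key are hypothetical and only illustrate the comma-separated case.

```python
import re
from collections import defaultdict

BIB = r"""
@book{winkler_udmurt_2001,
  author = {Winkler, Eberhard},
  langref = {Udmurt},
  title = {Udmurt},
  year = {2001}}

@book{nikolaeva_grammar_2014,
  author = {Nikolaeva, Irina},
  langref = {Tundra_Nenets},
  title = {A {Grammar} of {Tundra} {Nenets}},
  year = {2014}}

@misc{hypothetical_entry_2020,
  langref = {Finnish, Estonian},
  title = {Hypothetical multi-language source},
  year = {2020}}
"""

# Start of an entry: "@type{key,"
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# A langref field whose value contains no nested braces.
LANGREF = re.compile(r"\blangref\s*=\s*\{([^{}]*)\}")


def languages_to_keys(bib_text):
    """Invert the langref fields: language name -> list of BibTeX keys."""
    mapping = defaultdict(list)
    starts = list(ENTRY_START.finditer(bib_text))
    for i, match in enumerate(starts):
        # An entry body runs up to the start of the next entry (or end of text).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib_text)
        body = bib_text[match.end():end]
        found = LANGREF.search(body)
        if found is None:
            continue  # entry has no langref field
        for language in found.group(1).split(","):
            language = language.strip()
            if language:
                mapping[language].append(match.group(2))
    return dict(mapping)


keys_by_language = languages_to_keys(BIB)
```

Joining each list with commas would give the cell contents for a "sources" column in the language table.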

MiinaNo commented 2 years ago

Done! @JakeJing, thanks for the instructions. I also added the years where they were missing. One thing I started wondering about is whether abbreviations cause trouble, e.g.

PLS = Priäžan Lyydilaižet sananpolved. 2012. Petroskoi: Periodika.
MED = Riese, Timothy, Jeremy Bradley & Elina Guseva. 2014–. Mari-English Dictionary. Vienna: University of Vienna. dict.mari-language.com (18 November, 2020).

I had all the abbreviations in Zotero, but if some extra formatting is needed, let me know.

For some languages I still need to add sources, e.g. Kazym Khanty, South Selkup.

JakeJing commented 2 years ago

@MiinaNo I fixed some missing years, author info and duplicate keys, but there are still some entries without authors in the bib file. It would be better to add author information for these entries: _1955, noauthor-_2012, grunberg_vadja_2013, noauthor_sikor_2018, _oi_1998, noauthor_ob-ugric_nodate, kujola_lyydilaismurteiden_1944, ojansuu_lyydilaisia_1934, torikka_karjalan_2009, damberg_jemakilugdobrantoz_1935, nirvi_inkeroismurteiden_1971, noauthor_ada_2000.
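A check of this kind can be sketched with the standard library alone. The snippet below is an illustration, not the script that was actually used: it scans BibTeX text with regular expressions, reports duplicate keys, and lists entries that lack a non-empty `author` or `year` field. All entries and keys in the sample are hypothetical. One known limitation is that an edited volume with an `editor` but no `author` would also be flagged.

```python
import re
from collections import Counter

# Hypothetical sample entries; keys and titles are placeholders.
BIB = r"""
@book{key_a,
  author = {Surname, Given},
  title = {First hypothetical entry},
  year = {2001}}

@book{key_b,
  title = {Second hypothetical entry},
  year = {2012}}

@book{key_a,
  author = {Surname, Given},
  title = {Third hypothetical entry}}
"""

# Start of an entry: "@type{key,"
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def has_field(body, name):
    """True if the entry body has `name = ...` with a non-empty value."""
    pattern = r"\b" + re.escape(name) + r'\s*=\s*[{"]*\s*[^\s{}",]'
    return re.search(pattern, body, re.IGNORECASE) is not None


def audit(bib_text, required=("author", "year")):
    """Return (duplicate keys, list of (key, missing fields))."""
    starts = list(ENTRY_START.finditer(bib_text))
    counts = Counter(match.group(2) for match in starts)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    missing = []
    for i, match in enumerate(starts):
        # An entry body runs up to the start of the next entry (or end of text).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib_text)
        body = bib_text[match.end():end]
        absent = [field for field in required if not has_field(body, field)]
        if absent:
            missing.append((match.group(2), absent))
    return duplicates, missing


duplicates, missing = audit(BIB)
```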

I also produced a new summary table of languages with an additional column of citations. You can easily identify those languages that do not have any citation.
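For illustration, such a summary table and the list of uncited languages could be derived along these lines. This is a sketch under assumptions: the column names `Name` and `Sources` and the entry `Hypothetical_Language` are made up, and the two cited languages reuse the keys from the examples above.

```python
import csv
import io

# Language names as they would appear in the language table (partly hypothetical).
language_names = ["Udmurt", "Tundra_Nenets", "Hypothetical_Language"]

# Language -> BibTeX keys, e.g. as collected from the langref fields.
keys_by_language = {
    "Udmurt": ["winkler_udmurt_2001"],
    "Tundra_Nenets": ["nikolaeva_grammar_2014"],
}


def summary_table(names, keys_by_name):
    """Render a CSV table with one row per language and its citation keys."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Name", "Sources"])
    for name in names:
        writer.writerow([name, ", ".join(keys_by_name.get(name, []))])
    return buffer.getvalue()


def uncited(names, keys_by_name):
    """Languages that have no citation key at all."""
    return [name for name in names if not keys_by_name.get(name)]


table = summary_table(language_names, keys_by_language)
languages_without_sources = uncited(language_names, keys_by_language)
```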

If you can fix these issues, I can update the table and pass it to Robert.

As for the abbreviations, it is always better to expand them to make them more transparent, if it doesn't take too much time.

xrotwang commented 2 years ago

@JakeJing @MiinaNo thanks, that looks useful!

MiinaNo commented 2 years ago

@JakeJing, @xrotwang I have already fixed most of the problems that Yingqi pointed out, but before merging I want to add some missing sources.