autotyp / autotyp-data

AUTOTYP data export
Creative Commons Attribution 4.0 International

Various metadata-related fixes and improvements. #42

Closed by tzakharko 2 years ago

tzakharko commented 2 years ago

Updated the glottocodes for Ahom and Eastern Armenian in the database; the change will be reflected in the next data export. The only language still missing a glottocode is Serbian Torlak, which I could not locate in Glottolog (although there is a note mentioning this variety under standard Serbian).

xrotwang commented 2 years ago

@tzakharko see https://github.com/glottolog/glottolog/issues/815#issuecomment-1048583147

tzakharko commented 2 years ago

> @tzakharko see glottolog/glottolog#815 (comment)

Thanks, I forwarded this to the team and will update the glottocode once a decision is made.

tzakharko commented 2 years ago

@xrotwang All the issues you have found so far should be fixed now. If you don't see any other low-hanging issues, I would like to publish a bugfix release based on the contents of this branch.

xrotwang commented 2 years ago

It would be nice if the issue(s) with the bibliography could be addressed, too. That would save me a couple of lines cleaning it up. Otherwise, yes, all issues addressed.

tzakharko commented 2 years ago

> It would be nice if the issue(s) with the bibliography could be addressed, too. That would save me a couple of lines cleaning it up.

Yes, of course! Adopted your version of the bibliography in the latest commit.

xrotwang commented 2 years ago

So the bibliography isn't pulled out of AUTOTYP but only exists as this BibTeX file anyway?

tzakharko commented 2 years ago

It is pulled from the database, but that particular part of the database is currently on life support and is not actively updated. This is another part that will require an overhaul, as some sources are "hidden" in the comments or special fields of various data files and are not currently exported. It is a lot of legacy to deal with.

For now your curated file works great, and as we continue modernising the database structure we will put a new bibliography-generating mechanism in place. Making use of the excellent Glottolog reference database sounds like an obvious choice here.