lexibank / uralex

UraLex basic vocabulary dataset
Creative Commons Attribution 4.0 International

release candidate for UraLex 2.0 #9

Closed by xrotwang 3 years ago

xrotwang commented 3 years ago

The CLDF data now comes with a human-readable description; see https://github.com/lexibank/uralex/blob/release-2.0/cldf/README.md

Please also check the metadata at https://github.com/lexibank/uralex/blob/release-2.0/README.md

xrotwang commented 3 years ago

@kasyrj regarding the cognate sets: I just excluded cogn_set value 0 from cognate sets. That's the value in cogn_set when the form is [No equivalent]. So I'd consider this a bugfix.
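A minimal sketch of that exclusion, for illustration only and not the code in this PR. It assumes each row is a dict with a `cogn_set` value and a hypothetical `form` key; the sample rows are made up for the example.

```python
from collections import defaultdict

# Assumption: cogn_set "0" marks forms given as [No equivalent].
NO_EQUIVALENT = "0"


def cognate_sets(rows):
    """Group forms by cogn_set, skipping the placeholder value 0."""
    sets = defaultdict(list)
    for row in rows:
        cogn_set = str(row["cogn_set"]).strip()
        if cogn_set == NO_EQUIVALENT:
            continue  # not a real cognate set, so do not count it
        sets[cogn_set].append(row["form"])
    return dict(sets)


# Illustrative rows only; "form" is a hypothetical column name.
rows = [
    {"form": "kala", "cogn_set": "1"},
    {"form": "hal", "cogn_set": "1"},
    {"form": "[No equivalent]", "cogn_set": "0"},
]
result = cognate_sets(rows)
print(len(result))
```

Dropping the placeholder lowers the number of cognate sets by one, which is the kind of change a test on the set count would notice.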

xrotwang commented 3 years ago

@kasyrj I adapted the metadata such that it is clear(er) that the releases of this repo on Zenodo are the primary source of the dataset - not derived from a dataset somewhere else. With the changes in .zenodo.json, the correct metadata should show up as "recommended citation" on Zenodo.

xrotwang commented 3 years ago

@kasyrj it's also nice to see that CI failed after the cognate set count changed: https://github.com/lexibank/uralex/actions

kasyrj commented 3 years ago

One additional potential issue: should Missing_UraLex2.0_refs.bib be merged with Borrowing_references.bib?

xrotwang commented 3 years ago

@kasyrj the bib files could be merged in raw/ already, yes. We didn't do that right away, because I had already made local changes before Mervi added refs in her branch. OTOH, cldf/sources.bib is the merge of all three bib files (minus any entries that are not actually cited). And since only the data in cldf/ is guaranteed to have matching references and bibkeys, I wouldn't bother making raw/ more consistent, when the end user should go for the CLDF data anyway.
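Roughly, the merge described here can be pictured as below. This is only a sketch under assumptions, not how the dataset is actually built: each bib file is modelled as a plain dict from bibkey to entry text, and all keys and entries are hypothetical.

```python
def merge_cited(bibliographies, cited_keys):
    """Merge several bibliographies, keeping only entries that are cited.

    bibliographies: iterable of dicts mapping bibkey -> entry text.
    cited_keys: set of bibkeys that the data actually references.
    On a duplicate key, the first occurrence wins.
    """
    merged = {}
    for bib in bibliographies:
        for key, entry in bib.items():
            if key in cited_keys and key not in merged:
                merged[key] = entry
    return merged


# Hypothetical stand-ins for the raw bib files.
bib_a = {"ref1": "@book{ref1, ...}", "ref2": "@article{ref2, ...}"}
bib_b = {"ref2": "@article{ref2, duplicate}", "ref3": "@misc{ref3, ...}"}
bib_c = {"ref4": "@book{ref4, ...}"}

sources = merge_cited([bib_a, bib_b, bib_c], cited_keys={"ref1", "ref2", "ref4"})
print(sorted(sources))
```

With this shape, every key in the merged result is one the data cites, so references and bibkeys line up by construction.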

xrotwang commented 3 years ago

@MervideHeer I incorporated your changes into this PR - so will close the other two relating to the bib files.

MervideHeer commented 3 years ago

> @MervideHeer I incorporated your changes into this PR - so will close the other two relating to the bib files.

Understood. Thank you!

evoling commented 3 years ago

@xrotwang Although I've been using git for years I haven't been part of big collaborations before: how should I go about reviewing this? Do I just go through the "files changed" tab and confirm that they look okay to me? Are all these subsequent changes showing up there, and should I wait for them to be finished, or will we keep tidying things up even after the 2.0 branch is merged?

xrotwang commented 3 years ago

@evoling In terms of the data, the PR should be complete. It probably doesn't make much sense to review the data files - that's partly what cldf validate and the tests run by github actions are for. So focusing on the human-readable stuff, READMEs, other metadata should be good enough. If anything is changed from here on, this would show up as commits under this PR and would probably be triggered by discussion here; so you would probably be aware of this.

xrotwang commented 3 years ago

@evoling Oh, and yes, "Files changed" will show the accumulated changes of all commits that are part of the PR. If you click on the link for just one commit, you'll see only that change - and a "x clear filters" button at the top, which lets you go back to the list of all changes.

xrotwang commented 3 years ago

@evoling I think we should encourage people to cite the dataset, too. I know I have been frustrated when following the link to a "dataset paper" and not finding the actual data there. We also should encourage people to cite exact versions of data they were using. This introduces a bit of a conundrum, though: The DOI to be assigned by Zenodo is not known when we release the repository. That's why we now have this wording: https://github.com/lexibank/uralex/blob/release-2.0/README.md#how-to-cite Hoping people will visit https://github.com/lexibank/uralex/releases to find

xrotwang commented 3 years ago

@MervideHeer see https://github.com/lexibank/uralex/pull/9#pullrequestreview-663588823 Should the wording be changed to be less of a "note-to-self"?

MervideHeer commented 3 years ago

> @MervideHeer see #9 (review) Should the wording be changed to be less of a "note-to-self"?

@evoling

Sorry for the style problem. I think it's good to include the "maximal" citing information. I have now modified the sentence so that it's directed more at the reader. Unfortunately, I can't compile a full example reference just yet because we don't have all the pieces of information, so we need to live with draft-like instructions.

xrotwang commented 3 years ago

So it looks like this is good to go?

MervideHeer commented 3 years ago

> So it looks like this is good to go?

I looked through our file changes and I see we have made all the changes we wanted. We are:

Previously

I cannot think of anything else to do, and I have crossed off everything on the checklist I summarized above. I'm finished and we are ready.

xrotwang commented 3 years ago

Perfect. I'll let this sit until tomorrow - off to catch a train now. Tomorrow I can make the release, and provide you with the new DOI.

MervideHeer commented 3 years ago

> Perfect. I'll let this sit until tomorrow - off to catch a train now. Tomorrow I can make the release, and provide you with the new DOI.

Sounds good! Just in time for me to add a brand-new publication with a DOI to a conference presentation. Thank you for the help and patience. Have a safe trip!