hurlbertlab / dietdatabase

Creative Commons Zero v1.0 Universal

automated peer-review, travis errors and question #95

Closed by jhpoelen 6 years ago

jhpoelen commented 6 years ago

Was just checking the status of your wonderful dataset at https://globalbioticinteractions.org/status and noticed that it was red!

After some digging, I found that GloBI picked up a version of your diet database in which all the fields were enclosed in double quotes. In TSV files, no quotes should be needed. The error has since been corrected, but Travis did pick up the error starting at https://travis-ci.org/hurlbertlab/dietdatabase/builds/308091124 . Unfortunately, the issue was not fixed until about 2 days later, which caused the error to propagate to GloBI. (See screenshot below.)
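For illustration, a check for this kind of problem could look roughly like the sketch below. It is not the check that Travis or GloBI actually runs; the function name `quoted_field_lines` and the sample column names are made up for the example.

```python
def quoted_field_lines(lines, delimiter="\t"):
    """Return 1-based line numbers where any field is wrapped in double quotes."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        # Drop the line ending, then split on the tab delimiter.
        fields = line.rstrip("\r\n").split(delimiter)
        if any(len(f) >= 2 and f.startswith('"') and f.endswith('"') for f in fields):
            flagged.append(number)
    return flagged


# Hypothetical sample rows; the column names are assumptions, not the real schema.
sample = [
    "Common_Name\tPrey_Class\tFraction_Diet\n",
    '"Bald Eagle"\t"Actinopterygii"\t"0.5"\n',
    "Osprey\tActinopterygii\t0.9\n",
]
print(quoted_field_lines(sample))  # [2]
```

A build step could fail whenever the returned list is non-empty, which would flag the quoted rows before a downstream consumer picks them up.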

Note in the image below that Travis is green (the issue has been fixed), but GloBI is red, because it picked up an older version about three days ago. [Screenshot from 2017-12-01 15-05-56]

Some questions - would it be fair to say that the travis error messages are too cryptic to be actionable? How do you think we can introduce an immediate peer-review mechanism that would actually be more useful? Should GloBI only consider versions of the Avian Diet Database that pass the travis check?

From https://travis-ci.org/hurlbertlab/dietdatabase/builds : [Screenshot from 2017-12-01 14-58-08]

ahhurlbert commented 6 years ago

While happy to include the travis build checks for GloBI's sake, I don't get messages when a build fails and I don't go seeking this info. Perhaps that will change, but it's just not part of our workflow.

So it makes sense to me that you would only want to consider versions that pass the travis check.
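A minimal sketch of that selection rule, assuming the build history is available as a list of records: the record layout, commit ids, and timestamps below are hypothetical and are not taken from the Travis API or from this repository's actual builds.

```python
def latest_passing_commit(builds):
    """Return the commit of the most recent build whose state is 'passed', or None."""
    passing = [b for b in builds if b["state"] == "passed"]
    if not passing:
        return None
    # ISO 8601 timestamps in the same format and time zone sort correctly as strings.
    return max(passing, key=lambda b: b["finished_at"])["commit"]


# Hypothetical build history, for illustration only.
builds = [
    {"commit": "aaa111", "state": "passed", "finished_at": "2017-11-27T10:00:00Z"},
    {"commit": "bbb222", "state": "failed", "finished_at": "2017-11-28T09:30:00Z"},
    {"commit": "ccc333", "state": "passed", "finished_at": "2017-11-30T16:45:00Z"},
    {"commit": "ddd444", "state": "failed", "finished_at": "2017-12-01T08:00:00Z"},
]
print(latest_passing_commit(builds))  # ccc333
```

With this rule, a consumer would keep using the last passing version while the newest build is failing, rather than ingesting the broken one.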

I'm also hoping to finally release an official version with DOI this semester, although I seem to have been saying that for a year or so...

jhpoelen commented 6 years ago

If you are interested in enabling email notifications to catch data integrity issues (e.g., tabs replaced by commas) early, please see https://docs.travis-ci.com/user/notifications/#Configuring-email-notifications . I've also created a pull request with an example: https://github.com/hurlbertlab/dietdatabase/pull/97 .

However, I do understand that you might not be super excited to have yet another thing to worry about. I use this method to keep track of dataset availability and integrity without having to check manually once in a while, but hey, there are many ways to skin a cat.

ahhurlbert commented 6 years ago

Thanks for the example. I've added my email for notifications. I appreciate the tech sophistication you've added to the project over the years!

jhpoelen commented 6 years ago

Hopefully, this will lead to even more beautiful and integrated datasets... please do tell your colleagues about this. I envision a future automated, continuous peer-review process to track the "integrated-ness" of datasets in a distributed biodiversity web of knowledge: before, during and after (!) publication.

jhpoelen commented 6 years ago

Please see posts https://www.globalbioticinteractions.org/2017/11/21/catalyzing-data-exchange/ and https://www.globalbioticinteractions.org/2017/01/24/lifestages-of-species-interaction-datasets/ for some more background.