pr-omethe-us / PyKED

Python interface to the ChemKED database format
https://pr-omethe-us.github.io/PyKED/
BSD 3-Clause "New" or "Revised" License

How to handle references without DOIs? #69

Open kyleniemeyer opened 7 years ago

kyleniemeyer commented 7 years ago

Although most recent data should be associated with an article with a DOI, some older datasets might instead come from government technical reports that may not have a DOI.

For example, the 1986 report from Burcat, Snyder, and Brabbs on benzene and toluene autoignition (https://ntrs.nasa.gov/search.jsp?R=19860015959) is not a journal article and does not have a DOI, but I think we should still support data from reports like this.

(I'll note that some government agency technical reports do now have DOIs. For example, the Chemkin-III report is available via https://doi.org/10.2172/481621)

bryanwweber commented 7 years ago

Related: https://github.com/pr-omethe-us/ChemKED-database/issues/4 and #55

bryanwweber commented 7 years ago

We could add a URL field and whitelist URLs that tend to be stable, like NTRS. Of course, the concern with any URL type identifier is that it isn't persistent. Therefore, I think we'd want to reject data from sites that aren't on the whitelist (no lab group websites, etc.). You're right, though, that a lot of useful data might be only available at a URL. Also, the URL field should be exclusive of DOI (i.e., one or the other), and DOI should be mandatory where available.
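A minimal sketch of that rule, to make the proposal concrete. It assumes a reference is a plain dict with optional `doi` and `url` keys; the `url` key, the `ALLOWED_HOSTS` set, and the function name are hypothetical and not part of PyKED. The "DOI mandatory where available" part is not checked here, since that would need a human reviewer.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; NTRS is the only host named in this thread.
ALLOWED_HOSTS = {"ntrs.nasa.gov"}


def check_reference_identifier(reference):
    """Return "doi" or "url", whichever single identifier the reference uses.

    Raises ValueError if both or neither are given, or if the URL host
    is not on the allowlist.
    """
    has_doi = bool(reference.get("doi"))
    has_url = bool(reference.get("url"))
    # Exactly one of the two identifiers must be present.
    if has_doi == has_url:
        raise ValueError("reference must have exactly one of 'doi' or 'url'")
    if has_url:
        host = urlparse(reference["url"]).hostname
        if host not in ALLOWED_HOSTS:
            raise ValueError(f"URL host {host!r} is not on the allowlist")
        return "url"
    return "doi"
```

With this sketch, the NTRS link above would pass, while a lab group website or a reference carrying both a DOI and a URL would be rejected.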

kyleniemeyer commented 7 years ago

This seems reasonable. There absolutely should be either DOI or URL, though with the latter I don't think we can do any automated validation.

We can test that the URL resolves, but should we hard-code specific domains that are acceptable? Or should that just be when reviewing submissions to the database?
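For the "resolves" test, something like the following might be enough. This is only a sketch using the standard library; the function name and the injectable `opener` parameter (there so the check can be exercised without network access) are my own assumptions, and some servers may not answer HEAD requests.

```python
import urllib.error
import urllib.request


def url_resolves(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if a HEAD request to `url` completes with a 2xx/3xx status."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and timeouts;
        # ValueError covers strings that are not URLs at all.
        return False
```

This only shows that the link is alive at the moment of the check; it says nothing about whether the host is a stable one.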

bryanwweber commented 7 years ago

Maybe let's start with just reviewing the URL on submission, and hard-code the acceptable domains only if that becomes a problem? Either way, we need to list the acceptable URLs somewhere public.

kyleniemeyer commented 7 years ago

OK, so for ChemKED/PyKED, we can add a URL field that is mutually exclusive with DOI; a URL reference should retain authors, year, and detail, and probably add title. The other fields we currently include (pages, volume, journal) are not relevant to it.
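As a rough illustration of that field split, assuming the reference is a dict keyed by field name. Which fields are required versus optional is a guess on my part (title and detail are treated as optional), and the `url` and `title` keys and the function name are hypothetical rather than anything in the current schema.

```python
# Assumed field sets for a URL-based reference; not taken from the ChemKED schema.
URL_REQUIRED = {"url", "authors", "year"}
URL_OPTIONAL = {"detail", "title"}
JOURNAL_ONLY = {"doi", "journal", "volume", "pages"}


def review_url_reference(reference):
    """Return (missing, irrelevant) lists of field names for a URL-based reference."""
    keys = set(reference)
    missing = sorted(URL_REQUIRED - keys)      # required fields that are absent
    irrelevant = sorted(keys & JOURNAL_ONLY)   # journal-only fields that should be dropped
    return missing, irrelevant


# Example: the 1986 NASA report (author surnames only, as given in this thread).
report = {
    "url": "https://ntrs.nasa.gov/search.jsp?R=19860015959",
    "authors": [{"name": "Burcat"}, {"name": "Snyder"}, {"name": "Brabbs"}],
    "year": 1986,
}
missing, irrelevant = review_url_reference(report)
```

Here both lists come back empty; adding a `journal` or `pages` key would put that key in `irrelevant`.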

bryanwweber commented 7 years ago

Before we add the title, let's make sure it won't be a bear to validate if it's provided with DOI. I don't think title is necessarily required even for URL references, although it might help a lot.