Closed peterdesmet closed 6 years ago
Could we use the term dcterms:references for these items? The definition is "A related resource that is referenced, cited, or otherwise pointed to by the described resource."
There is no need for that, and it's a bit of a misuse of that field: references is supposed to be that taxon, but described in more detail. A good example for occurrences would be the URL of an iNaturalist observation for an occurrence record of that observation.
source. There we opted to only include a URL (not the full reference), concatenated with |. But we could opt to add the full reference if you want. See for example how we did it for RINSE pathways: https://github.com/trias-project/rinse-pathways-checklist/tree/master/data/processed
The question here is: if we continue to do 2 (add URL only), it would probably be easier to extract the URL from the full reference in the spreadsheet using a script, rather than maintain both a full reference (source) and a URL field (identifier). Let me know how you see this.
I'm not clear on all the repercussions of the options. For the sake of consistency I prefer the solution in the RINSE dataset, but I'm not really following you on extracting from the full reference. Couldn't the full reference be to the observation and not the description of the distribution?
There are indeed more than a few records for which the source is effectively a URL to an observation.
I have removed the field identifier, as almost everywhere the URL was already included (often as part of a full reference) in the field source. Where that was not the case (often because the DOI was not written as a URL), I have updated the information in source.
@timadriaens @qgroom please write any URL in the field source (including DOIs) with http:// (or https:// for DOIs)!
I'll update the script to handle this info.
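To make the "write DOIs as URLs" request concrete, here is a minimal Python sketch of how bare DOIs in source could be rewritten in URL form. It is not the project's script: the regex, the function name `write_dois_as_urls` and the example references are assumptions for illustration only.

```python
import re

# Assumed pattern for a bare DOI: an optional "doi:" prefix, then "10.",
# a 4-9 digit registrant code, "/", and a suffix up to the next whitespace.
# The lookbehind skips DOIs that are already part of a URL or a longer token.
BARE_DOI = re.compile(r"(?<![/\w.])(?:doi:\s*)?(10\.\d{4,9}/\S+)", re.IGNORECASE)


def write_dois_as_urls(source):
    """Rewrite bare DOIs in a free-text reference as https://doi.org/ URLs."""
    return BARE_DOI.sub(lambda m: "https://doi.org/" + m.group(1), source)


if __name__ == "__main__":
    # Hypothetical reference, not taken from the spreadsheet.
    print(write_dois_as_urls("Smith J (2015) Some title. doi: 10.1000/xyz123"))
```

References that already contain the DOI as a URL are left unchanged, so the function can be applied to every row without checking first.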
I noticed there is an (undocumented) identifier field added to the spreadsheet, which seems to be the URL the source might contain, e.g. a link to a pdf or a DOI. The field is used for the source of the distribution.
I think it would be better to drop the field and only keep source. That way only one field has to be maintained, given that you try to always write URLs (including for DOIs) in full and at the end. @qgroom @timadriaens what do you think?
For the source of the distribution, we can then try to extract the URL using regex in the script.
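As a rough illustration of the regex idea, here is a minimal Python sketch (the project's own script may well look different). The pattern, the function names `extract_urls` and `urls_to_field`, and the " | " separator are assumptions; the only thing taken from this thread is that URLs are written in full with http:// or https://.

```python
import re

# Assumed pattern: a URL starts with http:// or https:// and runs up to
# the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(source):
    """Return all URLs found in a free-text reference, in order of appearance."""
    urls = []
    for match in URL_PATTERN.findall(source):
        # Drop punctuation that often trails a URL at the end of a citation.
        urls.append(match.rstrip(".,;)"))
    return urls


def urls_to_field(source):
    """Join the extracted URLs with ' | '; returns '' when no URL is present."""
    return " | ".join(extract_urls(source))


if __name__ == "__main__":
    # Hypothetical reference, not taken from the spreadsheet.
    ref = "Smith J (2015) Some title. Journal 1: 1-10. https://doi.org/10.1000/xyz123."
    print(urls_to_field(ref))
```

One limitation of this sketch: stripping trailing punctuation would also truncate a URL that legitimately ends in one of those characters, so writing the URL at the very end of the reference without a closing period keeps the extraction unambiguous.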