johnvanbreda closed 6 years ago
Reply from Martin:
Firstly we need to ensure that particular datasets can be assigned to the correct Data Partner (BRC or the recording schemes, as appropriate). As long as that is done we can probably be more relaxed about the dataset name, which could default to “Records collated via iRecord” or “Records collated via iRecord at BRC”. Unless we feel that the full survey name (e.g. “YNU | Terrestrial”) should be retained in order to provide recognition to partners who use Indicia?
I think that datasetName needs to be the name of the dataset as it is on the Atlas.
In the current test export format, datasetName is being populated with the survey name (e.g. “YNU | Terrestrial”). If NBN require this field to contain the dataset name as displayed on the Atlas, then this requires:
The survey name within iRecord does have value in acknowledging the record's entry point, which may have been one of multiple websites, for example. But I can't find anything in the DwC term list that is a good match for this.
The oddly named collectionCode field might be usable for the iRecord survey name unless you're already using that field for something else (may collide with NE project codes). http://rs.tdwg.org/dwc/terms/#collectionCode
The NBN's guidance notes list it as usable for:
The name, acronym or code identifying the collection or data set from which the record was derived.
As things stand we are proposing to use collectionCode for the NE project codes.
The difficulty with iRecord's survey name is that it sometimes indicates a separate source website or project, and sometimes just a fairly trivial distinction of data structure within a single website or project. At the moment I feel it would be a nice-to-have feature rather than a critical one.
The Atlas requires the datasetName to be the name of the data set on the Atlas.
We use collectionCode for the SurveyKey from Recorder6 and MarineRecorder.
Although I agree with Martin's concern about the survey name being somewhat arbitrary, wouldn't the same apply to collectionCode as extracted from Recorder 6? Survey subdivisions are very much arbitrary. Using collectionCode for the iRecord survey dataset name (that we are currently outputting in datasetName) might be more widely useful than using this field specifically for NE Project codes and would be more or less synonymous with the way that Recorder6 and Marine Recorder are doing this. Could the NE project codes be output as a dynamicProperty perhaps?
I'm not sure the suggested use of dynamicProperties really fits the NE codes: "A list of additional measurements, facts, characteristics, or assertions about the record. Meant to provide a mechanism for structured content." The NE codes are semi-structured, but will include an option for there being no code available, and are not really facts or assertions about the record as such.
How about datasetID - are we using that for anything else? http://rs.tdwg.org/dwc/terms/datasetID
The NE project codes seem to me to match this term's definition: "An identifier for the set of data. May be a global unique identifier or an identifier specific to a collection or institution."
At the moment the datasetID is the original Gateway dataset ID, which we probably don't need anymore. It could work for the NE project codes.
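To make the mapping proposed in the last few comments concrete, here is a rough Python sketch. It is not the code in the develop branch. The function name, its parameters, and the choice to omit datasetID when there is no NE project code are all assumptions made for illustration.

```python
from typing import Optional


def dwc_dataset_fields(
    atlas_dataset_name: str,
    survey_title: str,
    ne_project_code: Optional[str] = None,
) -> dict:
    """Map source values onto the three Darwin Core terms discussed above.

    Hypothetical helper: the name and signature are illustrative only.
    """
    fields = {
        # Name of the dataset as it is on the Atlas.
        "datasetName": atlas_dataset_name,
        # iRecord survey name, e.g. "YNU | Terrestrial".
        "collectionCode": survey_title,
    }
    # Assumption: when no NE project code is available, the term is left out
    # rather than filled with a placeholder value.
    if ne_project_code:
        fields["datasetID"] = ne_project_code
    return fields


row = dwc_dataset_fields("Records collated via iRecord", "YNU | Terrestrial")
```

Whether a missing NE code should be dropped, left blank, or given an explicit "no code" value is still an open choice. The sketch just picks the simplest option.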
Done in develop branch:
Part of #304.
Would this be just the website title (e.g. iRecord) or include the survey dataset (iRecord General records)?