The attribution of sources to the records in the distribution extension differs from the approach used in the other dwc files. In the case of the distribution extension, a reference (in input_literature_references.csv) is attributed to a specific combination of field_name (the column name in the raw data file) x id_sp_region (the id of a taxon in a particular region) in input_distribution:
| id_sp_region | field_name | reference |
| --- | --- | --- |
| 2365 | ecoimpact_id | aaa et al. (xxx) |
| 2365 | first_introduction | bbb et al. (yyyy) |

(example from input_literature_references)
The field_names as given in input_literature_references are:
current_distrib
current_distribution
distribution
ecoimpact_id
ecological impact
first observation
general references
history_is_known
impact on uses
introduction dates
introduction history
status
useimpact_id
The problem is that these field names do not correspond at all to the field names in input_distribution. On top of that, some field_names are very similar, such as current_distrib and current_distribution.
This is what I suggest:
Extract the references for current_distrib, current_distribution, ecoimpact_id, ecological impact, first observation, impact on uses, status and useimpact_id, and link them to the corresponding piece of information in the description extension (all of these terms are mapped in the description extension).
Use the references for distribution, general references, introduction dates and introduction history as references in the distribution extension, as they refer to fields included in this extension. They would be combined into a single concatenated string, using | as a separator.
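To make the suggestion concrete, here is a minimal Python sketch of the split, assuming the rows of input_literature_references have already been read into dictionaries with the keys id_sp_region, field_name and reference. The function name `split_references`, the two set names, the exact spacing of the separator and the sample rows are hypothetical; only the field names come from the list above.

```python
from collections import defaultdict

# Field names as listed in input_literature_references.
# Group 1: references to link to the description extension.
DESCRIPTION_FIELDS = {
    "current_distrib", "current_distribution", "ecoimpact_id",
    "ecological impact", "first observation", "impact on uses",
    "status", "useimpact_id",
}
# Group 2: references to concatenate for the distribution extension.
DISTRIBUTION_FIELDS = {
    "distribution", "general references",
    "introduction dates", "introduction history",
}
# Assumption: "|" padded with spaces; drop the spaces if a bare "|" is wanted.
SEPARATOR = " | "


def split_references(rows):
    """Split reference rows into description rows and distribution strings.

    Returns (description_rows, distribution_sources), where
    distribution_sources maps id_sp_region to one concatenated string.
    Rows with a field_name in neither group are left out of both results.
    """
    description_rows = []
    refs_per_region = defaultdict(list)
    for row in rows:
        field = row["field_name"]
        if field in DESCRIPTION_FIELDS:
            description_rows.append(row)
        elif field in DISTRIBUTION_FIELDS:
            refs = refs_per_region[row["id_sp_region"]]
            # Keep each reference once, in order of first appearance.
            if row["reference"] not in refs:
                refs.append(row["reference"])
    distribution_sources = {
        region: SEPARATOR.join(refs) for region, refs in refs_per_region.items()
    }
    return description_rows, distribution_sources


# Placeholder rows for illustration only (not real data).
rows = [
    {"id_sp_region": "2365", "field_name": "ecoimpact_id", "reference": "aaa et al. (xxx)"},
    {"id_sp_region": "2365", "field_name": "distribution", "reference": "ccc et al. (zzzz)"},
    {"id_sp_region": "2365", "field_name": "general references", "reference": "ddd et al. (wwww)"},
]
description_rows, distribution_sources = split_references(rows)
print(distribution_sources["2365"])
```

Note that history_is_known is in neither group in the suggestion above, so this sketch simply skips it; where those references should go is still open.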
@DavidRoy