EarthLifeConsortium / DwC-Mapping

Crosswalk for mapping Neotoma & PaleoDB to DarwinCore Standards for export to GBIF.
MIT License

What is the `type` field? #1

Open SimonGoring opened 8 years ago

SimonGoring commented 8 years ago

The `type` field for the Occurrence is raising some questions. Do we use the controlled vocabulary from Dublin Core, as listed at the link below, or is Occurrence an accepted value? And how do we record actual physical specimens?

http://tools.gbif.org/dwca-validator/extension.do?id=dwc:Occurrence

SimonGoring commented 8 years ago

On the Neotoma side (e.g., @SimonGoring or @IceAgeEcologist) we had elected to use "Dataset", which is part of the Dublin Core controlled vocabulary, but as @mmcclenn pointed out, an "Occurrence" type may be a better fit.

@tucotuco, any chance you might weigh in on this?

tucotuco commented 8 years ago

Hi folks,

The Dublin Core type vocabulary applies strictly to the dcterms:type term. Occurrence is a Darwin Core class and not a valid value for that term. To be more specific than dcterms:type allows, Darwin Core has the basisOfRecord term (http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord), whose range includes the Darwin Core classes PreservedSpecimen, LivingSpecimen, FossilSpecimen, MaterialSample, MachineObservation, and HumanObservation. The range of the dcterms:type term includes Event, Dataset, PhysicalObject, Sound, StillImage, MovingImage, and Text. For fossil specimens, then, we would use the combination of dcterms:type=PhysicalObject and dwc:basisOfRecord=FossilSpecimen. By contrast, if a biodiversity data record were taken out of a manuscript (no physical specimen), we would use dcterms:type=Text and dwc:basisOfRecord=HumanObservation.
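
To make the two vocabularies concrete, here is a minimal Python sketch that checks a record's term pair against the values listed above. The set names and the `check_terms` helper are hypothetical, and the sets hold only the values named in this comment, not the full controlled vocabularies.

```python
# Values as listed in the comment above; not the complete vocabularies.
DCTERMS_TYPE_VALUES = {
    "Event", "Dataset", "PhysicalObject", "Sound",
    "StillImage", "MovingImage", "Text",
}
BASIS_OF_RECORD_VALUES = {
    "PreservedSpecimen", "LivingSpecimen", "FossilSpecimen",
    "MaterialSample", "MachineObservation", "HumanObservation",
}


def check_terms(record):
    """Return a list of problems found in a record's type terms (hypothetical helper)."""
    problems = []
    if record.get("dcterms:type") not in DCTERMS_TYPE_VALUES:
        problems.append("dcterms:type is not a listed Dublin Core type")
    if record.get("dwc:basisOfRecord") not in BASIS_OF_RECORD_VALUES:
        problems.append("dwc:basisOfRecord is not a listed Darwin Core class")
    return problems


# The fossil combination described above passes both checks.
fossil = {"dcterms:type": "PhysicalObject", "dwc:basisOfRecord": "FossilSpecimen"}
# "Occurrence" is a Darwin Core class, so it fails the dcterms:type check.
wrong = {"dcterms:type": "Occurrence", "dwc:basisOfRecord": "FossilSpecimen"}

print(check_terms(fossil))  # []
print(check_terms(wrong))
```

Keeping the two checks separate mirrors the point that dcterms:type and dwc:basisOfRecord draw on different vocabularies.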

Does that help?

SimonGoring commented 8 years ago

Thanks @tucotuco, I think that makes sense. Makes it a bit more complicated for us though :)

I think we need to split the type and basis of record by datasettype then. I wonder what we should do about records in the PaleoDB that have come in from DeepDive, or that were entered from published records.

I assume most fossil occurrences come from physical specimens, as do our pollen records (for example), but some must come from only a text record without knowledge of the actual physical specimen. Is that correct, @mmcclenn, @cambro, or @markuhen?

So we'd need some sort of scheme to differentiate between:

dcterms:type=PhysicalObject and dwc:basisOfRecord=FossilSpecimen -vs- dcterms:type=Text and dwc:basisOfRecord=HumanObservation
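
One possible shape for such a scheme, as a rough sketch: a function that picks the term pair from a per-record flag. The `from_physical_specimen` flag and the `type_terms` function are hypothetical; this thread does not establish that Neotoma or the PaleoDB stores such a field.

```python
def type_terms(from_physical_specimen: bool) -> dict:
    """Choose the Dublin Core / Darwin Core term pair for one record.

    `from_physical_specimen` is an assumed flag: True when the record
    traces back to a physical specimen, False when it was taken only
    from a text source.
    """
    if from_physical_specimen:
        return {"dcterms:type": "PhysicalObject",
                "dwc:basisOfRecord": "FossilSpecimen"}
    return {"dcterms:type": "Text",
            "dwc:basisOfRecord": "HumanObservation"}


print(type_terms(True))
print(type_terms(False))
```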

Do I have the gist of this right?

tucotuco commented 8 years ago

To me, yes, that is the correct use of the type fields.

markuhen commented 8 years ago

All,

Yes. For the most part, all PBDB records are ultimately based on a physical specimen. I could imagine that some could be characterized as observations, but very few. For instance, I'm thinking of dinosaur footprints that are left in the field. They are still based on a specimen though, even though it wasn't collected and placed in a repository.

Thanks,

Mark

Mark D. Uhen Assistant Professor George Mason University AOES Geology MSN 6E2 Fairfax, VA 22030 Phone: 703-993-5264 Fax: 703-993-3535

SimonGoring commented 8 years ago

I agree with that. I'm just wondering whether we need to distinguish between records that were entered where we know the specimen or its general location (a museum, for example) and records that were pulled in programmatically.

markuhen commented 8 years ago

Simon,

Yes. I think we need to distinguish between these two types of records. There are LOTS of reasons to want to know the difference, and it's the reasons I can't think of that make me want to keep track of the different types!
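
A minimal sketch of how that distinction could be tracked, assuming a per-record provenance field. The `EntryMethod` enum, its member names, and the `OccurrenceRecord` class are all hypothetical, not an existing PBDB or Neotoma schema, and the records are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class EntryMethod(Enum):
    # Hypothetical categories for how a record entered the database.
    MANUAL_KNOWN_SPECIMEN = "manual_known_specimen"
    PROGRAMMATIC = "programmatic"


@dataclass
class OccurrenceRecord:
    taxon: str
    entry_method: EntryMethod


# Made-up example records, for illustration only.
records = [
    OccurrenceRecord("Picea", EntryMethod.MANUAL_KNOWN_SPECIMEN),
    OccurrenceRecord("Basilosaurus", EntryMethod.PROGRAMMATIC),
    OccurrenceRecord("Quercus", EntryMethod.MANUAL_KNOWN_SPECIMEN),
]

# Tally records by how they were entered.
counts = Counter(r.entry_method for r in records)
print(counts[EntryMethod.MANUAL_KNOWN_SPECIMEN])
print(counts[EntryMethod.PROGRAMMATIC])
```

Storing the entry method on every record keeps the two kinds separable later, even for uses nobody has thought of yet.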

Thanks much,

Mark
