ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Code Table Request - Stratigraphy adds for Burke Museum Vert Paleo #6541

Closed Jegelewicz closed 1 year ago

Jegelewicz commented 1 year ago

Goal

Describe what you're trying to accomplish. This is the only necessary step to start this process. The Committee is available to assist with all other steps. Please clearly indicate any uncertainty or desired guidance if you proceed beyond this step.

The Burke Paleo collections are hard at work getting their data cleaned up for bulkload into Arctos and are requesting additions to various stratigraphy code tables.

Context

Describe why this new value is necessary and existing values are not.

There are quite a few stratigraphy terms missing from our code tables that will help them get their data in faithfully.

Table

Code Tables are http://arctos.database.museum/info/ctDocumentation.cfm. Link to the specific table or value. This may involve multiple tables and will control datatype for Attributes. OtherID requests require BaseURL (and example) or explanation. Please ask for assistance if unsure.

ctlithostratigrapy_group
ctlithostratigrapy_formation
ctlithostratigrapy_member
ctlithostratigrapy_informal
ctbiostratigaphic_zone
ctlithostratigrapy_bed

(Apologies if I haven't gotten these exactly right. Arctos is down today and I am making my best guess at the code table names...)

Proposed Value

Proposed new value. This should be clear and compatible with similar values in the relevant table and across Arctos.

See attached file

Proposed Definition

Clear, complete, non-collection-type-specific functional definition of the value. Avoid discipline-specific terminology if possible, include parenthetically if unavoidable.

See attached file

Collection type

Some code tables contain collection-type-specific values. collection_cde may be found from https://arctos.database.museum/home.cfm

N/A

Attribute Extras

Attribute data type

If the request is for an attribute, what values will be allowed? free-text, categorical, or number+units depending upon the attribute (TBA)

N/A

Attribute controlled values

If the values are categorical (to be controlled by a code table), add a link to the appropriate code table. If a new table or set of values is needed, please elaborate.

N/A

Attribute units

If numerical values should be accompanied by units, provide a link to the appropriate units table.

N/A

Priority

Please describe the urgency and/or choose a priority-label to the right. You should expect a response within two working days, and may utilize Arctos Contacts if you feel response is lacking.

Example Data

Requests with clarifying sample data are generally much easier to understand and prioritize. Please attach or link to any representative data, in any form or format, which might help clarify the request.

Available for Public View

Most data are by default publicly available. Describe any necessary access restrictions.

Yes

Helpful Actions

@ArctosDB/arctos-code-table-administrators

Approval

See Special Exemption 2

Implementation

Once all of the Approval Checklist is appropriately checked and there are no Rejection comments, or in special circumstances by decree of the Arctos Working Group, the change may be made.

Close this Issue.

DO NOT modify Arctos Authorities in any way before all points in this Issue have been fully addressed; data loss may result.

Special Exemptions

In very specific cases and by prior approval of The Committee, the approval process may be skipped, and implementation requirements may be slightly altered. Please note here if you are proceeding under one of these use cases.

  1. Adding an existing term to additional collection types may proceed immediately and without discussion, but doing so may also subject users to future cleanup efforts. If time allows, please review the term and definition as part of this step.
  2. The Committee may grant special access on particular tables to particular users. This should be exercised with great caution only after several smooth test cases, and generally limited to "taxonomy-like" data such as International Commission on Stratigraphy terminology.

@Nicole-Ridgwell-NMMNHS if you will review and approve these, I can do the adding. If you have questions or changes to suggest, please just post them here. Thank you in advance for taking this on! (and if you cannot - please let me know).

UWBM.PB.VP.Geology.xlsx

Nicole-Ridgwell-NMMNHS commented 1 year ago

Have these been reviewed for duplicates with the existing code table?

Do they have time to put together definitions? I like to include period, major lithology, and geographic area. If not, do we just go ahead and add them this way?

Most of what is listed in the zone tab should be and is already in biochron.

Jegelewicz commented 1 year ago

Have these been reviewed for duplicates with the existing code table?

I thought so, but perhaps I was mistaken! I will check them once Arctos is back and make a new file.

Do they have time to put together definitions? I like to include period, major lithology, and geographic area.

Once I have the de-duped list, I'll ask them to complete that.

Thanks!

Jegelewicz commented 1 year ago

@KatherineLAnderson see the discussion above.

I did a little matching to the existing code tables and did find a few that are already in Arctos. Those have been greyed out.

Next is a request to add a bit to the descriptions where possible.

I like to include period, major lithology, and geographic area.

I've added columns in the sheets so that you can fill in blanks - these may not be pertinent for all of the strata types, so you can ignore any that aren't. If just writing something in a single column is easier, feel free to do that.

Informal - this all looks like lithostratigraphy. We only have an informal chronostratigraphy code table. Do we need an informal litho table or can these just be entered as geology remarks?

Do you have time to work on this and make the descriptions more complete? @Nicole-Ridgwell-NMMNHS please let me know if you have anything to add.

Here is the updated workbook. UWBM.PB.VP.Geology_v2.xlsx

KatherineLAnderson commented 1 year ago

Dupes - I think I missed the biochrons code table when I pulled all the attribute tables from Arctos which explains why I didn't catch those duplicates. There are a lot of code tables (which is awesome).

Informal - It would ease data entry and searchability to keep these as controlled vocabulary with a designated place to enter/search for them. They are all used regularly in the literature. This is a similar issue to the floras and faunas question that I brought up in the migration thread. I think it might be worth having an informal field. @WaigePilson what do you think?

Descriptions - Do you need descriptions for all the lithostrat, or just the records not in GeoLex and/or Macrostrat? (I ask because a lot of that information is managed/updated in those databases.)

thanks!!

Jegelewicz commented 1 year ago

Informal

We will need to request a new code table, so let's iron out what this should be. I think you will want two different informal tables:

Informal lithostratigraphy
Informal biostratigraphy (or should this just be "floras")

For each, we need to describe what is supposed to go in them as opposed to what is in any of the "formal" lithostratigraphy terms, biochrons, or bio zones so that appropriate decisions can be made when new term requests come along. @Nicole-Ridgwell-NMMNHS should probably be involved and I think we also need better descriptions for the following so that we can make decisions about what is "formal" and what is "informal". I think we have already failed at this with the description for Informal Chronostratigraphy.

  1. What makes a term "informal chronostratigraphy"?
  2. When is a term NOT worthy of being in one of the "formal" lithostratigraphy code tables?
  3. When is a term worthy of being "informal chronostratigraphy" as opposed to "geology remarks"?

| Purpose | Table | Description |
| --- | --- | --- |
| Locality Attribute: Biochronology | ctbiochronology | Controlled vocabulary for Biochrons associated with a locality. |
| Locality Attribute: Biostratigraphic Zones | ctbiostratigraphic_zone | Controlled vocabulary for biostratigraphic zones available as locality attributes. |
| Locality Attribute: Informal Chronostratigraphy | ctchronostrat_informal | Controlled vocabulary for informal chronostratigraphy |
| Locality Attribute: Lithodemic Suites | ctlithodemic_suite | Controlled vocabulary for lithodemic suites available as locality attributes. |
| Locality Attribute: Lithostratigraphic Beds | ctlithostratigraphic_bed | Controlled vocabulary for lithostratigraphic beds available as locality attributes. |
| Locality Attribute: Lithostratigraphic Formations | ctlithostratigraphic_formation | Controlled vocabulary for lithostratigraphic formations available as locality attributes. |
| Locality Attribute: Lithostratigraphic Groups | ctlithostratigraphic_group | Controlled vocabulary for lithostratigraphic groups available as locality attributes. |
| Locality Attribute: Lithostratigraphic Members | ctlithostratigraphic_member | Controlled vocabulary for lithostratigraphic members available as locality attributes. |

dustymc commented 1 year ago

think we have already failed at this

I guess I don't REALLY have any opinions, but that seems right. I think there may be about three categories of "containers" here, whatever the intentions.

  1. Formal taxonomies - things blessed by some official-sounding org or something
  2. Descriptive stuff in expected places (eg whatever the collector said in verbatim locality)
  3. Stuff nobody ever finds because we've made some weird pigeonhole to hide it in
Nicole-Ridgwell-NMMNHS commented 1 year ago

The primary reason we have informal chronostratigraphy is because chronostratigraphy is strictly limited to the international chronostratigraphic chart. We should probably reflect that in the informal chronostratigraphy definition.

How about: Controlled vocabulary for all chronostratigraphy not included in the International Chronostratigraphic Chart.

For the lithostratigraphy it is a different situation because we're not working with one standard set of terms. Lithostratigraphy is much more like taxonomy, and there are rules for naming units as particular ranks: https://stratigraphy.org/guide/litho. I think if we decide to create an informal lithostratigraphy table, it could be defined as "unranked units used in published literature". I've been putting stuff like that in a remark, but it sounds like @KatherineLAnderson has good reason for controlled values. Looking at these 'informal lithostratigraphy' it seems like there are some in between member and bed (ex. Blacks Fork A) and some in between formation and member (ex. Upper Buckley Formation).

Nicole-Ridgwell-NMMNHS commented 1 year ago

Informal biostratigraphy (or should this just be "floras")

I think I missed this discussion.

Jegelewicz commented 1 year ago

I think I missed this discussion.

Sorry about that - see https://github.com/ArctosDB/data-migration/issues/1534#issuecomment-1654759702

Jegelewicz commented 1 year ago

@Nicole-Ridgwell-NMMNHS also, @KatherineLAnderson asked

Descriptions - Do you need descriptions for all the lithostrat, or just the records not in GeoLex and/or Macrostrat? (I ask because a lot of that information is managed/updated in those databases.)

What's your policy?

Nicole-Ridgwell-NMMNHS commented 1 year ago

I started including a description for everything added individually after we migrated, even if it has a Geolex/Macrostrat link BUT doing that in bulk would be very time consuming. Nothing added before that had a description and I definitely haven't had time to go back and add descriptions.

Jegelewicz commented 1 year ago

I definitely haven't had time to go back and add descriptions.

Sounds like a nice little project for a volunteer or an intern....

KatherineLAnderson commented 1 year ago

I started including a description for everything added individually after we migrated, even if it has a Geolex/Macrostrat link BUT doing that in bulk would be very time consuming. Nothing added before that had a description and I definitely haven't had time to go back and add descriptions.

If these descriptions could wait to be tackled later, we would be very appreciative!

KatherineLAnderson commented 1 year ago

Paige, Ron and I met this morning and the consensus was:

And in the Arctos Tea discussion, Nicole waived the need for descriptions for the new lithostrat and chronostrat units we asked to be added. And Teresa again mentioned this would be a great project for a future Arctos intern.

Nicole-Ridgwell-NMMNHS commented 1 year ago

Let's start a separate code table request for the informal lithostratigraphy table, but I approve of going ahead and adding everything else.

Jegelewicz commented 1 year ago

Groups have been added except

D1 Sequence - https://doi.org/10.2113/gsrocky.37.2.111

because of the name not including "Group". Just want to make sure that this makes sense here?

Jegelewicz commented 1 year ago

@KatherineLAnderson Members added except

| member | Reason |
| --- | --- |
| Parachute Creek  Member - https://ngmdb.usgs.gov/Geolex/Units/ParachuteCreek_9761.html | There are two spaces between Creek and member in the file. Same as Parachute Creek Member [ link ] |
| Smoky Hill Chalk Member - https://ngmdb.usgs.gov/Geolex/Units/SmokyHill_10308.html | same as Smokey Hill Chalk Member [ link ]? |
| Sonsela Member - https://en.wikipedia.org/wiki/Chinle_Formation | same as Sonsela Sandstone Member [ link ]? |

I need to add metadata after the formations are in.
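
Checks like the ones in the member list above (an extra space, or Smoky vs. Smokey) can be partly automated before a bulk add. The following is a minimal Python sketch, not an Arctos tool: the function names `normalize` and `find_near_duplicates` and the 0.9 similarity cutoff are assumptions made for illustration, and the term lists are copied from this comment.

```python
import difflib
import re


def normalize(term: str) -> str:
    """Collapse runs of whitespace and lowercase the term for comparison."""
    return re.sub(r"\s+", " ", term).strip().lower()


def find_near_duplicates(new_terms, existing_terms, cutoff=0.9):
    """Return {new_term: [similar existing terms]} for terms worth a manual look."""
    # Map each normalized form back to the value as stored in the code table.
    existing_by_norm = {normalize(t): t for t in existing_terms}
    flagged = {}
    for term in new_terms:
        matches = difflib.get_close_matches(
            normalize(term), list(existing_by_norm), n=3, cutoff=cutoff
        )
        if matches:
            flagged[term] = [existing_by_norm[m] for m in matches]
    return flagged


# Values already in the code table (assumed) and values proposed in the file.
existing = ["Parachute Creek Member", "Smokey Hill Chalk Member", "Sonsela Sandstone Member"]
proposed = ["Parachute Creek  Member", "Smoky Hill Chalk Member", "Sonsela Member"]

for term, similar in find_near_duplicates(proposed, existing).items():
    print(repr(term), "->", similar)
```

With these inputs the first two proposed members are flagged. A pair that differs by a whole word, such as Sonsela Member and Sonsela Sandstone Member, scores below the cutoff and still needs a human decision.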

Jegelewicz commented 1 year ago

In the attached file, these are listed as Biostratigraphic Zones, but I think they should be Biochrons?

@KatherineLAnderson @Nicole-Ridgwell-NMMNHS

Jegelewicz commented 1 year ago

Formation potential matches

| formation | Reason |
| --- | --- |
| Dakota Formation - https://ngmdb.usgs.gov/Geolex/Units/Dakota_7833.html | Dakota Sandstone Formation [ link ] |
| Greenhorn Formation - https://ngmdb.usgs.gov/Geolex/Units/Greenhorn_8485.html | Greenhorn Limestone Formation [ link ] |
| Moreno Formation - https://ngmdb.usgs.gov/Geolex/Units/Moreno_11379.html | Moreno Hill Formation [ link ] |
| Posidonienschiefer Formation - https://macrostrat.org/sift/#/strat_name_concept/39006 | Posidonia Shale Formation [ link ] |
| Santa Margarita Sandstone Formation - https://ngmdb.usgs.gov/Geolex/Units/SantaMargarita_11775.html | Santa Margarita Formation [ link ] |

Just checking in on these because they seem like potential duplicates - in the case of Posidonienschiefer Formation, they are.

Jegelewicz commented 1 year ago

Duplicates in the file

| Term | description |
| --- | --- |
| Manning Canyon Formation | https://ngmdb.usgs.gov/Geolex/Units/ManningCanyon_5979.html |
| Manning Canyon Shale Formation | https://ngmdb.usgs.gov/Geolex/Units/ManningCanyon_5979.html |
| Winthrop Formation | https://ngmdb.usgs.gov/Geolex/Units/Winthrop_12294.html |
| Winthrop Sandstone Formation | https://ngmdb.usgs.gov/Geolex/Units/Winthrop_12294.html |

Which name should I use?
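
Pairs like these, where two names in the same file cite the same Geolex page, can be found by grouping terms by their reference URL. A minimal Python sketch under that assumption follows; `terms_sharing_a_source` is a hypothetical helper, and the rows are taken from this thread.

```python
from collections import defaultdict


def terms_sharing_a_source(rows):
    """Group (term, url) pairs by URL and keep URLs cited by more than one term."""
    by_url = defaultdict(list)
    for term, url in rows:
        by_url[url.strip()].append(term)
    return {url: terms for url, terms in by_url.items() if len(terms) > 1}


rows = [
    ("Manning Canyon Formation", "https://ngmdb.usgs.gov/Geolex/Units/ManningCanyon_5979.html"),
    ("Manning Canyon Shale Formation", "https://ngmdb.usgs.gov/Geolex/Units/ManningCanyon_5979.html"),
    ("Winthrop Formation", "https://ngmdb.usgs.gov/Geolex/Units/Winthrop_12294.html"),
    ("Winthrop Sandstone Formation", "https://ngmdb.usgs.gov/Geolex/Units/Winthrop_12294.html"),
    ("Dakota Formation", "https://ngmdb.usgs.gov/Geolex/Units/Dakota_7833.html"),
]

for url, terms in terms_sharing_a_source(rows).items():
    print(url, terms)
```

A shared URL only shows that two names point at the same reference page; which name to keep is still a curatorial choice.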

Jegelewicz commented 1 year ago

@Nicole-Ridgwell-NMMNHS do you have a template for loading the metadata?

KatherineLAnderson commented 1 year ago

Groups have been added except

D1 Sequence - https://doi.org/10.2113/gsrocky.37.2.111

because of the name not including "Group". Just want to make sure that this makes sense here?

I included it in Group because it's a broad category on the level of a Group, but you're right it's not a "Group". If we are creating an informal lithostrat code table then this could go there.

@KatherineLAnderson Members added except

| member | Reason |
| --- | --- |
| Parachute Creek Member - https://ngmdb.usgs.gov/Geolex/Units/ParachuteCreek_9761.html | There are two spaces between Creek and member in the file. Same as Parachute Creek Member [ link ] |
| Smoky Hill Chalk Member - https://ngmdb.usgs.gov/Geolex/Units/SmokyHill_10308.html | same as Smokey Hill Chalk Member [ link ]? |
| Sonsela Member - https://en.wikipedia.org/wiki/Chinle_Formation | same as Sonsela Sandstone Member [ link ]? |

I need to add metadata after the formations are in.

"Smoky Hill Chalk Member" is the correct spelling. The macrostrat link provided for "Smokey Hill Chalk Member" spells it Smoky. Sonsela Member is now the accepted terminology, per the literature (and NPS preference). Sonsela Sandstone Bed was previously used to refer to the same packet of rocks when it was considered part of the Petrified Forest Member.

In the attached file, these are listed as Biostratigraphic Zones, but I think they should be Biochrons?

@KatherineLAnderson @Nicole-Ridgwell-NMMNHS

Yes, that was an error on my end--I didn't export the biochron code table when I did my check.

KatherineLAnderson commented 1 year ago

Formation potential matches

| formation | Reason |
| --- | --- |
| Dakota Formation - https://ngmdb.usgs.gov/Geolex/Units/Dakota_7833.html | Dakota Sandstone Formation [ link ] |
| Greenhorn Formation - https://ngmdb.usgs.gov/Geolex/Units/Greenhorn_8485.html | Greenhorn Limestone Formation [ link ] |
| Moreno Formation - https://ngmdb.usgs.gov/Geolex/Units/Moreno_11379.html | Moreno Hill Formation [ link ] |
| Posidonienschiefer Formation - https://macrostrat.org/sift/#/strat_name_concept/39006 | Posidonia Shale Formation [ link ] |
| Santa Margarita Sandstone Formation - https://ngmdb.usgs.gov/Geolex/Units/SantaMargarita_11775.html | Santa Margarita Formation [ link ] |

Just checking in on these because they seem like potential duplicates - in the case of Posidonienschiefer Formation, they are.

Duplicates in the file

| Term | description |
| --- | --- |
| Manning Canyon Formation | https://ngmdb.usgs.gov/Geolex/Units/ManningCanyon_5979.html |
| Manning Canyon Shale Formation | https://ngmdb.usgs.gov/Geolex/Units/ManningCanyon_5979.html |
| Winthrop Formation | https://ngmdb.usgs.gov/Geolex/Units/Winthrop_12294.html |
| Winthrop Sandstone Formation | https://ngmdb.usgs.gov/Geolex/Units/Winthrop_12294.html |

Which name should I use?

This is getting at something that I have some confusion about. Perhaps @Nicole-Ridgwell-NMMNHS or @WaigePilson can also join in with their thoughts. What do we do when we have variations in the name of the same formation, all of which are listed in GeoLex and/or MacroStrat? I think the Greenhorn formation is a good example. On GeoLex, they list the Greenhorn formation, the Greenhorn Limestone formation, and the Greenhorn Shale formation (https://ngmdb.usgs.gov/Geolex/Units/Greenhorn_8485.html). Without doing a deep dive into the literature, it is unclear which is the accepted name (and that also might vary depending on what you read/who you ask). Greenhorn formation is the most general and presumably inclusive of both limestone and shale within, but then the other two with lithologies in the name are more specific and contain important information. Do we err on the side of being more general and add Greenhorn formation to the code table and relegate the other two to remarks, or do we err on the side of specificity and add all of the names considering they may refer to different things? Curious to know everyone's thoughts.

Nicole-Ridgwell-NMMNHS commented 1 year ago

variations in the name of the same formation

I think I usually use either 1) whatever is used in the most states or the most recent usage per Geolex or 2) whatever version my data uses. However, it would probably be good to decide on some guideline for this. I'm inclined toward whatever is used in the most states per Geolex, but am also ok with just using the more general version when there is more than one in use. We could also just dispense with both lithologic designation and rank since rank is in the table/attribute name, although that might be a lot of work to update the table. But it is the solution both Geolex and Macrostrat use.

do we err on the side of specificity and add all of the names

I think this route would result in more confusion (someone doesn't realize there is more than one version in the table, adds whatever version is first). Variations could be added to remarks, although I think in a lot of collections data they're used too inconsistently to even begin to infer the formation boundaries the collector was using.

The exception I think is things that are more than one rank because there are separate tables for each rank.

WaigePilson commented 1 year ago

@Nicole-Ridgwell-NMMNHS @KatherineLAnderson I think I agree with Nicole, use the name which is most common/recent if possible and relegate other names to remarks.

Admittedly, this will be somewhat subjective as Geolex doesn't always list usages in the literature and we aren't all experts in the hundreds of formations which appear in our data. Plus, sometimes the most common/widespread name is hard to say. Greenhorn is a great example, as "Greenhorn Limestone of Colorado Group" is the most widely used (10 states) but "Greenhorn Formation" is arguably the most generic, and should probably be the one we use.

In cases where the name was previously applied at a different rank (e.g., Greenhorn also lists a Greenhorn Limestone Member of Colorado Shale) I would say we still try to avoid adding this as a member to that code table and instead update our records to "Greenhorn Formation"--however, this is only possible when we are familiar with the formations (which we aren't always).

KatherineLAnderson commented 1 year ago

I think the most straightforward solution would be to strip lithologies and use the most generic name, then add the lithology to remarks. Exceptions would be where there is a well-supported argument that a name with lithology included is the accepted name in the literature (e.g., Pierre Shale Formation).

My concern about going with the name used in the most states is, for example, let's say I have a sample from the "Greenhorn Shale". The sample is indeed a piece of shale. But "Greenhorn Limestone" is used by the most states, thus my piece of shale would now be assigned to a formation with limestone in the name, which I think is confusing (although maybe it isn't problematic to geologists?). Using "Greenhorn formation" would be, as Paige said, the most generic, and I agree it is probably the one we should use.

There are also examples where the accepted name of a formation previously had a lithology included, but now the accepted name does not include the lithology (e.g., the Niobrara Formation used to be called the Niobrara Chalk Formation).

KatherineLAnderson commented 1 year ago

@Nicole-Ridgwell-NMMNHS @WaigePilson Any more thoughts on this?

WaigePilson commented 1 year ago

I think there is a general consensus that we should use the most generic formation name when adding new formations to this code table (e.g., "Greenhorn Formation"), and if the original documentation uses a more specific name (e.g., "Greenhorn Limestone Formation") include that in remarks or some other field for future reference.
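
The rule described here (strip the lithology, keep the original name for remarks, allow exceptions such as the Pierre Shale Formation mentioned earlier in the thread) could be sketched roughly as below. This is only an illustration in Python: `LITHOLOGY_WORDS`, `KEEP_AS_IS`, and `generic_name` are hypothetical names, and the word list is an assumption that would need review by the people who know the formations.

```python
# Hypothetical word list; a real one would need curator review.
LITHOLOGY_WORDS = {"sandstone", "limestone", "shale", "chalk"}

# Names whose lithology is part of the accepted name (example from this thread).
KEEP_AS_IS = {"Pierre Shale Formation"}


def generic_name(name: str):
    """Return (suggested code table value, remark or None) for a formation name."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    if cleaned in KEEP_AS_IS:
        return cleaned, None
    kept = [w for w in cleaned.split(" ") if w.lower() not in LITHOLOGY_WORDS]
    generic = " ".join(kept)
    if generic == cleaned:
        return cleaned, None
    # Preserve the more specific name so it can be recorded in remarks.
    return generic, f"original name: {cleaned}"


for original in ["Greenhorn Limestone Formation", "Greenhorn Formation", "Pierre Shale Formation"]:
    print(original, "->", generic_name(original))
```

The output is a suggestion for manual review, not a decision. For example, this thread settles on keeping Posidonia Shale Formation with its lithology, which the sketch would only do if that name were added to the exception set.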

I think we've also reached a consensus that "floras" and "faunas" will go into biozones (full discussion here)

The conversation about adding an informal lithostratigraphy table kind of dropped off--@KatherineLAnderson did you start a separate thread for this?

Is there a more up to date list of our additions than UWBM.PB.VP.Geology_v2.xlsx? Do we need to do any more cleanup of this data before these values can be added to the appropriate code tables, or is this ready to go?

Note, I will likely have additions to this list of values to add to the geology code tables in the near future as I'm beginning to format our Locality data for import.

Jegelewicz commented 1 year ago

Everything from the file has been added except

D1 Sequence - going to informal litho?
Dakota Formation - https://github.com/ArctosDB/arctos/issues/6702
Greenhorn Formation - https://github.com/ArctosDB/arctos/issues/6703
Moreno Formation - https://github.com/ArctosDB/arctos/issues/6704
Posidonienschiefer Formation - https://macrostrat.org/sift/#/strat_name_concept/39006 Posidonia Shale Formation [ link ] - I don't know what you guys want to do?
Santa Margarita Sandstone Formation - https://ngmdb.usgs.gov/Geolex/Units/SantaMargarita_11775.html - Santa Margarita Formation [ link ] already in Arctos
Manning Canyon Shale Formation https://ngmdb.usgs.gov/Geolex/Units/ManningCanyon_5979.html - use Manning Canyon Formation
Winthrop Sandstone Formation https://ngmdb.usgs.gov/Geolex/Units/Winthrop_12294.html - use Winthrop Formation

And everything on the "informal" tab as we haven't opened an issue for the informal litho attribute request. https://github.com/ArctosDB/arctos/issues/6541#issuecomment-1665847130

KatherineLAnderson commented 1 year ago

D1 Sequence - going to informal litho?

Yes!

Posidonia Shale Formation [ link ] - I don't know what you guys want to do?

Posidonia Shale Formation works. I can modify it on our end; our data uses the German name, but it means the same thing.

Thanks Teresa!

KatherineLAnderson commented 1 year ago

The conversation about adding an informal lithostratigraphy table kind of dropped off--@KatherineLAnderson did you start a separate thread for this?

I did not, but I can.

Jegelewicz commented 1 year ago

I think we are done here - everything hanging has its own issue. Closing.

KatherineLAnderson commented 1 year ago

I have a handful of biozones that were not added-- or I'm not seeing them in the code tables/looking in the wrong place.

Cistecephalus Assemblage Zone | https://en.wikipedia.org/wiki/Cistecephalus_Assemblage_Zone
Cynognathus Assemblage Zone | https://en.wikipedia.org/wiki/Cynognathus_Assemblage_Zone
Lystrosaurus Assemblage Zone | https://en.wikipedia.org/wiki/Lystrosaurus_Assemblage_Zone
Tapinocephalus Assemblage Zone | https://en.wikipedia.org/wiki/Tapinocephalus_Assemblage_Zone
Tropidostoma Assemblage Zone | https://en.wikipedia.org/wiki/Tropidostoma_Assemblage_Zone

Can these be added? @Jegelewicz @Nicole-Ridgwell-NMMNHS

Jegelewicz commented 1 year ago

I need guidance as to whether these belong in biochron or biostratigraphic zone, and either way their format will need to change to fit the current terms in the table.

I'll let @Nicole-Ridgwell-NMMNHS weigh in.

KatherineLAnderson commented 1 year ago

I need guidance as to whether these belong in biochron or biostratigraphic zone

Biostratigraphic zone is more appropriate!

Nicole-Ridgwell-NMMNHS commented 1 year ago

Biostratigraphic zone is more appropriate!

Agreed.

Jegelewicz commented 1 year ago

OK - code table issues have been submitted for those. closing this.