ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Feature Request - code table consolidation: chronostratigraphy #7404

Open Jegelewicz opened 8 months ago

Jegelewicz commented 8 months ago

Issue Documentation is http://handbook.arctosdb.org/how_to/How-to-Use-Issues-in-Arctos.html

Is your feature request related to a problem? Please describe. A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

There are currently 6 code tables for recording "formal" chronostratigraphy. As I have been working with a lot of paleo collections, I started to wonder if we really need this, especially in light of our newfound ability to add metadata in code tables.

Describe what you're trying to accomplish A clear and concise overview of the goals; why are you asking for this?

Reduce the number of code tables, make search and data entry easier.

Describe the solution you'd like How might we accomplish your goals?

Combine the formal chronostratigraphy tables into one and apply them to a single locality attribute "formal chronostratigraphy". I have drafted up my concept of the "new" formal chronostratigraphy code table in this Google Sheet.

This means that the fourth grader looking for "Cretaceous" or the student entering data does NOT need to know that Cretaceous is a system/period, but the metadata in the code table means that one can find all of the system/period terms in Arctos.
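
A minimal sketch of that idea, assuming a flat list of terms that each carry a term type as metadata. The class name, field names, and rows are hypothetical illustrations, not the contents of the Google Sheet or any Arctos structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChronoTerm:
    term: str       # the value a user picks or searches for
    term_type: str  # metadata only; hypothetical labels


# Hypothetical rows standing in for the proposed unified code table.
TERMS = [
    ChronoTerm("Mesozoic", "erathem/era"),
    ChronoTerm("Cretaceous", "system/period"),
    ChronoTerm("Jurassic", "system/period"),
    ChronoTerm("Upper Cretaceous", "series/epoch"),
]


def find_term(value):
    """Look a term up by name alone; the caller never supplies a term type."""
    return [t for t in TERMS if t.term.lower() == value.lower()]


def terms_of_type(term_type):
    """Use the metadata to list every term of one type."""
    return [t.term for t in TERMS if t.term_type == term_type]
```

Here `find_term("cretaceous")` needs no knowledge of rank, while `terms_of_type("system/period")` still recovers all system/period terms from the metadata.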

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Change nothing

Additional context Add any other context or screenshots about the feature request here.

We probably need to update these terms in any case to match a newer (or the latest) version of the ICS Chronostrat Chart.

Priority Please assign a priority-label. Unprioritized issues get sent into a black hole of despair.

@Nicole-Ridgwell-NMMNHS @WaigePilson @KatherineLAnderson @aklompma @ronaldeng

Nicole-Ridgwell-NMMNHS commented 6 months ago

I think I like this proposal, I think it would make it possible to add chronostratigraphy to specimen search results and it would make locality reports a bit easier for me, but I have a couple of potential discussion points:

Jegelewicz commented 6 months ago

Since we are using the International Commission on Stratigraphy as our authority, would it be better to call it "ICS chronostratigraphy" instead of "formal chronostratigraphy"?

I think that would make sense - we should also indicate which version is currently in place in the code table.

Should we add the time units to the individual code table entries, i.e. "Cretaceous System/Period"? If we do that, to simplify, do we just use the chronostrat or geochron unit, i.e. "Cretaceous System" or "Cretaceous Period"?

Simpler would be better if that is a possibility and would work for everyone.

Nicole-Ridgwell-NMMNHS commented 6 months ago

From code table meeting: If we could pull term type and higher units from the metadata that would be great.
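
As a rough illustration of pulling term type and higher units from metadata, here is a sketch that assumes each term stores its type and a pointer to its parent term. The dictionary layout and the `higher_units` helper are hypothetical, not an existing Arctos feature.

```python
# Hypothetical metadata: term -> (term type, parent term or None).
CHRONO = {
    "Phanerozoic": ("eonothem/eon", None),
    "Mesozoic": ("erathem/era", "Phanerozoic"),
    "Cretaceous": ("system/period", "Mesozoic"),
    "Upper Cretaceous": ("series/epoch", "Cretaceous"),
    "Maastrichtian": ("stage/age", "Upper Cretaceous"),
}


def higher_units(term):
    """Return (term, term_type) pairs from the immediate parent up to the top."""
    chain = []
    parent = CHRONO[term][1]
    while parent is not None:
        chain.append((parent, CHRONO[parent][0]))
        parent = CHRONO[parent][1]
    return chain
```

With this layout, a record that only stores "Maastrichtian" could still report its series, system, erathem, and eonothem without anyone entering them.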

WaigePilson commented 6 months ago

I support this! Moving to one code table seems like a better world. I agree with Teresa and Nicole's comments about calling the code table "ICS chronostratigraphy vXXX" and adding the unit e.g., "Cretaceous System/Period" (OR adding the unit/type as a separate field from the attribute value).

I'll read through Teresa's spreadsheet and hopefully @KatherineLAnderson and @ronaldeng can help vet as well!

Jegelewicz commented 6 months ago

@Nicole-Ridgwell-NMMNHS @WaigePilson @KatherineLAnderson @ronaldeng Please review the Google sheet and let me know if there are any issues (or correct them!).

Jegelewicz commented 6 months ago

Once we have agreement that the code table seems good, we can place it in test and see how it functions.

ronaldeng commented 6 months ago

Yes, this is a great suggestion! The code table looks good. I agree that it should be identified as the "International Chronostratigraphic Chart, v.2023/09"

dustymc commented 6 months ago

I'm wondering if this should be expanded to "rocks-n-dirt stuff"?

Now a user can search one thing for 'permian' and get records that use any/all the stuff in eg https://arctos.database.museum/info/ctDocumentation.cfm?srch_val=permian. (And of course a more focused user could also limit to exact match and specific term type or whatever).

If we're gonna unify, maybe we should UNIFY! Does the litho structure do anything that @Jegelewicz 's "flat taxonomy" approach can't?

"International Chronostratigraphic Chart, v.2023/09"

I'm relatively sure that's some sort of metadata; we're not going to toss the table out when the next chart comes out.

Records needing linked to that particular chart (eg if "Permian according to THIS" and "Permian according to THAT" are functionally different things) would change things in some way, let me know if that's the case and we'll find a way to get ahead of it.

Jegelewicz commented 6 months ago

Does the litho structure do anything that @Jegelewicz 's "flat taxonomy" approach can't?

Don't think I didn't think about that, but the litho structure isn't that coordinated - a member may be in a formation in one location, but not in another....

Records needing linked to that particular chart (eg if "Permian according to THIS" and "Permian according to THAT" are functionally different things) would change things in some way

I have also thought about this and it may end up being the death of this because older references to some chronostratum may end up being misleading under a new definition. There are ways we could deal with this (geology remarks) whenever we update the code table to reflect a new version of the chart, but that needs people who understand the terms better than me to suss out.

WaigePilson commented 6 months ago

I'm wondering if this should be expanded to "rocks-n-dirt stuff"?

I agree with Teresa's comments. While this would be amazing, I don't think we could do it in practice. There isn't a unified source for all lithology (geolex is very good for the U.S. but even that is not perfect), lithologic units/structure names can vary across states and have gone through many revisions through time, names can be redundant, etc.

Records needing linked to that particular chart (eg if "Permian according to THIS" and "Permian according to THAT" are functionally different things) would change things in some way

I don't think this will be too much of an issue. In theory, ages are objective. Yes, the chronostrat chart changes (slightly) but that won't affect most instances of use in our collections work. And if it does, I support using the most recent chronostrat chart, and adding a comment if needed to note (e.g., "Author X identified this as Carboniferous in 1899, but dating by Author Y in 2024 puts it as Devonian").

WaigePilson commented 6 months ago

@Jegelewicz I took a read through your chart and it looks good to me! I didn't see any errors. Thanks for putting this together, what a lot of work!!

dustymc commented 6 months ago

a member may be in a formation in one location, but not in another

How's that different than the current setup?

older references to some chronostratum may end up being misleading under a new definition

Still same as current model, no?

isn't a unified source for all lithology

Still not seeing why that matters. (Except the flat model might be better at indicating those sources.)

I think I need an example or something, from here this just looks like more-accessible metadata (which in either case can be included if useful or omitted if not), I can't see what doesn't fit.

chart changes ... adding a comment if needed to note

Just to get it out there, the obvious alternatives would be to embed that in the name ("Bla v1"), or to refer to data objects (which could carry both 'Bla' and 'v1') instead of strings. Both would be more complicated for everyone, but there is a path if we need it.
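
To make the two alternatives concrete, a small sketch under the assumption that a term and its chart version are stored as separate pieces of data. `VersionedTerm` and `embedded_name` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedTerm:
    """A data object carrying both the term and the chart version it refers to."""
    term: str
    chart_version: str


def embedded_name(t):
    """Alternative 1: flatten the object into a single name string."""
    return f"{t.term} {t.chart_version}"


# Alternative 2: records point at the object, so two versions of the same
# term stay distinguishable without parsing a string.
bla_v1 = VersionedTerm("Bla", "v1")
bla_v2 = VersionedTerm("Bla", "v2")
```

Either way "Bla according to v1" and "Bla according to v2" remain distinct; the object form just keeps the version queryable on its own.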

WaigePilson commented 6 months ago

@dustymc you're right that in our current set up there is a lot of messiness, however currently we are relying on the people making the code table request to do the hard work of vetting what their unit should be called, where it is, what valid references exist for it, etc. My point is that there are so many formations and members and whatnot that we cannot possibly come up with a unified table of all this data to just have as a sort of taxonomy for Arctos users to access. Even trying to turn our separate formation, member, etc. tables into one single table would require a lot of effort as someone would have to sit down and figure out which members belong to which units, where, etc. I suppose we could just pull all the lithostratigraphy tables into one, like you suggested originally, but without doing any of the work to link formations, members, etc. However, I don't see how that is much of an improvement over our current system. The really elegant and nice solution with Teresa's chronostratigraphy table is that we could add in parent/child time units in a "taxonomy".

To be clear, I am very supportive of creating a unified chronostratigraphy code table; as I mentioned above I looked at Teresa's google sheet and it looks great to me. I just think combining lithostratigraphy (at least if we wanted to do it in the same way, noting which geologic members belong to which formations) is too hard/messy.

dustymc commented 6 months ago

don't see how that is much of an improvement

For current data: It's not, it'd be a straight move, no change, no extra work necessary.

For current UI: I think it is an improvement in accessibility, in that I can just type 'Tj5' into the ONE search box and not have to pre-determine if that's a formation or member or litho or chrono or whatever https://arctos.database.museum/info/ctDocumentation.cfm?table=ctbiostratigraphic_zone is. IDK if that'll be at all useful for "us," but Arctos has a very large audience and a curious 4th grader (or me!) would benefit from such simplicity.

For future data: If there's some reason to add metadata, this thing...

Screenshot 2024-03-27 at 11 38 30

would just be a bit closer to the data in the UI, where it can be used or ignored.

I think I'm just proposing a zero-effort, zero-functional-change addition to the existing proposal, one which would improve usability by removing complexity.

(It would also remove a few code tables, which might also improve usability, maybe let me drop the whole complicated thing in the screenshot above, which would be fabulous, and make certain DB costs one join cheaper, which probably isn't very significant).

I don't think it would have any negative effects on y'all. It wouldn't substantially change how you enter or search for data, as far as I can see. It certainly wouldn't change anything about the creation/code table process. It wouldn't require metadata, but if someone wants to provide metadata I think it would be exactly as functional as the current system (but without the complexity).

If there's absolutely no chance that anyone's ever going to try to add any metadata (eg fill in any extra columns in the spreadsheet) to any non-chrono thing ever for any reason, the simplification still looks to me like an improvement, and I still can't see any real cost (other than it being slightly different, which I suspect is more than balanced by being more homogenous).

MAYBE it's "messy" in that it would involve a lot of empty cells (at least initially, maybe/probably forever), but that doesn't seem too much of a problem.

Where am I getting lost?

supportive of creating a unified chronostratigraphy

I think/hope we're all agreed on that.

Jegelewicz commented 6 months ago

Where am I getting lost?

I see your points and I'll try to set up another sheet to demonstrate. The one real difference to me might be the fact that the chrono stuff in the current proposal has an authority and the rest of the stuff can come from any number of "authorities".

KatherineLAnderson commented 6 months ago

Coming in late to this conversation, but this seems like it would vastly improve searchability.

Paige made some great points about adapting this to lithostratigraphy, which in general is just "messier" than chronostrat, because (at least in part) there is no authority as Teresa mentions. Why not try this out with chronostrat, and see how it works first? Lithostrat should be its own conversation.

Funny enough, in an activity that my students did where they explored multiple online specimen databases, one of them said that having multiple fields for searching specific info (e.g., a box each for searching era, system/period, epoch, etc) was "outdated" UX when compared to other databases where you could search for many different types of information in one place.

dustymc commented 6 months ago

try this out with chronostrat

One reason would be if litho (or "etc.") needs some different/additional structure. We could certainly migrate in pieces, but I think considering the big picture from the start will lead to happier places (or a happier trail to the same place, maybe). (And if this is just a dumb idea I'd rather kill it now than drag it back up later!)

many different types of information in one place.

The extreme of that is https://github.com/ArctosDB/arctos/discussions/6524, but that's HARD - google has literally made a couple trillion dollars by doing a really great (comparatively, anyway) job with full-text search.

Knowing that ThingA is a value under SearchZ is usually fine for "us" - we put it there, we know where it is, we don't want the other ThingA that's over in SearchY, and will just be frustrated if we get all of the ThingA (false positives).

"Not-us" is I think usually happy to deal with a few false positives if they don't have to figure out which of the maybe-dozens of nearly identical (to them) options their desired ThingA lives under. (They also have a tendency to find SearchY-->ThingA, assume that's all of it, and wander off before discovering the SearchZ-->ThingA they came for.)

I think this is the best of both worlds, the question is how big the world is (and I'm trying to stretch it!). With what Teresa's proposed a casual user might just search "Jurassic" and end up with https://arctos.database.museum/info/ctDocumentation.cfm?table=ctchronostrat_series_epoch#lower_jurassic and https://arctos.database.museum/info/ctDocumentation.cfm?table=ctchronostrat_system_period#jurassic and whatever else happens to be out there at the moment, while "we" could qualify (only in system_period) and constrain (equals, not like) as necessary.
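
A sketch of the two search styles described here, assuming rows of (term, term type). The function name, parameters, and sample rows are hypothetical; this is not how the Arctos search is implemented.

```python
def search_terms(rows, text, term_type=None, exact=False):
    """Search (term, term_type) rows.

    Casual use: substring match across every term type.
    Expert use: pass term_type to qualify and exact=True to constrain.
    """
    needle = text.lower()
    hits = []
    for term, ttype in rows:
        if term_type is not None and ttype != term_type:
            continue
        name = term.lower()
        matched = (name == needle) if exact else (needle in name)
        if matched:
            hits.append((term, ttype))
    return hits


# Hypothetical rows for illustration.
ROWS = [
    ("Jurassic", "system/period"),
    ("Lower Jurassic", "series/epoch"),
    ("Cretaceous", "system/period"),
]
```

`search_terms(ROWS, "jurassic")` returns both Jurassic rows, while `search_terms(ROWS, "Jurassic", term_type="system/period", exact=True)` returns only the system/period term.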

If this can extend beyond chronostratigraphy, then a naive user with a search term (Bonita Springs Marl, Anacacho, Tapinocephalus, Sharamurunian, etc.) might find something with that same singular "paleostuff" search option, however we've filed it. Same idea, bigger world.

(And maybe that's not what anyone wants and this is all my ignorance showing through...)

unified source

That's part of why I like this - there's a slot for source, we don't have to embed links and such (and I just spent the last several hours cleaning up HTML in agents - most of it was malformed in some way!). If eg formation doesn't end up in some new fancier structure, someone should remember to file an issue to make it a bit more like https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecimen_part_name.

Nicole-Ridgwell-NMMNHS commented 6 months ago

I'm on board with Teresa's plan for chronostrat. I do have some off the cuff concerns about looping lithostrat into the same table.

Jegelewicz commented 6 months ago

OK everyone, check out the Unified Geologic Stratum tab.

Units that can be more than one type (for example, a unit that is a member in one state and a formation in another). I can see how it would be an advantage to be able to search on these in one field, but would there have to be some sort of special member/formation and formation/group unit type?

The type is embedded in the name already, AND they each carry a type, so I don't think this is an issue.

Would we be able to search on multiple things?

This might require more than a comma separated search if you want to find things by combination of value and type, but we do this for taxonomy, so I would think it would be possible. @dustymc

Some general concern about managing a massive code table with different types of things in it.

A valid concern, but we maintain taxonomy, so.....

How would these values show up in specimen search results and download flat attributes on locality search results? If they're all smashed into one field, that will not work at all. This is my biggest concern. I need to have separate columns for chrono and lithostrat for my data pulls.

Just UI? A report for this stuff? @dustymc

I do believe that if we made the effort to flesh this taxonomy out, we would have an amazing tool for all the people interested in geologic strata. We will have built the thing that PBDB or Macrostrat should be. Anyone could use it and maybe someday it becomes its own actual database maintained by people who have dedicated knowledge and resources.

dustymc commented 6 months ago

special member/formation and formation/group unit type?

Currently that's structure - we'd need a new code table, something like https://arctos.database.museum/info/ctDocumentation.cfm?table=ctlithostratigraphic_formation / https://arctos.database.museum/info/ctDocumentation.cfm?table=ctlithostratigraphic_group, and I would fully expect some arbitrariness. (If CollectionA gets there first it's formation, if CollectionB it's group, unless the individual has some awareness in which case it's newthing.)

In a merged thing, that's data - not structure. We introduce a new "hell if I know..." value for Term Type rather than building a new code table. Probably still gets used arbitrarily, but it would do so from one PLACE - at least 'not us' could find it.

search on multiple things?

"Just UI" (I think, we should be developing use cases for anything that gets serious consideration - I'll let y'all decide if this is that or not).

How would these values show up

Same as above - tell me how you want them to show up and we can figure out what that means/requires/whatever.

search on multiple things?

Can someone quantify that? Here's some current counts for fun:


arctosprod@arctos>> select count(*) from ctlithostratigraphic_group ;
 count 
-------
   129
(1 row)

arctosprod@arctos>> select count(*) from ctlithostratigraphic_formation;
 count 
-------
  1194
(1 row)

arctosprod@arctos>> select count(*) from taxon_name;
  count  
---------
 3523209
(1 row)

arctosprod@arctos>> select count(*) from taxon_term;
   count   
-----------
 293179436

WaigePilson commented 6 months ago

@Jegelewicz I looked at your Unified Geologic Stratum--this helps to clarify what a unified "geologic lithostratum" table might look like, thank you! I think seeing this together, I am more on board with combining all the formal lithostratigraphy tables into one.

I do have a few more thoughts:

1) I'd strongly vote against combining chronostrat and lithostrat in this table. "Time" terms and "rock" terms are kept very strictly separate in geology, and combining them in one table would be cumbersome and potentially very confusing for researchers and other experts. I'd keep "unified formal chronostrat" as one code table, and "unified formal lithostrat" as another (just remove anything from the International Chronostratigraphic Chart). Maybe this is not what you intended with your suggestion though!

2) Along the lines of Dusty's comment:

CollectionA gets there first it's formation, if CollectionB it's group, unless the individual has some awareness in which case it's newthing

I guess this is how this already happens. For instance, I'm just about to submit a code table request to add "Bracklesham Group", but that unit is also sometimes referred to as "Bracklesham Beds" (that is how it was entered into our old database; https://en.wikipedia.org/wiki/Bracklesham_Group)--but since it is more commonly accepted as a group I was going to submit it as a lithostratigraphic_group. We can always change the name/type in the future or add descriptions to explain oddities like this.

3) Lastly, to the comment about searching for multiple "things": I think this is absolutely one of the biggest benefits of a unified table. A couple of use cases off the top of my head:

- Someone has a publication from 1850 referencing the floras from the Venado. They don't know if this is a formation, bed, group, or member. Now they don't have to search all four code tables during data entry. And if they need to add it as a new code table addition, but they're not sure if it's best called formation or member, they can potentially be more flexible in what "type" they assign.
- I am searching for fossils from the Tullock Member of the Fort Union Formation, but sometimes people call it the Tullock Member and sometimes they call it the Tullock Formation. I can search this one field for "Tullock" and get all the records, rather than cross-referencing each of the possible locality attributes to figure out which it's stored under (i.e., whether in Arctos it's called a Formation or Member).
- I'd like to search for the Smith Member of the Blackstone Formation, but there are other "Smith Member" units associated with other formations. I hope/assume that with this unified table, I could search this attribute for Smith AND Blackstone? (Currently you'd search lithostratigraphic member for Smith and lithostratigraphic formation for Blackstone.) When you look at or download a .csv of the locality attributes you'd see the "type" to know what the member and what the formation was.

To get at Nicole's concern, I'd assume/hope that when you download the .csv of locality attributes you'd just get multiple "lithostratigraphic unit" attributes for that one locality, one with "Smith Member" and one with "Blackstone Formation".
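
The "Smith AND Blackstone" case could work roughly like the sketch below, assuming each locality carries a list of (value, term type) unit attributes. The locality ids, "Other Formation", and the function name are made up for illustration.

```python
# Hypothetical locality attributes: locality id -> list of (value, term type).
LOCALITY_UNITS = {
    101: [("Smith Member", "member"), ("Blackstone Formation", "formation")],
    102: [("Smith Member", "member"), ("Other Formation", "formation")],
    103: [("Tullock Member", "member"), ("Fort Union Formation", "formation")],
    104: [("Tullock Formation", "formation")],
}


def localities_matching_all(units_by_locality, *needles):
    """Return ids of localities where every search string matches some unit value."""
    wanted = [n.lower() for n in needles]
    result = []
    for loc_id, units in units_by_locality.items():
        values = [value.lower() for value, _ in units]
        if all(any(n in v for v in values) for n in wanted):
            result.append(loc_id)
    return result
```

Searching `"Smith", "Blackstone"` returns only locality 101, and searching `"Tullock"` returns both 103 and 104 regardless of whether the unit was filed as a member or a formation.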

ronaldeng commented 6 months ago

Just to add to Paige's comment:

"I'd strongly vote against combining chronostrat and lithostrat in this table. "time" terms and "rock" terms are kept very strictly separate in geology, and combining in one table would be cumbersome and potentially very confusing for researchers and other experts. I'd keep "unified formal chronostrat" as one code table, and "unified formal lithostrat" as another (just remove anything from the International Chronostratigraphic Chart). Maybe this is not what you intended with your suggestion though!"

It's important to reiterate that chronostratigraphic units are defined as all rocks, regardless of composition, formed within a specified time span; i.e., these rock units are synchronous. Fortunately, the International Commission on Stratigraphy provides us a formal chronostratigraphic chart. Lithostratigraphic units are based on lithology. These units provide clues for approximate time correlation, but they are not necessarily synchronous. Many unit names are regional and/or anecdotal.

Nicole-Ridgwell-NMMNHS commented 6 months ago

I'd strongly vote against combining chronostrat and lithostrat in this table. "time" terms and "rock" terms are kept very strictly separate in geology, and combining in one table would be cumbersome and potentially very confusing for researchers and other experts. I'd keep "unified formal chronostrat" as one code table, and "unified formal lithostrat" as another (just remove anything from the International Chronostratigraphic Chart).

I completely agree with this.

Re - search results for lithostratigraphy

Under this system, would it be possible to separate the different lithostratigraphy unit types into separate columns in specimen search results and in download flat attributes in find locality?

WaigePilson commented 6 months ago

Under this system, would it be possible to separate the different lithostratigraphy unit types into separate columns in specimen search results and in download flat attributes in find locality?

This is a great point Nicole! I hadn't thought of this (or understood your previous question on this). I just tried pulling a flattened attribute csv for some localities I have where I entered multiple lithostratigraphic_formation attributes (such as this one: https://arctos.database.museum/place.cfm?action=detail&locality_id=12201941) to test, and the csv has one column for lithostratigraphic_formation with "Collón Curá Formation; Rio Negro Formation".

I completely agree that it might be problematic to produce csv with one column for lithostratigraphic_unit with e.g., "Tullock Member; Fort Union Formation" or one column for chronostratigraphic_unit with e.g., "Cenozoic Era; Neogene Period; Miocene Epoch". I suppose it's workable so long as it's consistent, e.g., it would never spit out "Neogene Period; Cenozoic Era; Miocene Epoch". However, is this something that can be coded? (i.e., the system knows to spit out the concatenation as group>formation>member or era>period>epoch>age) Or, even better, (as Nicole suggests) is it possible to have the proposed lithostratigraphic_unit and chronostratigraphic_unit attributes spit out in the flattened csv as multiple columns?
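
One way the multiple-column output could look, sketched under the assumption that every attribute value carries a term type. The column names and the `to_csv_by_type` helper are hypothetical and do not describe the current Arctos download.

```python
import csv
import io


def to_csv_by_type(units_by_locality, type_columns):
    """Write one row per locality with one column per term type.

    Values sharing a term type are joined with "; " inside that column.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["locality_id", *type_columns])
    for loc_id, units in units_by_locality.items():
        row = [loc_id]
        for col in type_columns:
            row.append("; ".join(v for v, t in units if t == col))
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical locality with a member and a formation, but no group.
sample = {1: [("Tullock Member", "member"), ("Fort Union Formation", "formation")]}
output = to_csv_by_type(sample, ["group", "formation", "member"])
```

Because the column order is fixed by `type_columns`, the formation always lands in the formation column no matter what order the attributes were entered in.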

Jegelewicz commented 5 months ago

It seems like everyone agrees on the consolidation of the chronostrat code tables and the lithostrat tables but there are questions about output.

@dustymc could you respond to

Under this system, would it be possible to separate the different lithostratigraphy unit types into separate columns in specimen search results and in download flat attributes in find locality?

I suppose it's workable so long as it's consistent, e.g., it would never spit out "Neogene Period; Cenozoic Era; Miocene Epoch". However, is this something that can be coded? (i.e., the system knows to spit out the concatenation as group>formation>member or era>period>epoch>age) Or, even better, (as Nicole suggests) is it possible to have the proposed lithostratigraphic_unit and chronostratigraphic_unit attributes spit out in the flattened csv as multiple columns?

Also, this makes me think that informal chronostrat should just be added to formal. One place to search all chrono and one for all litho? Thoughts?

I would like to resolve this soon-ish in order to present it at SPNHC (whatever we decide to do).

dustymc commented 5 months ago

spit out in the flattened csv

At the limits of the model? No, not even close, a literally infinite number of things, which includes one thing being repeated an infinite number of times, doesn't easily get Excelified. With current-ish data? Yea, probably.

Nothing about that is in any way different than the current authority, which AFAIK is all this is proposing to change.

the system knows to

Not unless someone tells it - eg by adding a relative_sort_order column to the proposed code table. (And telling me what you want to do instead of how you want to do it tends to catch this sort of thing much better.)
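
For example, a relative_sort_order value on each code table row would be enough to make the concatenation order stable. The sketch below assumes a simple term-to-order mapping; the numbers and the function name are hypothetical.

```python
# Hypothetical code table rows: term -> relative_sort_order (lower = broader unit).
RELATIVE_SORT_ORDER = {
    "Cenozoic Era": 1,
    "Neogene Period": 2,
    "Miocene Epoch": 3,
}


def concatenate_units(values):
    """Join attribute values in code table order, whatever order they arrive in."""
    return "; ".join(sorted(values, key=RELATIVE_SORT_ORDER.__getitem__))
```

Given the values in the "wrong" order, `concatenate_units(["Neogene Period", "Cenozoic Era", "Miocene Epoch"])` still yields the era, then the period, then the epoch.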

Nicole-Ridgwell-NMMNHS commented 4 months ago

Also, this makes me think that informal chronostrat should just be added to formal. One place to search all chrono and one for all litho? Thoughts?

I have some minor concerns, but nothing I think that actually gets in the way of this. The biggest one being that I like to ensure that a locality record with informal chronostrat also has an ICS strat term. This is less obvious with them in the same table, but, for us at least, I am the only one who enters localities, so I can deal with it.
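
A check like that could be automated even with both kinds of term in one table, assuming each term is flagged as ICS or informal in the metadata. The flag values, sample data, and function name below are hypothetical.

```python
def missing_ics_term(attrs_by_locality):
    """Return ids of localities that have an informal term but no ICS term.

    attrs_by_locality maps locality id -> list of (value, source), where
    source is "ICS" or "informal" (hypothetical metadata flag).
    """
    flagged = []
    for loc_id, attrs in attrs_by_locality.items():
        sources = {source for _, source in attrs}
        if "informal" in sources and "ICS" not in sources:
            flagged.append(loc_id)
    return flagged


# Hypothetical localities for illustration.
LOCALITIES = {
    1: [("Lower Permian", "informal"), ("Permian", "ICS")],
    2: [("Lower Permian", "informal")],
    3: [("Jurassic", "ICS")],
}
```

Here only locality 2 would be flagged for follow-up.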

Nicole-Ridgwell-NMMNHS commented 4 months ago

The output is definitely a concern for me with lithostratigraphy. The output could be improved for chronostrat, but I don't think the unified table will make much of a difference at the moment. We already only get lowest level chronostrat in the locality download.

Jegelewicz commented 4 months ago

Not unless someone tells it - eg by adding a relative_sort_order column to the proposed code table. (And telling me what you want to do instead of how you want to do it tends to catch this sort of thing much better.)

Added

I have also added the informal chronostrat to the tab Unified Chronostrat - https://docs.google.com/spreadsheets/d/1Cb-uHVpasdQiWLKR4NFqJtceYbKoAgO2giY_VbNFykg/edit#gid=1141413387

Note that I have updated definitions for some of the informal terms. Lower Permian is a mystery to me! It is impossible for this to be perfect, the search terms in particular are likely missing something somewhere. If anyone wants to read all of that, then yay! Please look this over - if we think this can get us to a happy place, then we can request implementation in test.

@Nicole-Ridgwell-NMMNHS @KatherineLAnderson @WaigePilson @ronaldeng

Nicole-Ridgwell-NMMNHS commented 2 months ago

Ok, I did a pretty thorough check of @Jegelewicz 's Unified Chronostrat, including some Excel manipulation to check spelling. The only mistake I found was that the search terms for Serpukhovian, Bashkirian, and Moscovian have higher-level units.

Jegelewicz commented 2 months ago

The only mistake I found was that the search terms for Serpukhovian, Bashkirian, and Moscovian have higher-level units.

Fixed

Jegelewicz commented 2 months ago

This is ready to test, but blocked by https://github.com/ArctosDB/internal/issues/326