ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Request - part preservation - little-used and unused values #8214

Open dustymc opened 3 weeks ago

dustymc commented 3 weeks ago

Describe what you're trying to do

Per https://github.com/ArctosDB/arctos/issues/4367, there's a consensus (https://docs.google.com/document/d/1SGBZQTGdTMr11kpOUQQ4lfgjRRLMj5E3jeHVHtQBrMw/edit?tab=t.0#heading=h.q3pk226foene) to clean up part preservation by way of https://github.com/ArctosDB/code-table-work/issues/76, where I noticed that a value I'd assumed was common wasn't getting much use. Can we remove some low-usage and no-usage values?

Here are the values used by fewer than 1,000 records:


        part_preservation         |    c    
----------------------------------+---------
 mounted                          |     984
 original tissue                  |     849
 freeze-dried                     |     652
 fossil imprint                   |     623
 formalin 10% buffered            |     460
 mummified                        |     454
 Dietrich's solution              |     368
 isopropanol 70%                  |     354
 cleared and stained              |     331
 sectioned                        |     231
 petrified                        |     191
 ethanol 90%                      |     187
 Riker mount                      |     186
 fossil, compression              |     165
 glutaraldehyde                   |     163
 DMSO                             |     139
 fossil mold, internal            |     100
 replacement                      |      96
 fossil cast                      |      87
 acetone                          |      80
 Monarch elution buffer           |      58
 Bouin's solution                 |      44
 buffer                           |      36
 paraffin-embedded                |      36
 phosphate buffer                 |      35
 propylene glycol                 |      34
 paraformaldehyde                 |      33
 heparin                          |      26
 guanidinium thiocyanate          |      25
 SEM stub, thin section           |      21
 terpineol                        |      18
 fossil mold, external            |      18
 injection, latex vascular        |      17
 articulated                      |      17
 formalin-fixed | isopropanol 45% |      14
 carbonization                    |      11
 viral transport medium           |      10
 permineralization                |       1
 arsenic                          |       1
 tryptic soy broth 15% glycerol   |       1
 thymol                           |       0
 recrystallization                |       0
 paradichlorobenzene              |       0

paradichlorobenzene is from https://github.com/ArctosDB/arctos/issues/6874, a year-old issue where this very question went unanswered.

recrystallization has no issue; I'll just zap it unless someone chimes in soon.

thymol is also a year old, same user - https://github.com/ArctosDB/arctos/issues/6873 - @wellerjes ??

permineralization is likely involved in https://github.com/ArctosDB/arctos/issues/7736 - @Nicole-Ridgwell-NMMNHS can we just do that if it still looks like a good idea?

@ArctosDB/arctos-code-table-administrators can we establish some sort of minimal-usage criteria in general, or for types, or just for this, or ??? If the threshold were 100 records (which seems VERY low to me) it would result in a ~25% decrease in complexity. A threshold of 1000 records (starting to seem more reasonable) would be a ~40% decrease.
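To make the threshold question concrete, here is a minimal Python sketch that counts how many of the listed values fall under a given record threshold. The counts are copied from the table above; the function names are made up for illustration, and the total number of values in the code table is not stated in this thread, so `n_total` is left as a parameter rather than guessed.

```python
# Counts copied from the psql output above (header and separator lines removed).
TABLE = """\
 mounted                          |     984
 original tissue                  |     849
 freeze-dried                     |     652
 fossil imprint                   |     623
 formalin 10% buffered            |     460
 mummified                        |     454
 Dietrich's solution              |     368
 isopropanol 70%                  |     354
 cleared and stained              |     331
 sectioned                        |     231
 petrified                        |     191
 ethanol 90%                      |     187
 Riker mount                      |     186
 fossil, compression              |     165
 glutaraldehyde                   |     163
 DMSO                             |     139
 fossil mold, internal            |     100
 replacement                      |      96
 fossil cast                      |      87
 acetone                          |      80
 Monarch elution buffer           |      58
 Bouin's solution                 |      44
 buffer                           |      36
 paraffin-embedded                |      36
 phosphate buffer                 |      35
 propylene glycol                 |      34
 paraformaldehyde                 |      33
 heparin                          |      26
 guanidinium thiocyanate          |      25
 SEM stub, thin section           |      21
 terpineol                        |      18
 fossil mold, external            |      18
 injection, latex vascular        |      17
 articulated                      |      17
 formalin-fixed | isopropanol 45% |      14
 carbonization                    |      11
 viral transport medium           |      10
 permineralization                |       1
 arsenic                          |       1
 tryptic soy broth 15% glycerol   |       1
 thymol                           |       0
 recrystallization                |       0
 paradichlorobenzene              |       0
"""


def parse_counts(table):
    """Parse 'value | count' rows; split on the LAST pipe because one value contains a pipe."""
    counts = {}
    for line in table.splitlines():
        if not line.strip():
            continue
        value, count = line.rsplit("|", 1)
        counts[value.strip()] = int(count)
    return counts


def below_threshold(counts, threshold):
    """Return the values used by strictly fewer than `threshold` records."""
    return sorted(v for v, c in counts.items() if c < threshold)


def percent_reduction(n_removed, n_total):
    """Share of the code table removed; n_total must be supplied by the caller."""
    return 100.0 * n_removed / n_total


counts = parse_counts(TABLE)
for threshold in (1, 100, 1000):
    print(threshold, len(below_threshold(counts, threshold)))
```

With these counts, 26 of the listed values fall below 100 records and all 43 fall below 1,000. Whether a value with exactly 100 records counts as "below" is a policy choice; the sketch uses a strict less-than.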

I could also be convinced that any number is important and all must be standardized and discoverable, but there are costs to that (eg https://github.com/orgs/ArctosDB/discussions/7737) as well.

I believe the normalized data are in this case useful only for discovery: attribute==paradichlorobenzene makes the record discoverable by a researcher who's interested in paradichlorobenzene, while paradichlorobenzene (or even paradichlirobenzene; people are pretty good at recognizing even imprecise things) in some less-normalized place (eg part remarks) is suitable if the presence of paradichlorobenzene only matters after one has discovered the record. That is, I don't think moving these to a less-searchable place would have much impact on usability, but I also don't know why one might wish to search for many of these values.
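As a rough illustration of that tradeoff, the Python sketch below contrasts an exact match on a normalized attribute with a tolerant search over free-text remarks. The records and field names are hypothetical and are not the Arctos schema; the fuzzy matching uses the standard library's `difflib`, not whatever search Arctos actually provides.

```python
from difflib import get_close_matches

# Hypothetical records: one normalized attribute plus a free-text remarks field.
records = [
    {"id": 1, "part_preservation": "paradichlorobenzene", "part_remarks": ""},
    {"id": 2, "part_preservation": None,
     "part_remarks": "stored with paradichlirobenzene crystals"},
    {"id": 3, "part_preservation": "ethanol 90%",
     "part_remarks": "no fumigant noted"},
]


def find_by_attribute(records, value):
    """Exact match on the normalized attribute: precise, but only finds coded records."""
    return [r["id"] for r in records if r["part_preservation"] == value]


def find_in_remarks(records, value, cutoff=0.9):
    """Tolerant word-level match in remarks: catches misspellings, at some risk of noise."""
    hits = []
    for r in records:
        words = r["part_remarks"].lower().split()
        if get_close_matches(value.lower(), words, n=1, cutoff=cutoff):
            hits.append(r["id"])
    return hits


print(find_by_attribute(records, "paradichlorobenzene"))
print(find_in_remarks(records, "paradichlorobenzene"))
```

The attribute search finds only the record that was coded with the value, while the remarks search finds the misspelled mention but depends on a similarity cutoff that someone has to choose.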

I'll also add this to https://docs.google.com/spreadsheets/d/1-QC9L8jbOzaG7vstp5iHyx8ivlyvr3BrtiCY_GBimAg/edit?gid=0#gid=0 - I'm not convinced that eg articulation has anything to do with preservation.

Nicole-Ridgwell-NMMNHS commented 3 weeks ago

Here is the issue documenting how the types of fossilization were added: https://github.com/ArctosDB/arctos/issues/1976b. At the time, we basically just compiled a list of the types of fossilization and added them. Many of these, permineralization and recrystallization in particular, are very common but just aren't recorded unless it is a specific type, for example silicification or pyritization.

I think this needs to be revisited with a discussion among Arctos paleo users. Perhaps the best solution would be a dedicated free-text field for type of fossilization? The way we currently store this data isn't great BUT from our discussions in the paleo data working group it is useful to have a structured way to store type of fossilization other than just a remarks field.

wellerjes commented 2 weeks ago

I will be using thymol and paradichlorobenzene; I just haven't gotten around to bulkloading yet.