Open dustymc opened 3 weeks ago
Here is the issue documenting how the types of fossilization were added: https://github.com/ArctosDB/arctos/issues/1976b. At the time, we basically just compiled a list of the types of fossilization and added them. Many of these, permineralization and recrystallization in particular, are very common but just aren't recorded unless a specific type is involved, for example silicification or pyritization.
I think this needs to be revisited with a discussion among Arctos paleo users. Perhaps the best solution would be a dedicated free-text field for type of fossilization? The way we currently store this data isn't great BUT from our discussions in the paleo data working group it is useful to have a structured way to store type of fossilization other than just a remarks field.
I will be using thymol and paradichlorobenzene; I just haven't gotten around to bulkloading yet.
Describe what you're trying to do
https://github.com/ArctosDB/arctos/issues/4367 - there's a consensus (https://docs.google.com/document/d/1SGBZQTGdTMr11kpOUQQ4lfgjRRLMj5E3jeHVHtQBrMw/edit?tab=t.0#heading=h.q3pk226foene) to clean up part preservation, by way of https://github.com/ArctosDB/code-table-work/issues/76. While working on that, I noticed that a (common, I thought!) value wasn't getting much use. Can we remove some low-usage and no-usage values?
Here are sub-thousands:
paradichlorobenzene is from https://github.com/ArctosDB/arctos/issues/6874, a year-old issue where this very question went unanswered.
recrystallization has no issue, I'll just zap it unless someone chimes in soon.
thymol is also from a year-old issue by the same user - https://github.com/ArctosDB/arctos/issues/6873 - @wellerjes ??
permineralization is likely involved in https://github.com/ArctosDB/arctos/issues/7736 - @Nicole-Ridgwell-NMMNHS can we just do that if it still looks like a good idea?
@ArctosDB/arctos-code-table-administrators can we establish some sort of minimal-usage criteria in general, or for types, or just for this, or ??? If that was 100 records (which seems VERY low to me) it would result in a ~25% decrease in complexity. 1000 records (starting to seem more reasonable) would be a ~40% decrease.
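To make the threshold question concrete, here is a minimal sketch of how the reduction could be computed for any candidate cutoff. The value names and record counts are hypothetical placeholders, not real Arctos numbers, and the function names are mine; the percentages it prints are not meant to reproduce the ~25% and ~40% figures above.

```python
from typing import Dict, List


def values_below_threshold(usage: Dict[str, int], min_records: int) -> List[str]:
    """Return the code-table values used by fewer than min_records records."""
    return sorted(v for v, n in usage.items() if n < min_records)


def percent_reduction(usage: Dict[str, int], min_records: int) -> float:
    """Percentage of code-table values that would be dropped at this threshold."""
    if not usage:
        return 0.0
    return 100.0 * len(values_below_threshold(usage, min_records)) / len(usage)


# Hypothetical usage counts, for illustration only (not real Arctos data).
example_usage = {
    "value_a": 250_000,
    "value_b": 12_000,
    "value_c": 1_500,
    "value_d": 640,
    "value_e": 85,
    "value_f": 0,
}

for threshold in (100, 1000):
    dropped = values_below_threshold(example_usage, threshold)
    pct = percent_reduction(example_usage, threshold)
    print(f"threshold={threshold}: drop {dropped} ({pct:.0f}% of values)")
```

Running the same two functions against the real per-value counts would show how the list shrinks at 100, 1000, or any other cutoff we want to consider.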
I could also be convinced that any number is important and all must be standardized and discoverable, but there are costs to that (eg https://github.com/orgs/ArctosDB/discussions/7737) as well.
I believe the normalized data are in this case useful only for discovery - attribute==paradichlorobenzene makes the record discoverable by a researcher who's interested in paradichlorobenzene, while paradichlorobenzene (or even paradichlirobenzene - people are pretty good at recognizing even imprecise things) in some less-normalized place (eg part remarks) is suitable if the presence of paradichlorobenzene only matters after one has discovered the record. That is, I don't think moving these to a less-searchable place would have much impact on usability, but I also don't know why one might wish to search for many of these values.
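As a rough illustration of that trade-off, the sketch below contrasts an exact match on a normalized attribute with a tolerant match over free-text remarks. Everything here is hypothetical: the record layout, the field names `attributes` and `part_remarks`, and the use of Python's `difflib` for fuzzy matching are my assumptions for the example, not how Arctos stores or searches anything.

```python
import difflib
import re
from typing import Dict, List

# Hypothetical records; field names are illustrative, not the Arctos schema.
records: List[Dict] = [
    {"id": "rec1", "attributes": ["paradichlorobenzene"], "part_remarks": ""},
    {"id": "rec2", "attributes": [], "part_remarks": "stored with paradichlirobenzene in 1978"},
    {"id": "rec3", "attributes": [], "part_remarks": "treated with thymol"},
]


def find_by_attribute(recs: List[Dict], value: str) -> List[str]:
    """Exact match on the normalized attribute list."""
    return [r["id"] for r in recs if value in r["attributes"]]


def find_in_remarks(recs: List[Dict], value: str, cutoff: float = 0.85) -> List[str]:
    """Tolerant match: any word in the remarks that is close to the search value."""
    hits = []
    for r in recs:
        words = re.findall(r"[a-z]+", r["part_remarks"].lower())
        if difflib.get_close_matches(value, words, n=1, cutoff=cutoff):
            hits.append(r["id"])
    return hits


print(find_by_attribute(records, "paradichlorobenzene"))
print(find_in_remarks(records, "paradichlorobenzene"))
```

The exact-match search only sees the record with the normalized value, while the remarks search picks up the misspelled one. A person reading the remarks after finding the record some other way would recognize it just as easily, which is the case where normalization adds little.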
I'll also add this to https://docs.google.com/spreadsheets/d/1-QC9L8jbOzaG7vstp5iHyx8ivlyvr3BrtiCY_GBimAg/edit?gid=0#gid=0 - I'm not convinced that eg articulation has anything to do with preservation.