Closed dustymc closed 3 years ago
This is still a problem.
These are not used:
UAM@ARCTOS> select distinct attribute_type from ctattribute_type where attribute_type not in (select attribute_type from attributes);
ATTRIBUTE_TYPE
------------------------------------------------------------------------------------------------------------------------
brood parasite present
bursa width
embryo weight
forewing length
keywords
middle toe length
nottitle
numeric abundance
tooth length
tooth width
10 rows selected.
I will plan to delete them if nobody objects in the next few days.
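One caveat on the query above: `NOT IN` returns no rows at all if the subquery ever yields a NULL `attribute_type`. That clearly did not happen here (ten rows came back), but a `NOT EXISTS` form avoids the trap. A minimal sketch, assuming Python's built-in `sqlite3` and toy stand-ins for `ctattribute_type` and `attributes`; the sample rows are invented for illustration and are not Arctos data:

```python
import sqlite3

# Toy stand-ins for the two tables; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ctattribute_type (attribute_type TEXT);
    CREATE TABLE attributes (attribute_type TEXT);
    INSERT INTO ctattribute_type VALUES ('bursa width'), ('clutch size'), ('keywords');
    -- the NULL row is the hypothetical troublemaker
    INSERT INTO attributes VALUES ('clutch size'), ('clutch size'), (NULL);
""")

# Same shape as the query in this comment.
NOT_IN = """
    SELECT DISTINCT attribute_type FROM ctattribute_type
    WHERE attribute_type NOT IN (SELECT attribute_type FROM attributes)
    ORDER BY attribute_type
"""

# NULL-safe variant: an attribute type is unused if no matching row exists.
NOT_EXISTS = """
    SELECT DISTINCT ct.attribute_type FROM ctattribute_type ct
    WHERE NOT EXISTS (
        SELECT 1 FROM attributes a WHERE a.attribute_type = ct.attribute_type
    )
    ORDER BY ct.attribute_type
"""

unused_not_in = [row[0] for row in conn.execute(NOT_IN)]
unused_not_exists = [row[0] for row in conn.execute(NOT_EXISTS)]
print(unused_not_in)      # empty: the NULL makes every NOT IN comparison unknown
print(unused_not_exists)  # the two types with no matching attribute rows
```

The `NOT EXISTS` query should carry over to Oracle unchanged; only the Python wrapper is specific to this sketch.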
Can we clean up or document the rest of the list? I think about half of it's misplaced "reproductive data" and the other half is something about parasites...
water temperature @ 183
air temperature @ 183
HMMMM. These "environmental attributes" seem like they belong with the collecting event... This is probably going to become a larger issue when the UTEP ants start going in. There is A LOT of environmental data with them (soil moisture, soil type, soil temperature).
I am overwhelmed with taxonomy, geology, and locality right now and not sure who else has time to take this on. If no one is worried about it right now, I say we put it on the back burner until some of the problems people are actively working on are resolved. @dustymc do you need it resolved for something pressing?
Yes, some clearly need to be moved elsewhere if we ever get a better home for them. I am absolutely fine with using Attributes as a staging area; most anything can be denormalized there. I'm fine with obscure attributes - if there's a real chance someone's going to ask hard questions of tail base width then we should absolutely keep it, even if it's only used 17 times per decade. (But it needs to be documented - maybe those 17 determinations also represent 17 techniques and these data are therefore completely useless.) I think the temperature attributes are just new - not a problem. I am not OK with having many ways of doing something - of presenting confounded and unusable data to the world - and from here that's what a bunch of this looks like.
Not pressing, I was just answering an email regarding attribution and became overwhelmingly re-appalled at the attribute row of https://docs.google.com/spreadsheets/d/1ElIuKfljO48gaosR7Ml1irSzPdjxiOmhId81ybsrfMQ/edit#gid=432374024
Agree with all of the above! Any chance we can ping the person who created each attribute without a definition to get them to supply one?
> ping the person who created each attribute without a definition
Not really - we started tracking who's creating authorities and requiring definitions at about the same time. I can do this though:
begin
    -- every attribute type that has no description
    for r in (select distinct attribute_type from ctattribute_type where description is null order by attribute_type) loop
        dbms_output.put_line(r.attribute_type);
        -- usage count per collection for this attribute type
        for c in (
            select
                guid_prefix,
                count(*) cnt
            from
                attributes,
                cataloged_item,
                collection
            where
                attributes.collection_object_id=cataloged_item.collection_object_id and
                cataloged_item.collection_id=collection.collection_id and
                attributes.attribute_type=r.attribute_type
            group by
                guid_prefix
        ) loop
            dbms_output.put_line(' ' || c.guid_prefix || ' @ ' || c.cnt);
        end loop;
    end loop;
end;
/
axillary girth
UAM:Mamm @ 1617
MSB:Mamm @ 1
bill length
CHAS:Bird @ 60
UWYMV:Bird @ 2
UCM:Bird @ 52
UTEP:Bird @ 1
brood patch
MLZ:Bird @ 1
DMNS:Bird @ 1
bursa width
carapace length
UTEP:Herp @ 1
UWBM:Herp @ 16
MVZ:Herp @ 10
caste
CHAS:Ento @ 8
UAM:Ento @ 26
KNWR:Ento @ 2
clutch size
MVZ:Egg @ 14839
UCM:Egg @ 175
MVZ:Bird @ 1
MLZ:Egg @ 26
DMNS:Egg @ 6793
crown-rump length
UAM:Mamm @ 341
DMNS:Mamm @ 111
NMU:Mamm @ 30
MSB:Mamm @ 1362
UCM:Mamm @ 16
MVZ:Mamm @ 3
curvilinear length
UAM:Mamm @ 1754
diploid number
UAM:Herb @ 456
UTEP:Herb @ 18
ear from crown
UNR:Mamm @ 2
UAM:Mamm @ 31
DMNS:Mamm @ 100
UCM:Mamm @ 514
MVZ:Mamm @ 270
MSB:Mamm @ 2
egg content weight
NBSB:Bird @ 314
eggshell thickness
NBSB:Bird @ 73
embryo weight
forewing length
gonad
DMNS:Bird @ 1
head width
UCM:Herp @ 18
MVZ:Herp @ 149
UWBM:Herp @ 2
hind foot without claw
UTEP:Mamm @ 1
UAM:Mamm @ 10
DMNS:Mamm @ 94
MSB:Mamm @ 413
hind limb length
MVZ:Herp @ 83
incubation stage
CHAS:Egg @ 1838
MVZ:Egg @ 14808
UCM:Egg @ 245
MLZ:Egg @ 24
DMNS:Egg @ 10
middle toe length
number of labels
UAM:Herb @ 113021
UAMb:Herb @ 7
ovum
UWYMV:Bird @ 1
DMNS:Bird @ 6
parts examined
MSB:Bird @ 1
MSB:Host @ 770
plastron length
UWBM:Herp @ 3
MVZ:Herp @ 6
MSB:Herp @ 6
snout-vent length
UTEPObs:Herp @ 2
DMNS:Herp @ 1
UAM:Herp @ 4
UTEP:Herp @ 3998
UCM:Obs @ 6
UTEP:HerpOS @ 580
UWYMV:Herp @ 1
UCM:Herp @ 92
MVZObs:Herp @ 2
MVZ:Herp @ 17749
MSB:Herp @ 760
UWBM:Herp @ 4398
soft part color
UWYMV:Bird @ 24
CHAS:Bird @ 264
MSB:Bird @ 1826
UCM:Bird @ 48
MVZ:Bird @ 1
DMNS:Bird @ 85
MLZ:Bird @ 253
soft parts
UWYMV:Bird @ 155
CHAS:Bird @ 66
MSB:Bird @ 10430
MVZ:Bird @ 20107
standard length
MSB:Fish @ 19893
UAM:Fish @ 1
UCM:Fish @ 6
UAMObs:Fish @ 1
tail base width
MVZ:Herp @ 17
tail condition
UTEP:Herp @ 2
UNR:Herp @ 3
MVZ:Herp @ 2141
UWBM:Herp @ 28
wing span
UAM:Mamm @ 2
UCM:Mamm @ 138
year class
UCM:Fish @ 4
PL/SQL procedure successfully completed.
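The same report could probably come from a single query with outer joins instead of nested loops. A rough sketch, again using Python's built-in `sqlite3` with toy tables that borrow only the column names from the block above; the rows (including the `sex` attribute and its description) are invented for illustration:

```python
import sqlite3

# Toy schema using the column names from the PL/SQL block; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ctattribute_type (attribute_type TEXT, description TEXT);
    CREATE TABLE collection (collection_id INTEGER, guid_prefix TEXT);
    CREATE TABLE cataloged_item (collection_object_id INTEGER, collection_id INTEGER);
    CREATE TABLE attributes (collection_object_id INTEGER, attribute_type TEXT);
    INSERT INTO ctattribute_type VALUES
        ('bursa width', NULL), ('tail base width', NULL), ('sex', 'documented');
    INSERT INTO collection VALUES (1, 'MVZ:Herp'), (2, 'UAM:Mamm');
    INSERT INTO cataloged_item VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO attributes VALUES
        (10, 'tail base width'), (11, 'tail base width'), (20, 'sex');
""")

# One row per (undocumented attribute type, collection). The LEFT JOINs keep
# attribute types that are never used: they appear with a NULL prefix and 0.
USAGE = """
    SELECT ct.attribute_type, c.guid_prefix, COUNT(a.collection_object_id) AS cnt
    FROM ctattribute_type ct
    LEFT JOIN attributes a ON a.attribute_type = ct.attribute_type
    LEFT JOIN cataloged_item ci ON ci.collection_object_id = a.collection_object_id
    LEFT JOIN collection c ON c.collection_id = ci.collection_id
    WHERE ct.description IS NULL
    GROUP BY ct.attribute_type, c.guid_prefix
    ORDER BY ct.attribute_type, c.guid_prefix
"""

rows = conn.execute(USAGE).fetchall()
for attribute_type, guid_prefix, cnt in rows:
    print(attribute_type, guid_prefix, cnt)
```

One difference from the loop version: an attribute row whose `collection_object_id` has no matching `cataloged_item` would show up here under a NULL prefix, where the inner joins in the PL/SQL silently skip it.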
ref: https://github.com/ArctosDB/data-migration/issues/71
Consider a "random structured non-core data" attribute to hold things like
[
    {
        "modified_by":"whoever",
        "modified_date":"whenever"
    }
]
which can do ~anything two new attributes can do.
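To make that concrete, here is a minimal sketch of reading such a value back, assuming the JSON would be stored as the text of a single attribute. The `attribute_value` variable and the `values_for` helper are hypothetical names for illustration, not anything that exists in Arctos:

```python
import json

# Hypothetical: one attribute whose value is a JSON array of objects,
# standing in for two separate new attribute types.
attribute_value = '[{"modified_by": "whoever", "modified_date": "whenever"}]'

records = json.loads(attribute_value)


def values_for(records, key):
    """Collect every value stored under `key` across the objects."""
    return [rec[key] for rec in records if key in rec]


print(values_for(records, "modified_by"))
print(values_for(records, "modified_date"))
```

The tradeoff is that keys inside the JSON are not backed by a code table, so nothing would enforce a definition or a controlled vocabulary for them.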
removed some unused, closing for lack of interest
From https://github.com/ArctosDB/arctos/issues/1597
Ref: https://github.com/ArctosDB/arctos/issues/1620
My reservation here is that we now have ~200 attributes, half-ish undocumented (https://github.com/ArctosDB/arctos/issues/1450). Some (most!?!) of the "documentation" we do have is not useful for any purpose: "measured how?" or "Standard beak measurement for birds". These things have become numerous enough to start causing problems merely by existing. (Operators and researchers may not find what they're looking for, turning them all on causes serious performance problems, etc.)
This request is clearly data which can fit in Attributes. There's a good definition. Given enough of it, we could ask Arctos cool questions about horse teeth.
We could also push it to Media or structured data or similar, which would support the same questions but with a lot more work. ("unformatted measurements" is NOT a suitable target; these kinds of data are formatted.)
A "these things are Attributes" definition from the AWG would be very useful. (I'd probably vote to continue adding anything that fits and looking for solutions to the problems that causes, but I don't think this is my call.)
Here are existing Attributes by frequency of usage. Can anything that's not used much be removed or merged or ??