ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

part name cleanup #1131

Closed dustymc closed 6 years ago

dustymc commented 7 years ago

!!! IMPORTANT !!!

This issue is ongoing; most of what's here has been done. Scroll to the bottom for the latest.

!!! IMPORTANT !!!

can we change "fetus" to "whole organism" and add an age class?

can we change "embryo" to "whole organism" and add an age class?

can we change "other" to "unknown" (or vice versa)?

can we change...

... to "media" (and add a remark???)

Can we change phalanges to phalange?

Can we move "partial ..." to condition and remove it from part name?


UAM@ARCTOS> select distinct part_name from ctspecimen_part_name where part_name like '%partial%';

PART_NAME
------------------------------------------------------------------------------------------------------------------------
partial organism (70% ethanol)
partial skeleton
partial organism (80% ethanol)
partial skeleton (dry)
partial strobila (slide)
partial organism (95% ethanol)

6 rows selected.
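One way to picture the "partial ..." move is a small Python sketch that splits the leading "partial" off each of the six names returned above and reports it as a condition value. The helper name `split_partial` and the condition value "partial" are assumptions made for illustration; nothing here reflects how Arctos actually stores condition.

```python
import re

# Part names returned by the query above.
PARTIAL_PARTS = [
    "partial organism (70% ethanol)",
    "partial skeleton",
    "partial organism (80% ethanol)",
    "partial skeleton (dry)",
    "partial strobila (slide)",
    "partial organism (95% ethanol)",
]


def split_partial(part_name):
    """Return (new_part_name, condition); condition is None if nothing was stripped."""
    match = re.match(r"^partial\s+(.+)$", part_name)
    if match is None:
        return part_name, None
    # "partial" as the condition value is an assumption, not an Arctos term.
    return match.group(1), "partial"


for old_name in PARTIAL_PARTS:
    new_name, condition = split_partial(old_name)
    print(f"{old_name!r} -> part={new_name!r}, condition={condition!r}")
```

Names that do not start with "partial" pass through unchanged, so the same helper could be run over the whole code table.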

ekrimmel commented 7 years ago

no opinions on media, but I like the fetus/embryo/other proposed part name changes

AJLinn commented 7 years ago

I'm okay with the general term "media" with a remark to indicate the type. I don't know the difference between "photograph" and "image" anyway.

ccicero commented 7 years ago

I'm good with these changes.

atrox10 commented 7 years ago

Ditto on what Erica said.

campmlc commented 7 years ago

I agree with these proposed changes.

Jegelewicz commented 7 years ago

Sounds good to me

dustymc commented 7 years ago

The straight vocabulary replace got a little out of control, so I made a spreadsheet. https://docs.google.com/spreadsheets/d/1qI6syTLbWGb7u3MZP9aYcuicSr1WBtCZv4GlfcorPCo/edit?usp=sharing

This is a proposal, not necessarily recommendations; if your users find value in being able to search for "left dentary" instead of just "dentary," (or whatever), we can talk. The intent in all cases would be to update the existing part to the proposed new part and add a comment of "part was given as {old_part_name}." (Other suggestions welcome, of course.)
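The intended update (swap in the proposed part name and keep the old one in a remark) might look roughly like this in Python. The mapping entry and the `remap_part` helper are hypothetical; the real mapping is whatever ends up in the spreadsheet, and how remarks are joined is an assumption.

```python
# Hypothetical old -> new mapping, standing in for the spreadsheet.
PART_MAP = {"left dentary": "dentary"}


def remap_part(part_name, remark, part_map):
    """Return (part_name, remark) after applying the proposed mapping."""
    new_name = part_map.get(part_name)
    if new_name is None or new_name == part_name:
        # Not in the mapping, or mapped to itself: leave the record alone.
        return part_name, remark
    note = f"part was given as {part_name}."
    # Joining with "; " is an assumption about remark formatting.
    new_remark = f"{remark}; {note}" if remark else note
    return new_name, new_remark


print(remap_part("left dentary", "", PART_MAP))
print(remap_part("left dentary", "broken", PART_MAP))
print(remap_part("skull", None, PART_MAP))
```

Keeping the old name in the remark means the change can be traced (and, if needed, reversed) per record.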

AWG, can we set a firm timeline for comments?

Goals are twofold:

  1. reduce redundancy, but not at the cost of usability. If "thing" and "thing (modifier)" or "thing" and "other thing" are close enough to the same that most users will want to find them both, merge them. If they're usefully distinct, do not.

  2. Help with sorting.

is easy to work with, while

is difficult.

Random confusing stuff:

dustymc commented 7 years ago

Media done (-23 part names).

dustymc commented 7 years ago

I deleted this comment above:

Adding Attributes is leading to confusing data when eg, a fetus is cataloged as a part of its parent, which seems to be most of these. Instead, I'll update part remark of the new "whole organism"/former "fetus" (etc.) part.

I still don't like those as parts, but I think it's less confusing than the proposed update, which would lead to things like "parts=heart, liver, skeleton, skin, whole organism."

Clever ideas for handling this better?

dustymc commented 7 years ago

phalanges

We also have "spine" and "spines" - I propose we eliminate all non-singular part names and document that (eg so they don't get re-introduced). Someone stop me now if there's a reason to not do this....

Can we also get rid of all "...(s)" part names ("ectoparasite(s) (ethanol)")? The "(s)" bit can be derived from lot count and does not seem necessary or useful in part name.
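As a rough illustration of dropping the "(s)" marker and deriving the plural hint from lot count instead, here is a small Python sketch. The helper names and the display format are assumptions; only the example part name comes from the comment above.

```python
import re


def drop_optional_plural(part_name):
    """Remove a '(s)' attached to a word: 'ectoparasite(s)' -> 'ectoparasite'."""
    return re.sub(r"(?<=\w)\(s\)", "", part_name)


def describe(part_name, lot_count):
    """Show the count next to the singular name (display format is made up)."""
    return f"{lot_count} x {drop_optional_plural(part_name)}"


print(drop_optional_plural("ectoparasite(s) (ethanol)"))
print(describe("ectoparasite(s) (ethanol)", 12))
```

The lookbehind only strips "(s)" when it directly follows a word character, so a parenthetical such as "(slide)" is left untouched.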

campmlc commented 7 years ago

Yes, please get rid of all plural (s) parts and convert to singular.

ekrimmel commented 7 years ago

Assorted opinions on above:

  • pro all part names being singular
  • pro "thing (fossilized)" vs. "fossil"
  • pro ditching "observation" as that should be the kind of specimen event
  • pro all of the part clean-up suggestions you have in the spreadsheet https://docs.google.com/spreadsheets/d/1qI6syTLbWGb7u3MZP9aYcuicSr1WBtCZv4GlfcorPCo/edit?usp=sharing. When you make these part clean-up changes you'll check for and add if necessary the age classes, yes?
  • I could see "body" meaning "whole organism minus head" (which I've seen here and there in the collections) but I think that part could better be represented as "whole organism" with a remark that it's missing the head.

campmlc commented 7 years ago

Agree, with the exception that we need to keep observation. We use it in MSB Host for a catalog record of the host based on parasite data. No other part info is available at the time of cataloging, and we don't want people requesting these records for loans.

dustymc commented 7 years ago

singular

Yay, working on that now.

age classes

No, I've given up on that for the moment - somewhere ^^ up there^^ I said

I still don't like those as parts, but I think it's less confusing than the proposed update, which would lead to things like "parts=heart, liver, skeleton, skin, whole organism."

body

I dislike remarks for "important bits missing" - it's not searchable. (I may ultimately dislike that less than other ideas, but still...)

campmlc commented 7 years ago

I also support Dusty's suggestion of keeping important info out of remarks. Anything to clarify for potential users to reduce loan requests of inappropriate material.

dustymc commented 7 years ago

catalog record of the host based on parasite data

I maintain that parts are "things to which one can stick barcodes" - they are/should be physical somehow-discrete THINGs.

The RECORDs may be useful for (data) loans. You've got a squirrel-parasite-host from THERE, might be a good place to look for squirrels....

You don't have/know of parts, so just don't enter any.

https://arctos.database.museum/info/ctDocumentation.cfm?table=CTSPECIMEN_PART_NAME "accepted place of collection" is probably appropriate, and part "observation" seems very confusing when mixed with event type observation which explicitly requires "No biological samples were taken."

dustymc commented 7 years ago

https://docs.google.com/spreadsheets/d/1WZMkrnyZe5hxRCI6kuQT1xagJtxscVwXMFaKZB1FMM8/edit?usp=sharing is an attempt at the singularized part names and definitions; everyone can edit.

Please review if possible, I'll try to update it later today. I don't think there are any significant changes in there, but hopefully it'll make things more readable/sortable.

@campmlc I don't know what "... (ethanol-fixed)" means - help?

@DerekSikes is "cryovial tissues" just "tissue"?

@ccicero is "parasitic eggs" just "egg"?

And "skeletal element(s)" still makes me twitchy for some reason. "Bone"?

dustymc commented 7 years ago

Slide-containing parts are at https://docs.google.com/spreadsheets/d/1rET-CZ5EMmYFLq0Zl26mD3WkOhBsM5Z3bWJKx3jxVNM/edit?usp=sharing. I don't know what any of that stuff is and need a lot of help - PLEASE edit!

campmlc commented 7 years ago

Please keep observation as a part until we can get more feedback from parasitology. There are currently 22,045 observation specimen records in MSB Host and another 13,206 in MSB Para. All are specimen event type = accepted place of collection because they were entered with shared collecting events with their parasites/hosts. We have been using part = observation because it allows us to record verbatim part info in remarks pending possibly locating an actual host or parasite specimen in a different collection. Eliminating part = observation requires a complete change in the model in current use for MSB Para and MSB Host.

http://arctos.database.museum/guid/MSB:Host:1240

dustymc commented 7 years ago

Please keep observation as a part until we can get more feedback

Sure, no problem, I just don't want it to fall off the radar - it's weird and needs eliminated or documented or something.

verbatim part info

How about part=skull, disposition=missing (or something more explicit and specialized even - "probably exists somewhere, but we don't have it...")?

And maybe we need an "available for loan" generalization of disposition for search, maybe up by the "tissues?" option:

EXCLUDE (on loan, missing, used up, ...)
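A rough Python sketch of such an "available for loan" filter follows. The record layout, the `available_for_loan` name, and the "in collection" disposition are assumptions; only the three excluded values are taken from the list above, which is itself open-ended.

```python
# Dispositions named in the comment above; the original list ends in "...".
UNAVAILABLE = {"on loan", "missing", "used up"}


def available_for_loan(parts, excluded=UNAVAILABLE):
    """Keep only parts whose disposition is not in the excluded set."""
    return [p for p in parts if p["disposition"] not in excluded]


# Hypothetical part records for illustration.
parts = [
    {"part_name": "skull", "disposition": "missing"},
    {"part_name": "skin", "disposition": "in collection"},
    {"part_name": "liver (frozen)", "disposition": "used up"},
]

for part in available_for_loan(parts):
    print(part["part_name"])
```

Because the filter only hides parts from one search option, a "missing" part can still be shown on the record itself.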

campmlc commented 7 years ago

In the case of the example I sent, the actual skull is at the Smithsonian. However, several years ago we cataloged the observation from the Rausch ledger, including all the data and the linkage to the Rausch media. We don't want people thinking that our record is that actual skull - we just have the data online, and only recently linked our data record to the Smithsonian catalog number. This is the model we have used for the entire Rausch collection and the basis for setting up the host and parasite collections in Arctos.

dustymc commented 7 years ago

We don't want people thinking that our record is that actual skull

I don't think I see a distinction. "Someone says there's a skull [and we never had it, or we've lost it, or it's used up, or it's on loan, or we gave it away] but we don't have it."

recently linked our data record to the Smithsonian catalog number

You haven't linked until their number does something (eg, opens a web page), and you haven't usefully linked until that "something" can support the "canid hosts of tapeworm" things that paired records in Arctos can. I think this is all a compelling reason to have some data in Arctos even if you don't "own" the specimens, but I still don't see what that has to do with weird parts. Maybe part disposition "observation" (=someone says they saw a skull) makes sense????

jldunnum commented 7 years ago

I'd like to also ask that when a part is listed as missing it still show up under parts. Currently when you do a search it doesn't appear at all if listed as missing. I'd like to see what is supposed to be there and then when you go into the individual record you can see that it currently is listed as missing.

dustymc commented 7 years ago

still show up under parts

658

campmlc commented 7 years ago

Ditto to Jon's request.

For most of the observational records in MSB Para and MSB Host, we do not know what part(s) may or may not exist since the media does not always record this. Observation based on media description is the best term for what we have. I don't see the problem with keeping observation as it is essential for at least two collections.

dustymc commented 7 years ago

do not know what part(s) may or may not exist

"Other" (and unknown and maybe some other stuff I haven't yet tracked down) exists for that.

don't see the problem

If parts are physical things, then "observation" clearly can't be that. If parts are something else, we should change our definition (and perhaps model). Mixing concepts, or doing two things with one "field," is usually a really great way to assure that the data can't answer the questions that one might want to ask of it. This is also confusing in the context of specimen-event observations; we're using one term to mean very different things.

That said, this is about No. 799 on my list of things we need to clean up, I don't think it's part of the clutter preventing us from figuring out what "tissues" means, and I'm happy enough to ignore it for now.

dustymc commented 7 years ago

...s and ...(s) parts are no more. I still need help with "slide" parts, https://docs.google.com/spreadsheets/d/1rET-CZ5EMmYFLq0Zl26mD3WkOhBsM5Z3bWJKx3jxVNM/edit#gid=0

There were some misses from the spreadsheet, I did this:

| OLD | NEW | DEF |
| --- | --- | --- |
| vial (70% ethanol) tissues | vial (70% ethanol) tissue | unidentified tissues in ethanol of unspecified concentration |
| ear bones | ear bone | unspecified bone from the ear |
| skeletal element(s) (dry) | skeletal element (dry) | part of an incomplete skeleton for which a count of the bones is not feasible, stored dry |
| tissue section(s) (slide) | tissue section (slide) | MLC HELP! |
| slide tissues | tissue (slide) | unidentified tissue on a slide |

@DerekSikes I did NOT do anything with "pinned tissues" - is that real?!?

I also did nothing with "body part(s)" - can that be "unknown"?? (Is there another source of parts?!?) Here's who uses that:

UAM@ARCTOS> select distinct guid_prefix from specimen_part,cataloged_item,collection where specimen_part.derived_from_cat_item=cataloged_item.collection_object_id and cataloged_item.collection_id=collection.collection_id and part_name='body parts';

GUID_PREFIX
------------------------------------------------------------
CUMV:Bird
UCM:Bird
MVZ:Bird
DMNS:Bird
NMU:Bird
UCM:Herp
UWBM:Herp

dustymc commented 7 years ago

other-->unknown is https://docs.google.com/spreadsheets/d/1LDgB_Qr76-Jeid2yzErw7ehTyCRoPAa4nAegX_JYtjc/edit#gid=0 - I'll go ahead with that update if nobody stops me soon.

dustymc commented 7 years ago

These parts are not used. Any objections to deleting them? @KatherineLAnderson lots of paleo stuff here...

UAM@ARCTOS> select distinct part_name from ctspecimen_part_name where part_name not in (select part_name from specimen_part) order by part_name;

PART_NAME
------------------------------------------------------------------------------------------------------------------------
bark
blood (Longmire's buffer)
brain (70% ethanol)
carpus
cleithrum
conifer shoot
egg (50% isopropanol)
flower
frame
frontoparietal
gastrolith
gill cover
heart (DMSO/EDTA)
heart, kidney, liver, lung, spleen (alcohol)
hyomandibula
kidney (DMSO/EDTA)
liver (DMSO/EDTA)
lung (formalin-fixed, ethanol preserved)
muscle (DMSO/EDTA)
nail
nest (pinned)
opisthotic
orbitosphenoid
partial organism (80% ethanol)
partial organism (95% ethanol)
petrotympanic complex
postcranial skeleton (ethanol, formalin-fixed)
presphenoid
skin (ethanol/glycerin)
supracleithrum
tail tip (DMSO/EDTA)
talus
tissue (DMSO/EDTA/NaCl)
tissue (formalin-fixed)
triangular
tympanic
zygomatic plate
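The `NOT IN` query above is a set difference: names in the code table minus names actually used on specimen parts. A minimal Python sketch of the same idea, using made-up sample data rather than real table contents:

```python
def unused_parts(code_table_names, used_names):
    """Names defined in the code table but never used on a specimen part."""
    return sorted(set(code_table_names) - set(used_names))


# Illustrative sample only; not the real contents of either table.
code_table_names = ["bark", "carpus", "skin", "skull"]
used_names = ["skull", "skin", "skull"]

print(unused_parts(code_table_names, used_names))
```

Anything a collection expects to need soon would have to be taken out of the result by hand before deleting.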

campmlc commented 7 years ago

Please keep the following - we will need them shortly:

rostellar hooks (slide)
scolex (slide)

dustymc commented 7 years ago

Thx, deleted from list

dustymc commented 7 years ago

Can we try to put the part-component first in the part-name? We currently have...

_Power skin_
stuff
other stuff
more stuff
_body skin_
stuff
other stuff
more stuff
_flat skin_
stuff
other stuff
more stuff
_skin_
_skin (50% isopropanol)_
stuff
other stuff
more stuff
_study skin_

and so I have to scroll through many hundreds of parts to find all the skins. If we had instead...

stuff
other stuff
more stuff
skin, Power
skin, body
skin, flat
skin
skin (50% isopropanol)
skin, study
stuff
other stuff
more stuff

it would be a LOT easier to tell at a glance what sort of "skin" material is available, and would greatly reduce the likelihood of creating "skin (study)" and "study skin" type functional duplicates.
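The sorting effect can be checked with a short Python sketch. It sorts both naming styles case-insensitively and reports where the "skin" parts land; the helper names are made up, and "liver" and "skull" stand in for the "stuff" entries above.

```python
import re


def sorted_names(names):
    """Case-insensitive alphabetical order, as a pick list might show it."""
    return sorted(names, key=str.lower)


def skin_positions(names):
    """Indices (after sorting) of names that contain the word 'skin'."""
    ordered = sorted_names(names)
    return [i for i, n in enumerate(ordered)
            if "skin" in re.findall(r"[a-z]+", n.lower())]


def is_contiguous(positions):
    """True when the positions form one unbroken run."""
    return positions == list(range(positions[0], positions[-1] + 1))


current = ["Power skin", "body skin", "flat skin", "liver",
           "skin", "skin (50% isopropanol)", "skull", "study skin"]
proposed = ["skin, Power", "skin, body", "skin, flat", "liver",
            "skin", "skin (50% isopropanol)", "skull", "skin, study"]

print(is_contiguous(skin_positions(current)))   # modifier-first names scatter
print(is_contiguous(skin_positions(proposed)))  # component-first names group
```

With the component first, every skin variant shares the prefix "skin", so an ordinary alphabetical sort is enough to group them.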

campmlc commented 7 years ago

Great idea!

ekrimmel commented 7 years ago

Ditto, great idea

dustymc commented 7 years ago

OK, moving on with part-thing first. To figure out where to start, I extracted "words" from part names and thought it was sorta neat so: https://docs.google.com/spreadsheets/d/1wAVDsVQ9A2wCVc9e9NfGURrIF0IsCHR0vXuPx9SumyQ/edit#gid=619201898. It's very quick-n-dirty - we don't have 67 ears, but 67 ear + hEARt etc.
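The "ear + hEARt" effect comes from counting substrings rather than whole words. A small Python sketch of the difference, with a made-up sample of part names (this is not the method used to build the spreadsheet, which is not described here):

```python
import re
from collections import Counter

# Illustrative sample of part names.
parts = ["ear", "ear bone", "heart", "heart (frozen)"]


def substring_count(names, word):
    """Quick-n-dirty count: any name containing the letters anywhere."""
    return sum(word in name.lower() for name in names)


def token_counts(names):
    """Count each whole word once per part name."""
    counts = Counter()
    for name in names:
        counts.update(set(re.findall(r"[a-z]+", name.lower())))
    return counts


print(substring_count(parts, "ear"))   # also matches "heart"
print(token_counts(parts)["ear"])      # whole-word matches only
```

Whole-word counts are the better guide for deciding which part component to put first.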

I also cleaned up a bunch of obvious stuff - we had RNALater and RNAlater, for example. For those of you creating parts, please be extra paranoid about introducing almost-duplicate terms.
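Almost-duplicates that differ only in letter case or spacing, like RNALater and RNAlater, can be flagged before a new part is created. A minimal Python sketch, assuming a hypothetical `near_duplicates` helper and an invented sample list:

```python
from collections import defaultdict


def near_duplicates(names):
    """Group names that are identical after case-folding and whitespace cleanup."""
    groups = defaultdict(list)
    for name in names:
        key = " ".join(name.casefold().split())
        groups[key].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# The "RNALater" spelling is included here only to show the check firing.
names = ["cloacal swab (RNAlater)", "cloacal swab (RNALater)", "oral swab"]

print(near_duplicates(names))
```

This only catches case and spacing variants; functional duplicates such as "skin (study)" and "study skin" still need a human eye.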

dustymc commented 7 years ago

The most-obvious/problematic part names first are https://docs.google.com/spreadsheets/d/1aTzwOWxEuW8JJdPiQQKpSo7_i-m7sUhdgTz9WOIC52A/edit?usp=sharing. I don't think the changes are significant - they just sort better - so unless someone stops me I'll load it ASAP. Definitions do need your input - I'm not sure why we have about half of that stuff.

KatherineLAnderson commented 7 years ago

In the above list, request to keep: cone

dustymc commented 7 years ago

cone

Removed from the list - thx

dustymc commented 7 years ago

@KatherineLAnderson I've got dorsal vertebra mapped to vertebra, thoracic - please edit https://docs.google.com/spreadsheets/d/1aTzwOWxEuW8JJdPiQQKpSo7_i-m7sUhdgTz9WOIC52A/edit#gid=1331535064 if not. Also what's "flat rib"?

KatherineLAnderson commented 7 years ago

I added a definition for dorsal vertebra. "Flat rib" can be mapped to rib.

dustymc commented 7 years ago

https://docs.google.com/spreadsheets/d/1aTzwOWxEuW8JJdPiQQKpSo7_i-m7sUhdgTz9WOIC52A/edit#gid=1331535064 will load to Arctos tomorrow, 2017-07-25 5PM Pacific.

  1. If two or more existing parts (PART_NAME column) are the same, merge them by making the values in NEWPARTNAME the same.
    1. rib-->rib (no changes)
    2. flat rib --> rib (data will change, "flat rib" will be removed from the code table)
  2. If they are not the same, or we can't tell from this, do not merge them, but change the value in NEWPARTNAME to the format of "{part_name}, modifiers (or modifiers)" when necessary - that is, put the part name first so this sorts better. (We may need to revisit what's parenthetical etc. later.)
  3. Add definitions to the DEF column when possible; this will load to Arctos.
  4. wat?!? column is things that need defined and possibly merged; it's just notes and won't load to Arctos.
  5. This is an iterative process; the spreadsheet does not need to be perfect (just better!), things from here will probably appear in subsequent cleanup efforts.
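The merge rule in the list above can be sketched in Python: rows that share a NEWPARTNAME collapse into one part, and any PART_NAME that differs from its NEWPARTNAME drops out of the code table. The column names come from the spreadsheet description; the `plan_load` helper and the sample rows are assumptions, not the actual loader.

```python
from collections import defaultdict

# Sample rows; "study skin" -> "skin, study" is illustrative only.
rows = [
    {"PART_NAME": "rib", "NEWPARTNAME": "rib"},
    {"PART_NAME": "flat rib", "NEWPARTNAME": "rib"},
    {"PART_NAME": "study skin", "NEWPARTNAME": "skin, study"},
]


def plan_load(rows):
    """Return (new name -> old names, old names leaving the code table)."""
    merged = defaultdict(list)
    for row in rows:
        merged[row["NEWPARTNAME"]].append(row["PART_NAME"])
    retired = sorted(
        old for new, olds in merged.items() for old in olds if old != new
    )
    return dict(merged), retired


merged, retired = plan_load(rows)
print(merged)
print(retired)
```

Reviewing the retired list before loading is a cheap way to catch an accidental merge.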

dustymc commented 7 years ago

TO DELETE: just map PART_NAME to some existing part.

flat rib --> rib (data will change, "flat rib" will be removed from the code table)

dustymc commented 7 years ago

The last spreadsheet is loaded (warts and all...) and it's BEAUTIFUL! https://docs.google.com/spreadsheets/d/1TV69LgIW7KErcYSDTkkxvcmECuLnh1zzdicnvSuak-g/edit?usp=sharing

Well, almost. https://docs.google.com/spreadsheets/d/1QLaF3W0VjJYG4zxTrJlpEhD0OGjBdVcbQBpcJVMVqr4/edit?usp=sharing is some obvious outliers, one mistake from the last round, and parts which contain "other" or "unknown."

Is there a preference for "other" or "unknown"? If not I'll flip a coin or something.

Help is always appreciated, but I think this is straightforward and I'll plan to finish it up and re-load tomorrow unless I hear otherwise.

DerekSikes commented 7 years ago

unknown is more universal. I prefer it.

campmlc commented 7 years ago

Ditto

dustymc commented 7 years ago

re: https://docs.google.com/spreadsheets/d/1QLaF3W0VjJYG4zxTrJlpEhD0OGjBdVcbQBpcJVMVqr4/edit#gid=1127423690

@AJLinn can you define "mount"?
@KatherineLAnderson please help with flat metacarpal, %centrum(=partial vertebra??), %neural arch(=partial vertebra??)
@ccicero parasitic egg?
@DerekSikes pinned nest, pinned tissues??

I've eliminated "... (unknown)" - given just "egg" (etc.) I can't have any idea what you've done with it (and we're updating definitions to make that explicit), so I don't see the value in being told that the information which isn't being given isn't being given. Please let me know if I'm seeing that incorrectly.

dustymc commented 7 years ago

https://docs.google.com/spreadsheets/d/1J5tL20-_TTZgoFogq1ejZ6B3M1u1ZLOob9N6eNTo5A4/edit#gid=555921353 is parts which contain "whole" "body" or "carcass."

Despite any implications of the word "whole," can those be merged - eg, is "carcass" == "whole organism (something or another)"??

What are "body parts"? Can those all merge with "unknown," or are there also non-body parts we're avoiding with those terms, or ????

AJLinn commented 7 years ago

@AJLinn can you define "mount"?

Done - let me know if it needs further explanation

KatherineLAnderson commented 7 years ago

I added definitions for the centra and neural arches. Please map "flat metacarpal" to metacarpal.

dustymc commented 7 years ago

Still to-do: https://docs.google.com/spreadsheets/d/1J5tL20-_TTZgoFogq1ejZ6B3M1u1ZLOob9N6eNTo5A4/edit#gid=555921353

Moving on....

These still seem unlikely:

pinned nest
pinned tissues

There are a few "visual outliers" that I keep noticing, but not sure what to do with them. (Maybe they don't need anything.)

"contents" do not sort nicely, and contain...

bill content (dry)
cheek content (dry)
crop content (70% ethanol)
egg contents (frozen)
hindgut content (70% ethanol)
stomach content

There are some random "...bone" things that probably aren't very discoverable

bone
bone (frozen)
bone marrow (frozen)
ear bone
leg bone
leg bone (frozen)
long bone

Ditto "swab"

buccal and cloacal swab (RNAlater)
cloacal swab (RNAlater)
dorsum swab
nasal swab (frozen)
oral swab
oral swab (formalin)
oral swab (frozen)

various plant-bits floating around ungrouped:

cone
seed
fruit
probably other stuff

coal ball

¯\_(ツ)_/¯

MAYBE it would somehow be useful to group part-parasites (cestode, nematode, etc.) - "parasite, cestode, bla (stuff)" is the best I can come up with...

Thoughts?