BNHM / AmphibiaWebDiseasePortalAPI

API code for the AmphibiaWebDiseasePortal
https://amphibiandisease.org/

Synonymy lookups and replacement #10

Closed mkoo closed 4 years ago

mkoo commented 4 years ago

I know you just closed the taxonomy sanitizer issue, @jdeck88, but I was confused by the comment that there were no synonymies at the moment. Did you mean none in AW / Joyce's table?

There should be some, and maybe she needs a better field for them (for example, concatenating gaa_name and itis_names with synonymies), so I'm not sure what you mean.

Data-wise, I just noticed that the portal lists Rana catesbeiana and Lithobates catesbeianus as two separate species when the latter is a synonym of the former. So yes, we should address this, since it has repercussions.

jdeck88 commented 4 years ago

There are plenty of synonyms in the amphib_names.json that I fetched off the internet. However, I did not see any of the names that were in the AD portal listed in the synonymies attribute in this list. For example, for Rana catesbeiana the listing looks like this:

    {
      "order":"Anura",
      "family":"Ranidae",
      "genus":"Rana",
      "species":"catesbeiana",
      "common_name":"Bullfrog, American Bullfrog",
      "gaa_name":"Lithobates catesbeianus",
      "synonymies":"",
      "itis_names":"Lithobates catesbeianus",
      "iucn":"Least Concern (LC)",
      "aweb_uid":"4999",
      "uri_guid":"https://amphibiaweb.org/species/4999"
    },

There are no names listed under "synonymies". From your comment, it looks like "Lithobates catesbeianus" should be in this list. If it WERE in the list, the script would have translated the name to Rana catesbeiana. Should we update the AW synonymy list?
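To make the behaviour described above concrete, here is a minimal sketch of a synonym translation that reads only the "synonymies" field. It is not the actual AD portal fetch script: the function names are hypothetical, and the comma delimiter for multi-name fields is an assumption (the record quoted above does not show one).

```python
# Sketch only: function names and the comma delimiter are assumptions,
# not taken from the AD portal code.

def split_names(value, delimiter=","):
    """Split a delimited name string into stripped, non-empty names."""
    return [name.strip() for name in value.split(delimiter) if name.strip()]


def build_synonym_map(records):
    """Map each name in a record's "synonymies" field to its accepted name."""
    synonym_map = {}
    for record in records:
        accepted = f'{record["genus"]} {record["species"]}'
        for synonym in split_names(record.get("synonymies", "")):
            synonym_map[synonym] = accepted
    return synonym_map


def resolve(name, synonym_map):
    """Return the accepted name for a known synonym, else the name unchanged."""
    return synonym_map.get(name, name)


# The Rana catesbeiana record quoted above, abridged to the relevant fields.
bullfrog = {
    "genus": "Rana",
    "species": "catesbeiana",
    "gaa_name": "Lithobates catesbeianus",
    "synonymies": "",
    "itis_names": "Lithobates catesbeianus",
}

synonym_map = build_synonym_map([bullfrog])
# "synonymies" is empty, so the name passes through untranslated.
print(resolve("Lithobates catesbeianus", synonym_map))
```

With the record as quoted, the lookup has nothing to match against, which would explain why the portal ends up with both names as separate species.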

mkoo commented 4 years ago

Yep, that's what I thought. When AW does a synonym lookup, it uses all three fields (gaa_name, synonymies, itis_names), since they come from different sources via batch updates.

So shall I ask Joyce to concatenate the other fields into synonymies?
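As an illustration of a lookup that consults all three fields, here is a rough sketch. It is not AW's actual lookup code: the helper names are hypothetical and the comma delimiter is an assumption.

```python
# Sketch only: helper names and the comma delimiter are assumptions.

SYNONYM_FIELDS = ("gaa_name", "synonymies", "itis_names")


def names_in(record, fields=SYNONYM_FIELDS, delimiter=","):
    """Collect every non-empty name found in the given fields of a record."""
    found = set()
    for field in fields:
        for name in record.get(field, "").split(delimiter):
            if name.strip():
                found.add(name.strip())
    return found


def accepted_name_for(name, records):
    """Return the accepted name if any of the three fields lists `name`."""
    for record in records:
        if name in names_in(record):
            return f'{record["genus"]} {record["species"]}'
    return name


# The Rana catesbeiana record from this thread, abridged.
bullfrog = {
    "genus": "Rana",
    "species": "catesbeiana",
    "gaa_name": "Lithobates catesbeianus",
    "synonymies": "",
    "itis_names": "Lithobates catesbeianus",
}

print(accepted_name_for("Lithobates catesbeianus", [bullfrog]))  # Rana catesbeiana
```

A linear scan keeps the sketch short; for a full names file, building a dictionary once would be the more practical choice.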

jdeck88 commented 4 years ago

Oh, I see that now! To me, it seems like all synonyms should be listed under synonymies...

JoyceGross commented 4 years ago

I can't think of a good reason not to put all the synonymies in the synonymies field. Historically, we were going to use that field just for synonymies that we added ourselves, and to keep track of the source of our synonymies. The itis_names were harvested from ITIS (whose amphibian names are based on ASW). We are not up to date with what's on ITIS. The gaa_names (IUCN names) are now completely out of date and we have no immediate possibility of updating those names.

It would be cleaner to store everything in the synonymies field. I don't think anyone has cared what the source of our synonymies is. I guess if we harvest again from ITIS (to see if we can pick up missing synonymies) we can still use the itis_names field just for that, and copy any new synonymies to the synonymies field (or just skip using the itis_names field if no one cares about it).

jdeck88 commented 4 years ago

OK, great. Let me know when you can update the amphib_names.json file and I can update the AD portal fetch script.

jdeck88 commented 4 years ago

This issue is fixed in this code-base... waiting on updated amphib_names.json as well as implementation of #9

JoyceGross commented 4 years ago

https://amphibiaweb.org/amphib_names.json has all the synonymies combined into one field now, the "synonymies" field. I removed "itis_names" and "gaa_name" from the json file. I made sure the names were just listed once, since they were sometimes repeated in the three fields (and sometimes a "synonymy" was the same as the accepted name). That was a little messy to sort out. I guess the data will get messy again as we don't do a great job of updating synonymies when names are changed. That's a separate not-currently-important issue.
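The consolidation described above (merge the three fields, list each name once, drop any "synonym" equal to the accepted name, remove the old fields) could look roughly like the sketch below. This is not the script actually used for AmphibiaWeb: the function name is hypothetical, and the comma delimiter is an assumption.

```python
# Sketch only: not the script used for AmphibiaWeb. The function name and
# the comma delimiter are assumptions.

OLD_FIELDS = ("itis_names", "gaa_name")


def consolidate_synonymies(record, delimiter=","):
    """Return a copy of `record` with all synonyms merged into "synonymies"."""
    accepted = f'{record["genus"]} {record["species"]}'
    merged = []
    for field in ("synonymies",) + OLD_FIELDS:
        for name in record.get(field, "").split(delimiter):
            name = name.strip()
            # Skip blanks, repeats, and "synonyms" equal to the accepted name.
            if name and name != accepted and name not in merged:
                merged.append(name)
    cleaned = {k: v for k, v in record.items() if k not in OLD_FIELDS}
    cleaned["synonymies"] = (delimiter + " ").join(merged)
    return cleaned


# Old-format record from earlier in this thread, abridged.
old_record = {
    "genus": "Rana",
    "species": "catesbeiana",
    "gaa_name": "Lithobates catesbeianus",
    "synonymies": "",
    "itis_names": "Lithobates catesbeianus",
}

new_record = consolidate_synonymies(old_record)
print(new_record["synonymies"])  # Lithobates catesbeianus
```

Keeping the merged names in a list rather than a set preserves the order in which they were first seen, which makes the output stable between runs.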

jdeck88 commented 4 years ago

I've downloaded that file and re-ran the script to build the API. This changed a lot of files. Attaching the diff of scientificName_listing.json to show what has changed, name-wise.

diff --git a/data/scientificName_listing.json b/data/scientificName_listing.json
index 36fb15c..e2cd830 100644
--- a/data/scientificName_listing.json
+++ b/data/scientificName_listing.json
@@ -31,8 +31,8 @@
 {"scientificName" : "Amphiuma means" , "order" : "Caudata" , "family" : "Amphiumidae", "associatedProjects" : [{"projectId" : "292" , "count" : 2}]},
 {"scientificName" : "Amphiuma tridactylum" , "order" : "Caudata" , "family" : "Amphiumidae", "associatedProjects" : [{"projectId" : "292" , "count" : 10}]},
 {"scientificName" : "Anaxyrus americanus" , "order" : "Anura" , "family" : "Bufonidae", "associatedProjects" : [{"projectId" : "246" , "count" : 96}]},