globalbioticinteractions / msb-para


find a way to resolve [National Museum of Natural History 1392647] to related USNM record [http://n2t.net/ark:/65665/3aeecedcb-1cc4-4936-8aae-eb3b2b555e46] #5

Open jhpoelen opened 3 years ago

jhpoelen commented 3 years ago

In a recent msb-para review we found unresolved occurrence id references like:

      1 found unresolved reference [National Museum of Natural History 77651]
      1 found unresolved reference [National Museum of Natural History 603447]
      1 found unresolved reference [National Museum of Natural History 602540]
      1 found unresolved reference [National Museum of Natural History 298475]
      1 found unresolved reference [National Museum of Natural History 1392647]
      1 found unresolved reference [National Museum of Natural History 1390579]
      1 found unresolved reference [National Museum of Natural History 1390578]
      1 found unresolved reference [National Museum of Natural History 1390577]
      1 found unresolved reference [National Museum of Natural History 1376652]
      1 found unresolved reference [National Museum of Natural History 1376651]
      1 found unresolved reference [National Museum of Natural History 1375891]
      1 found unresolved reference [National Museum of Natural History 1375890]
      1 found unresolved reference [National Museum of Natural History 1375518]
      1 found unresolved reference [National Museum of Natural History 1375517]
      1 found unresolved reference [National Museum of Natural History 1375516]
      1 found unresolved reference [National Museum of Natural History 1375515]
      1 found unresolved reference [National Museum of Natural History 1375514]
      1 found unresolved reference [National Museum of Natural History 1375391]
      1 found unresolved reference [National Museum of Natural History 1375390]
      1 found unresolved reference [National Museum of Natural History 1375389]
      1 found unresolved reference [National Museum of Natural History 1375387]
      1 found unresolved reference [National Museum of Natural History 1373126]
      1 found unresolved reference [National Museum of Natural History 1373122]
      1 found unresolved reference [National Museum of Natural History 1373121]
      1 found unresolved reference [National Museum of Natural History 1373120]
      1 found unresolved reference [National Museum of Natural History 1367425]
      1 found unresolved reference [National Museum of Natural History 1348186]

However, the USNM uses ARK ids for its occurrence ids. Similar to #4, try to find a way to infer which USNM record a reference used in Arctos, like National Museum of Natural History 1392647, points to.

Note, however, that when querying the "raw" dwca USNM record, two hits show up when searching for 1392647: one in the botany collection, and the other in the invertebrate collection.

So, more criteria are needed to make an unambiguous link.

fyi @dustymc @campmlc

$ zipgrep -E "\t1392647\t" 60c10aea7c4dfdfd81e078541a271ceb379ab8605af819f90cd6882e8712d12e 
occurrence.txt:http://n2t.net/ark:/65665/3aeecedcb-1cc4-4936-8aae-eb3b2b555e46  PhysicalObject      http://biocol.org/urn:lsid:biocol.org:col:34871 USNM    Invertebrate Zoology    NMNH Extant Biology PreservedSpecimen   http://n2t.net/ark:/65665/3aeecedcb-1cc4-4936-8aae-eb3b2b555e46 1392647 Original USNPC preservative was a solution of 70% ethanol, 3% formalin, and 2% glycerine {"hostGen":"Erignathus","hostSpec":"barbatus","hostBodyLoc":"sm intestine","hostFldNo":"RLRausch-22610"}   USNPC # 097644  R. Rausch & V. Rausch   1       Alcohol (Ethanol)                   18  18  1959    1   18                      North America, United States, Alaska    North America               United States   Alaska      Gambell (St. Lawrence Isl)      63.7797 -171.741    WGS84               Rausch, R. L.           Diphyllobothrium cordatum   Animalia, Platyhelminthes, Cestoda, Diphyllobothriidea, Diphyllobothriidae  Animalia    Platyhelminthes Cestoda Diphyllobothriidea  Diphyllobothriidae  Diphyllobothrium        cordatum            (Leuckart)
occurrence.txt:http://n2t.net/ark:/65665/3f7b0519b-4ff1-4160-b196-16fc72b83d60  PhysicalObject      http://biocol.org/urn:lsid:biocol.org:col:34871 US  Botany  NMNH Extant Biology PreservedSpecimen   http://n2t.net/ark:/65665/3f7b0519b-4ff1-4160-b196-16fc72b83d60 1392647     s.n.    W. C. Cusick    1                   14073988                        188                 North America, United States, Oregon    North America               United States   Oregon                                                                      Cusickiella douglasii   Plantae, Dicotyledonae, Capparales, Brassicaceae    Plantae     Dicotyledonae   Capparales  Brassicaceae    Cusickiella     douglasii           (A. Gray) Rollins

campmlc commented 3 years ago

All of the MSB:Para references will be to the invertebrate collection, assuming the relationship is "same individual as". Is it possible to get a download of the GloBI data showing both the message "1 found unresolved reference [National Museum of Natural History 77651]" and the Arctos source GUID URL from MSB:Para, or the MSB:Para catalog number? The file we have does not allow us to quickly find the referenced record.
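
A minimal sketch of how the invertebrate-collection criterion could disambiguate the two hits for 1392647. This is an illustration, not GloBI's resolver: the function name `resolve` is hypothetical, and the column names `occurrenceID`, `collectionCode` and `catalogNumber` are an assumption about how the raw USNM occurrence.txt columns map to Darwin Core terms. The sample rows are trimmed from the zipgrep output in the issue description.

```python
import csv
import io
import re

# Arctos-style reference, e.g. "National Museum of Natural History 1392647".
REFERENCE = re.compile(r"^National Museum of Natural History (\d+)$")


def resolve(reference, occurrence_rows, collection_code="Invertebrate Zoology"):
    """Return the occurrenceID for a reference, or None unless exactly one
    row in the given collection carries the referenced catalog number."""
    match = REFERENCE.match(reference.strip())
    if match is None:
        return None
    catalog_number = match.group(1)
    hits = [
        row["occurrenceID"]
        for row in occurrence_rows
        if row["catalogNumber"] == catalog_number
        and row["collectionCode"] == collection_code
    ]
    return hits[0] if len(hits) == 1 else None


# Two rows reduced to three columns; the header names are assumed.
SAMPLE = (
    "occurrenceID\tcollectionCode\tcatalogNumber\n"
    "http://n2t.net/ark:/65665/3aeecedcb-1cc4-4936-8aae-eb3b2b555e46"
    "\tInvertebrate Zoology\t1392647\n"
    "http://n2t.net/ark:/65665/3f7b0519b-4ff1-4160-b196-16fc72b83d60"
    "\tBotany\t1392647\n"
)
rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
print(resolve("National Museum of Natural History 1392647", rows))
```

Returning None when there is not exactly one hit keeps ambiguous or missing catalog numbers unresolved rather than linking them to the wrong specimen.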

On Thu, May 27, 2021 at 6:22 PM Jorrit Poelen @.***> wrote:


jhpoelen commented 3 years ago

@campmlc you should be able to find some of them (relevant to GloBI) in:

https://depot.globalbioticinteractions.org/reviews/globalbioticinteractions/msb-para/indexed-interactions.tsv or https://depot.globalbioticinteractions.org/reviews/globalbioticinteractions/msb-para/indexed-interactions.csv (if you like csv files)

For instance, here's a list of the first 10 interactions that involve an unresolved National Museum of Natural History ... occurrence reference:

$ curl "https://depot.globalbioticinteractions.org/reviews/globalbioticinteractions/msb-para/indexed-interactions.tsv.gz" | gunzip | grep -P "National Museum of Natural History [0-9]+" | head
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28910?seid=4276070  MSB:Para:28910  Parasite specimens  27  MSB     Arostrilepis gardneri           Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis gardneri   kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28910   http://arctos.database.museum/guid/MSB:Para:28910   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28908?seid=4276068  MSB:Para:28908  Parasite specimens  27  MSB     Arostrilepis gardneri           Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis gardneri   kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28908   http://arctos.database.museum/guid/MSB:Para:28908   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28918?seid=4276293  MSB:Para:28918  Parasite specimens  27  MSB     Arostrilepis insperata          Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis insperata  kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28918   http://arctos.database.museum/guid/MSB:Para:28918   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:5877?seid=1873192   MSB:Para:5877   Parasite specimens  27  MSB     Mesocestoides           Animalia | Platyhelminthes | Cestoda | Cyclophyllidea | Mesocestoididae | Mesocestoides kingdom | phylum | class | order | family | genus                           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 298475                       National Museum of Natural History 298475                                               PreservedSpecimen   1954-08-27T00:00:00Z    60.384454   -172.656589     Old quonset, SE valley, St. Matthew Island      http://arctos.database.museum/guid/MSB:Para:5877    http://arctos.database.museum/guid/MSB:Para:5877    globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28912?seid=4276072  MSB:Para:28912  Parasite specimens  27  MSB     Arostrilepis gardneri           Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis gardneri   kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28912   http://arctos.database.museum/guid/MSB:Para:28912   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28906?seid=4276066  MSB:Para:28906  Parasite specimens  27  MSB     Arostrilepis gardneri           Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis gardneri   kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28906   http://arctos.database.museum/guid/MSB:Para:28906   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28907?seid=4276067  MSB:Para:28907  Parasite specimens  27  MSB     Arostrilepis gardneri           Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis gardneri   kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28907   http://arctos.database.museum/guid/MSB:Para:28907   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28919?seid=4279501  MSB:Para:28919  Parasite specimens  27  MSB     Arostrilepis insperata          Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis insperata  kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28919   http://arctos.database.museum/guid/MSB:Para:28919   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28911?seid=4276071  MSB:Para:28911  Parasite specimens  27  MSB     Arostrilepis gardneri           Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis gardneri   kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28911   http://arctos.database.museum/guid/MSB:Para:28911   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
https://en.wiktionary.org/wiki/support  http://arctos.database.museum/guid/MSB:Para:28909?seid=4276069  MSB:Para:28909  Parasite specimens  27  MSB     Arostrilepis gardneri           Metazoa | Platyhelminthes | Cestoda | Cyclophyllidea | Hymenolepididae | Arostrilepis | Arostrilepis gardneri   kingdom | phylum | class | order | family | genus | species             adult           http://purl.obolibrary.org/obo/RO_0002444   parasiteOf  National Museum of Natural History 602540                       National Museum of Natural History 602540                                               PreservedSpecimen   2016-08-26T00:00:00Z    38.4419 -79.7174666667      Monogehela National Forest, Elleber Sods Road (US Forest Service Rd. 1681) 3282 ft elevation        http://arctos.database.museum/guid/MSB:Para:28909   http://arctos.database.museum/guid/MSB:Para:28909   globalbioticinteractions/msb-para   MSB Parasite Collection (Arctos)    http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para   2021-05-17T21:12:15.073Z    625167a97ffcbf26472e83eb5fbaceb05f86c1455b0aa94da8571de47bf85a44    0.10.12
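
To get from rows like these to a compact list pairing each MSB:Para catalog number with the NMNH reference it could not resolve, something along the following lines might work. It is a sketch under assumptions: the function name `unresolved_pairs` is hypothetical, and it matches on the text patterns visible in the rows above instead of relying on column positions, which are not confirmed here. The sample lines are shortened versions of two of the rows listed above.

```python
import re

# Patterns taken from the rows above; column positions are not assumed.
ARCTOS_GUID = re.compile(r"http://arctos\.database\.museum/guid/(MSB:Para:\d+)")
NMNH_REFERENCE = re.compile(r"National Museum of Natural History \d+")


def unresolved_pairs(lines):
    """Return sorted, de-duplicated (MSB:Para catalog number, NMNH reference)
    pairs for every line that contains both patterns."""
    pairs = set()
    for line in lines:
        guid = ARCTOS_GUID.search(line)
        reference = NMNH_REFERENCE.search(line)
        if guid and reference:
            pairs.add((guid.group(1), reference.group(0)))
    return sorted(pairs)


# Shortened sample lines; real rows carry many more tab-separated columns.
sample = [
    "http://arctos.database.museum/guid/MSB:Para:28910?seid=4276070"
    "\tMSB:Para:28910\tparasiteOf\tNational Museum of Natural History 602540",
    "http://arctos.database.museum/guid/MSB:Para:5877?seid=1873192"
    "\tMSB:Para:5877\tparasiteOf\tNational Museum of Natural History 298475",
]
for catalog_number, reference in unresolved_pairs(sample):
    print(catalog_number, reference, sep="\t")
```

The same function could be fed the decompressed lines of indexed-interactions.tsv.gz, for example via `gzip.open(path, "rt")` on a downloaded copy.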