globalbioticinteractions / mycoportal

configuration for indexing MyCoPortal via Symbiota RSS feed

sourceBasisOfRecordName #1

Open seltmann opened 3 years ago

seltmann commented 3 years ago

@jhpoelen Please consider updating the GloBI import to include a sourceBasisOfRecordName. Presently, 1,278,142 records from mycoportal.org do not include BasisOfRecord information. This would be useful for understanding which records in GloBI come from PreservedSpecimens and which from HumanObservations. Thanks!

jhpoelen commented 3 years ago

@seltmann thanks for sharing your suggestion.

I produced the following table from the MyCoPortal data generously provided by @stbates @thebateslab for indexing by GloBI. I believe the dataset was a temporary substitute while maintenance was being done. As you can see, basis of record was not (yet) included in this data file.

If you have ideas for a better source of mycoportal data, please do share.

via

$ curl -L "https://github.com/globalbioticinteractions/mycoportal/raw/main/tmp_host_info-20200902.csv.gz"\
 | gunzip\
 | head\
 | mlr --icsv --omd cat

the following markdown table was generated:

occid sciname country locality substrate host associatedtaxa catalogNumber collid
3042414 United States Populus balsamifera CBRU00003514 54
3043800 Philippines on N slope of Mt. Apo Coleoptera: Staphylinidae SYRF0007954 33
3043970 RMS0004987 35
3044320 RMS0005073 35
3050053 RMS0005366 35
2854245 United States Greenhouse Solanum tuberosum CUP-029072 23
2854248 United States Pl. Path. Greenhouse Lilium longiflorum giganteum CUP-021618 23
2854252 United States potato tuber Solanum tuberosum CUP-029215 23
2854256 Solanum tuberosum CUP-012554 23
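
A quick way to confirm that a column is absent is to test the CSV header directly. The following is a minimal sketch, not part of the GloBI tooling: the `has_column` helper is hypothetical, the header is assumed to be comma-separated with no quoted commas, and the column names are the ones shown in the table above.

```shell
# Hypothetical helper: exits with status 0 if the CSV header on stdin
# contains the named column (case-insensitive), non-zero otherwise.
# Assumes a simple header line without quoted commas.
has_column() {
  head -n 1 | tr ',' '\n' | tr -d '\r"' | grep -qixF -- "$1"
}

# Header as it appears in the table above (comma delimiter is an assumption).
header='occid,sciname,country,locality,substrate,host,associatedtaxa,catalogNumber,collid'

if printf '%s\n' "$header" | has_column basisOfRecord; then
  echo "basisOfRecord present"
else
  echo "basisOfRecord missing"
fi
```

With the actual file, the same check could be applied by piping the `curl ... | gunzip` output into `has_column basisOfRecord` instead of the literal header.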

jhpoelen commented 9 months ago

Now that MyCoPortal is back online, I switched to using the MyCoPortal RSS feed instead of the intermediate data dump provided earlier.

With this, the basisOfRecord values are showing up.

For example, see the 9 associations below, extracted using:

elton update globalbioticinteractions/mycoportal 
elton interactions | head | mlr --itsvlite --oxtab cat 

yielding

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548784
sourceCatalogNumber                    76001
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          85742
sourceTaxonName                        Aecidium alstroemeriae
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Pucciniaceae | Aecidium | Aecidium alstroemeriae
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Alstroemeria sp.
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1919-09-18T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Papudo
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582868
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582868
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548785
sourceCatalogNumber                    6428
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          
sourceTaxonName                        Aecidium borricheae
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        
sourceTaxonPathNames                   
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Borrichia frutescens
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1901-04-27T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Fort Morgan
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582879
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582879
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548786
sourceCatalogNumber                    3912
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          85719
sourceTaxonName                        Aecidium ageratinae
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Pucciniaceae | Aecidium | Aecidium ageratinae
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Ageratina altissima
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1936-06-23T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Abbotsford
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582850
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582850
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548787
sourceCatalogNumber                    76012
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          85997
sourceTaxonName                        Aecidium chuquiraguae
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Pucciniaceae | Aecidium | Aecidium chuquiraguae
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Chuquiragua sp.
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1922-04-20T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Campos do Jordao
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582887
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582887
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548788
sourceCatalogNumber                    76113
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          102660
sourceTaxonName                        Arthuria catenulata
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Phakopsoraceae | Arthuria | Arthuria catenulata
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Croton? sp.
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1921-08-17T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Paineiras
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11583123
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11583123
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548789
sourceCatalogNumber                    24393
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          86400
sourceTaxonName                        Aecidium hornotinum
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Pucciniaceae | Aecidium | Aecidium hornotinum
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Meliosma affin. multiflora
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1935-03-26T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Mt. Sant Tomas
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582941
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582941
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548789
sourceCatalogNumber                    24393
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          86400
sourceTaxonName                        Aecidium hornotinum
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Pucciniaceae | Aecidium | Aecidium hornotinum
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002220
interactionTypeName                    adjacentTo
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        leaves
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1935-03-26T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Mt. Sant Tomas
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582941
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582941
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548790
sourceCatalogNumber                    20858
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          86468
sourceTaxonName                        Aecidium ivae
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Pucciniaceae | Aecidium | Aecidium ivae
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Iva frutescens
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1907-08-14T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Lewes
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582962
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582962
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11

argumentTypeId                         https://en.wiktionary.org/wiki/support
sourceOccurrenceId                     02-01000548791
sourceCatalogNumber                    19341
sourceCollectionCode                   DAOM
sourceCollectionId                     88a2a8f6-98e7-4a34-8d2e-238ff4cff80a
sourceInstitutionCode                  AAFC-AAC
sourceTaxonId                          86539
sourceTaxonName                        Aecidium leonense
sourceTaxonRank                        
sourceTaxonPathIds                     
sourceTaxonPath                        Fungi | Basidiomycota | Pucciniomycetes | Pucciniales | Pucciniaceae | Aecidium | Aecidium leonense
sourceTaxonPathNames                   kingdom | phylum | class | order | family | genus | species
sourceBodyPartId                       
sourceBodyPartName                     
sourceLifeStageId                      
sourceLifeStageName                    
sourceSexId                            
sourceSexName                          
interactionTypeId                      http://purl.obolibrary.org/obo/RO_0002454
interactionTypeName                    hasHost
targetOccurrenceId                     
targetCatalogNumber                    
targetCollectionCode                   
targetCollectionId                     
targetInstitutionCode                  
targetTaxonId                          
targetTaxonName                        Dioscorea sp
targetTaxonRank                        
targetTaxonPathIds                     
targetTaxonPath                        
targetTaxonPathNames                   
targetBodyPartId                       
targetBodyPartName                     
targetLifeStageId                      
targetLifeStageName                    
targetSexId                            
targetSexName                          
basisOfRecordId                        
basisOfRecordName                      PreservedSpecimen
http://rs.tdwg.org/dwc/terms/eventDate 1940-04-26T00:00:00Z
decimalLatitude                        
decimalLongitude                       
localityId                             
localityName                           Sembehun
referenceDoi                           
referenceUrl                           https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582967
referenceCitation                      https://www.mycoportal.org/portal/collections/individual/index.php?occid=11582967
namespace                              local
citation                               AAFC-AAC-DAOM DwC-Archive | Darwin Core Archive for Canadian National Mycological Herbarium
archiveURI                             https://www.mycoportal.org/portal/content/dwca/AAFC-AAC-DAOM_DwC-A.zip
lastSeenAt                             
contentHash                            
eltonVersion                           0.12.11
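
To see how basisOfRecordName is populated beyond the first few records, the values can be tallied over the whole tab-separated stream. This is a rough sketch under stated assumptions: `tally_basis_of_record` is a hypothetical helper, the input is assumed to be TSV with a header row containing a `basisOfRecordName` column (as in the output above), and the sample rows are made up for illustration.

```shell
# Hypothetical helper: tally basisOfRecordName values in a TSV stream
# that has a header row. Empty values are reported as "(missing)".
tally_basis_of_record() {
  awk -F'\t' '
    NR == 1 {
      # Locate the column by name rather than by position.
      for (i = 1; i <= NF; i++) if ($i == "basisOfRecordName") col = i
      if (!col) { print "no basisOfRecordName column" > "/dev/stderr"; exit 1 }
      next
    }
    { value = ($col == "") ? "(missing)" : $col; counts[value]++ }
    END { for (v in counts) printf "%s\t%d\n", v, counts[v] }
  ' | LC_ALL=C sort
}

# Made-up sample standing in for real interaction records.
printf 'sourceTaxonName\tbasisOfRecordName\nA\tPreservedSpecimen\nB\t\nC\tPreservedSpecimen\n' \
  | tally_basis_of_record
```

On a machine where the dataset has been fetched with `elton update`, piping `elton interactions` into `tally_basis_of_record` should give one line per distinct value with its count, which would show whether any records still lack a basis of record.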