globalbioticinteractions / prestonocene

Prestonocene: DwC-Archive Interaction Indexing
GNU General Public License v3.0

suggest to manually review interaction records extracted for tick and bee families (Ixodidae and Apidae, respectively) #5

Open jhpoelen opened 2 years ago

jhpoelen commented 2 years ago

@seltmann suggested manually reviewing interaction records extracted for the families Ixodidae and Apidae to get a sense of how many interaction types are mapped to a catch-all like interactsWith even though the records explicitly express more specific information about the kind of interaction.

jhpoelen commented 2 years ago

Suggest limiting the interactions to those that include these families.

Then, randomly select ~100 known records and manually inspect the results.

Random selection would help reduce selection bias.

jhpoelen commented 2 years ago

Note that the GNU coreutils tool "shuf" (not part of POSIX) might be of use when creating random subsamples.

NAME
shuf - generate random permutations

SYNOPSIS
shuf [OPTION]... [FILE]
shuf -e [OPTION]... [ARG]...
shuf -i LO-HI [OPTION]...

DESCRIPTION
Write a random permutation of the input lines to standard output.

With no FILE, or when FILE is -, read standard input.

Mandatory arguments to long options are mandatory for short options
too. 
[...]
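
A minimal sketch of the three invocation forms in the synopsis above, assuming GNU coreutils `shuf` is installed. The file name `records.tsv` and its contents are made up for illustration and are not part of the thread's data.

```shell
# Hypothetical input: 1000 numbered lines standing in for interaction records.
workdir=$(mktemp -d)
seq 1 1000 | sed 's/^/record-/' > "$workdir/records.tsv"

# Form 1: draw 100 lines from a file, without replacement.
shuf -n 100 "$workdir/records.tsv" > "$workdir/sample.tsv"

# Form 2: permute the arguments given on the command line.
shuf -e Ixodidae Apidae

# Form 3: draw 5 distinct integers from the range 1-1000.
shuf -i 1-1000 -n 5

# Sanity checks: number of sampled lines, and number of distinct sampled lines.
wc -l < "$workdir/sample.tsv"
sort -u "$workdir/sample.tsv" | wc -l
```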

zedomel commented 2 years ago

Ixodidae

cat reviews.txt.bz2 | bunzip2 | grep context | cut -f15 | jq -r '.context | try select(.sourceTaxonPath | test("Ixodidae"; "g")) | [.archiveURI,."http://rs.tdwg.org/dwc/terms/associatedTaxa" // "", ."http://rs.tdwg.org/dwc/terms/family" // "", .resourceTypes // "", .interactionTypeName // "", .interactionTypeNameVerbatim // ""] | @tsv' | cut -f5,6 | sort | uniq -c | sort -nr

   8282 hasHost    hasHost
   4168 hasHost Host
   2796 interactsWith   
   1444 parasiteOf  (parasite of)
   1442 (parasite of)   
   1313 hasHost host
    297 hasHost ex
    261 has host    
    208 Huésped 
    208 Ectoparásito    
    198 Ectoparasito de 
    125 adjacentTo  on
     99 Parásito de 
     75 from    
     70 adjacentTo  found on
     66 hasHost HOST
     28 (collected with)    
     18 sex 
     13 From    
     10 adjacentTo  On
      4 associates with 
      4 adjacentTo  Found on
      3 Host of 
      3 hasHost Ex
      3 Aracnido de 
      2 interactsWith   Associated with
      2 found in association with   
      1 wandoo woodland. Ex 
      1 open woodland near ocean. Ex    
      1 MV  
      1 monkey  
      1 in wandoo woodland. Ex  
      1 interactsWith   associated with
      1 hasHost EX
      1 ectoparasiteOf  Ectoparasite of

Apidae

cat reviews.txt.bz2 | bunzip2 | grep context | cut -f15 | jq -r '.context | try select(.sourceTaxonPath | test("Apidae"; "g")) | [.archiveURI,."http://rs.tdwg.org/dwc/terms/associatedTaxa" // "", ."http://rs.tdwg.org/dwc/terms/family" // "", .resourceTypes // "", .interactionTypeName // "", .interactionTypeNameVerbatim // ""] | @tsv' | cut -f5,6 | sort | uniq -c | sort -nr

  86461 interactsWith 
  21455 interactsWith   associated with
  18819 Collected from  
   7770 foraging on 
   5073 hasHost host
   3035 Visits flowers of   
   2474 Visitante floral    
   1510 Hospedero   
   1300 Visitante floral de 
   1194 Huesped 
    596 Mutualismo  
    510 adjacentTo  on
    476 hasHost ex
    422 adjacentTo  On
    235     
    137 Nests in    
    131 Parasite    
    120 (collected with)    
    110 has food plant  
    107 visited flower of   
    100 Recurso floral  
     89 hasHost Host
     56 flower  
     43 from    
     24 En  
     18 Aliméntandose de    
     16 visiting    
     12 HostSpecies 
     12 has host    
     10 PreviousCrop    
     10 or when the temperature exceeded 15°C. Peak visits occurred between 8   
     10 interactsWith   interactsWith
      9 Consume flor de 
      8 Huésped 
      7 visitsFlowersOf visitsFlowersOf
      5 Visiting    
      5 in relation with    
      5 floral host 
      5 Consume polen de    
      5 Consume néctar de   
      4 (same lot as)   
      4 Rep 
      4 From    
      4 adjacentTo  Found on
      3 resting on  
      3 Ataca la madera de  
      2 <p>The pollen collecting behavior and foraging activity of this solitary bee on <em>Salvia bogotensis </em>was studied by Gonzalez et al. (2006) in the Eastern Andes of Colombia. Bees foraged from 7  
      2 (host of)   
      2 host of 
      2 hasParasite (host of)
      2 hasHost Ex
      2 flower color    
      2 feeding on  
      2 collected from  
      2 adjacentTo  Collected on
      2 adjacentTo  collected on
      1 under bark  
      1 prairie restoration 
      1 Polinizador 
      1 Peri-Urban  
      1 Parasitado por  
      1 nectar plant    
      1 Hospedante  
      1 Feeding on  
      1 Consume néctar y polen de   
      1 Busy bees in the almost by the hundreds  collecting pollen from a large variety of blooms from 8    
      1 adjacentTo  found on
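
Counts like these can be scanned for the cases this issue is about: rows mapped to the catch-all interactsWith whose verbatim name says something more specific. The following is a rough sketch only. It assumes the `uniq -c` layout (right-aligned count, a space, the mapped name, a tab, the verbatim name), and it uses just five rows copied from the Apidae output above as sample input, so its totals cover those five rows and not the full table. The helper names `sample_counts` and `flag_catch_all` are hypothetical.

```shell
# Five sample rows copied from the Apidae counts above (not the full table).
sample_counts() {
  printf '  86461 interactsWith\t\n'
  printf '  21455 interactsWith\tassociated with\n'
  printf '   5073 hasHost\thost\n'
  printf '     10 interactsWith\tinteractsWith\n'
  printf '      7 visitsFlowersOf\tvisitsFlowersOf\n'
}

# List catch-all rows that carry a different, non-empty verbatim name,
# then report how many of the sampled records map to the catch-all.
flag_catch_all() {
  awk -F'\t' '
  {
    head = $1
    sub(/^ +/, "", head)                       # drop the padding added by uniq -c
    count = head;  sub(/ .*/, "", count)       # leading number
    mapped = head; sub(/^[0-9]+ /, "", mapped) # mapped interaction type name
    verbatim = $2
    total += count
    if (mapped == "interactsWith") {
      catchall += count
      if (verbatim != "" && verbatim != mapped) print count "\t" verbatim
    }
  }
  END { printf "catch-all rows: %d of %d sampled records\n", catchall, total }'
}

report=$(sample_counts | flag_catch_all)
printf '%s\n' "$report"
```

In the pipelines above, the same filter could in principle be appended after the final `sort -nr`, provided the tab between the two name columns is preserved.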

Random

cat reviews.txt.bz2 | bunzip2 | grep context | cut -f15 | shuf -n 100 | jq -r '.context | try select(.sourceTaxonPath | test("Apidae"; "g")) | [.archiveURI,."http://rs.tdwg.org/dwc/terms/associatedTaxa" // "", ."http://rs.tdwg.org/dwc/terms/family" // "", .resourceTypes // "", .interactionTypeName // "", .interactionTypeNameVerbatim // ""] | @tsv' | cut -f5,6 | sort | uniq -c | sort -nr
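
One thing to note about the command above: `shuf -n 100` runs before the `jq` family filter, so the 100 sampled lines are drawn from all records and only those that happen to match Apidae survive the filter. A minimal sketch of the difference, using made-up records and `grep` as a stand-in for the `jq` `select(...)` step (both are assumptions for illustration, not the thread's data):

```shell
# Hypothetical records: every tenth line is tagged Apidae, the rest Ixodidae.
records=$(seq 1 5000 | awk '{ print "record-" $1 "\t" ($1 % 10 == 0 ? "Apidae" : "Ixodidae") }')

# Sample first, then filter: at most 100 Apidae lines remain, usually far fewer.
sample_then_filter=$(printf '%s\n' "$records" | shuf -n 100 | grep -c Apidae || true)

# Filter first, then sample: exactly 100 Apidae lines (500 are available).
filter_then_sample=$(printf '%s\n' "$records" | grep Apidae | shuf -n 100 | wc -l)

echo "sample then filter: $sample_then_filter Apidae records"
echo "filter then sample: $filter_then_sample Apidae records"
```

To get ~100 records per family for manual review, the equivalent change in the pipeline above would be to move `shuf -n 100` to just after the `jq` step, so that it samples the already-filtered TSV rows.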