PHI-base / data

Archives of PHI-base data releases, and other data.
Creative Commons Attribution 4.0 International

elton review notes - [PHI-base/data] has 36294 reviewer note(s): 18191 found [86] column definitions, but only [66] values: assuming undefined values are empty. #6

Closed jhpoelen closed 3 years ago

jhpoelen commented 3 years ago

Hey @martin2urban -

I just stumbled across a review report from GloBI's elton and found some notes -

from https://app.travis-ci.com/github/PHI-base/data/builds/235765480 -

  _____ _       ____ _____   _____            _                
  / ____| |     |  _ \_   _| |  __ \          (_)               
 | |  __| | ___ | |_) || |   | |__) |_____   ___  _____      __ 
 | | |_ | |/ _ \|  _ < | |   |  _  // _ \ \ / / |/ _ \ \ /\ / / 
 | |__| | | (_) | |_) || |_  | | \ \  __/\ V /| |  __/\ V  V /  
  \_____|_|\___/|____/_____| |_|  \_\___| \_/ |_|\___| \_/\_/   

 | |           |  ____| | |                                     
 | |__  _   _  | |__  | | |_ ___  _ __                          
 | '_ \| | | | |  __| | | __/ _ \| '_ \                         
 | |_) | |_| | | |____| | || (_) | | | |                        
 |_.__/ \__, | |______|_|\__\___/|_| |_|                        
         __/ |                                                  

        |___/                                                   

Miller 3.4.0

s3cmd version 2.1.0
openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)

elton not found... installing from [https://github.com/globalbioticinteractions/elton/releases/download/0.11.1/elton.jar]
elton version 0.11.1

Review of [local] started at [2021-08-19T13:48:53+00:00].

updating [local]... done.
creating review [local]... done.
listing interactions [local]... done.
listing taxa [local]... done.
listing nanopubs [local]... 

Review of [PHI-base/data] included:
  - 11984 interaction(s)
  - 36294 note(s)
  - 11984 info(s)

[PHI-base/data] has 36294 reviewer note(s):
  18191 found [86] column definitions, but only [66] values: assuming undefined values are empty.
   5756 target taxon name missing
    557 found malformed doi [wicker@cirad.fr]
    451 source taxon name missing
    334 found malformed doi [jinrong@purdue.edu]
    324 found malformed doi [theo.smits@zhaw.ch]
    252 found malformed doi [jan.leach@colostate.edu]
    201 found malformed doi [jplu@zju.edu.cn]
    138 found malformed doi [zhh.liu@ndsu.edu]
    118 found malformed doi [rkaur@cdfd.org.in]
    113 found malformed doi [shaw@usf.edu]
    109 found malformed doi [vlmiller@med.unc.edu]
    103 found malformed doi [jltang@gxu.edu.cn]
    102 found malformed doi [melotto@ucdavis.edu]
...

Any chance your schema changed?
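One way to check for this kind of schema drift before indexing is to compare the width of each data row with the number of columns the schema defines. Below is a minimal sketch, not elton's own check. The function name `column_count_mismatches`, the tab delimiter, and the sample data are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def column_count_mismatches(text, expected_columns, delimiter="\t"):
    """Count rows whose number of values differs from expected_columns.

    Returns a Counter mapping each unexpected row width to the number of
    rows that have that width. The delimiter is an assumption; adjust it
    to match the actual data file.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    mismatches = Counter()
    for row in reader:
        if len(row) != expected_columns:
            mismatches[len(row)] += 1
    return mismatches


# Hypothetical sample: the schema defines 3 columns, but two rows are short.
sample = "a\tb\tc\n1\t2\n4\t5\t6\n7\n"
print(dict(column_count_mismatches(sample, expected_columns=3)))
```

If every row has the same unexpected width (as with 66 values against 86 definitions), a changed schema is a more likely cause than individual bad rows.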

martin2urban commented 3 years ago

Hi @jhpoelen, thanks for the alert. I will review the malformed DOIs in September. I have accepted the recent pull request to allow GloBI to index PHI-base again. Please let me know if the issues continue; they may be caused by a change in the R scripts we use to generate the data. For GloBI, I believe the species numbers are currently static. We have a limit on the number of species we curate.

jhpoelen commented 3 years ago

@martin2urban Thanks for taking the time to review the GloBI index configuration changes.

The current review notes now look like:

Review of [PHI-base/data] included:
  - 18191 interaction(s)
  - 18441 note(s)
  - 18191 info(s)

[PHI-base/data] has 18441 reviewer note(s):
  18191 found [66] columns, but only [65] columns are defined: ignoring remaining undefined columns.
    249 found malformed doi [no data found]
      1 found malformed doi [10113/23538]

Looking forward to future versions of PHI-base, and I appreciate our collaboration.
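The remaining malformed-DOI notes could be screened on the data side with a simple pattern check. The sketch below is a common heuristic, not the validation elton performs. The name `looks_like_doi`, the regular expression, and the list of accepted prefixes are assumptions.

```python
import re

# Heuristic: a DOI starts with "10.", a 4-9 digit registrant code, a slash,
# and a non-empty suffix. This is an approximation, not a full DOI grammar.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(value):
    """Return True if value resembles a DOI, with or without a resolver prefix."""
    candidate = value.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if candidate.lower().startswith(prefix):
            candidate = candidate[len(prefix):]
            break
    return bool(DOI_PATTERN.match(candidate))


# Values taken from the review notes above, plus one well-formed example.
for value in ["no data found", "10113/23538", "wicker@cirad.fr", "10.1000/182"]:
    print(value, looks_like_doi(value))
```

Under this heuristic, `10113/23538` is rejected because it lacks the `10.` prefix, and the email addresses in the earlier report are rejected outright, which is consistent with values landing in the wrong column.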