biocompute-objects / BCO_Documentation

Repository for documentation to support the IEEE 2791-2020 standard. Please see our home page for communications/publications:
http://biocomputeobject.org/
BSD 3-Clause "New" or "Revised" License

Fix inconsistencies in HCV1a.json example #32

Closed (closed by stain 6 years ago)

stain commented 6 years ago

The example HCV1a.json includes some keys not defined elsewhere (uri, sha1_chksum):

            {
                "name": "HIVE-heptagon", 
                "version": "albinoni.2",
                "uri": {
                    "address": "https://hive.biochemistry.gwu.edu/dna.cgi?cmd=dna-heptagon&cmdMode=-",
                    "access_time": "2017-01-24T09:40:17-0500",
                    "sha1_chksum": null
                }

While the BCO specification does permit arbitrary keys for software_prerequisites, the example should only use keys defined in the spec.

One error_domain entry is listed twice, once with spaces and once with underscores (and with different values!):

     "false positive mutation calls discovery": "<0.0005", 
     "false_positive_mutation_calls_discovery": "<0.00005", 
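
A check along these lines could catch such near-duplicate keys before an example is published. This is only a minimal sketch: the helper name `find_colliding_keys` is hypothetical, and it assumes the error_domain entries can be read as a flat mapping of name to value.

```python
from collections import defaultdict


def find_colliding_keys(mapping):
    """Group keys that become identical once spaces and underscores are unified.

    Hypothetical helper, not part of any BCO tooling.
    """
    groups = defaultdict(list)
    for key in mapping:
        # Treat underscores as spaces, collapse whitespace, ignore case.
        normalized = "_".join(key.replace("_", " ").split()).lower()
        groups[normalized].append(key)
    # Keep only the normalized names shared by more than one original key.
    return {norm: keys for norm, keys in groups.items() if len(keys) > 1}


# The two entries quoted above, assumed to sit in one flat mapping.
error_domain = {
    "false positive mutation calls discovery": "<0.0005",
    "false_positive_mutation_calls_discovery": "<0.00005",
}
collisions = find_colliding_keys(error_domain)
```

Any non-empty result points at keys that should be merged into one spelling with one agreed value.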

Access to FTP is requested without a hostname, but this behavior is not defined in domain_prerequisites:

            {
                "name": "access to ftp", 
                "url": "ftp://:22/"
            },
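
A missing hostname like this is easy to detect mechanically. The sketch below uses Python's standard `urllib.parse.urlsplit`; the helper name `missing_hostname` and the flat list of prerequisite objects are assumptions made for illustration, not something the spec defines.

```python
from urllib.parse import urlsplit


def missing_hostname(url):
    """Return True when the URL carries no hostname component."""
    return urlsplit(url).hostname is None


# Two prerequisite entries quoted from HCV1a.json in this issue.
prerequisites = [
    {"name": "access to ftp", "url": "ftp://:22/"},
    {"name": "HIVE", "url": "https://hive.biochemistry.gwu.edu/dna.cgi?cmd=login"},
]
flagged = [p["name"] for p in prerequisites if missing_hostname(p["url"])]
```

This only checks syntax; it says nothing about whether a hostname that is present actually resolves.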

Similarly, this abstract entry should be removed, since a "concrete" example does not need access to a protocol literally named protocol:

                "name": "generic name",
                "url": "protocol://domain:port/application/path"
            }

Access to HIVE should presumably extend beyond the login page:

            {
                "name": "HIVE", 
                "url": "https://hive.biochemistry.gwu.edu/dna.cgi?cmd=login"
            },

so the URL here should be truncated at the first / after the hostname.

The script_access_type is text, yet a URI is provided for script:

        "script_access_type": "text",
        "script": ["https://example.com/workflows/antiviral_resistance_detection_hive.py"],
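
This mismatch could also be flagged automatically. The following is a rough sketch that assumes "text" means the script is embedded inline (which is how this issue reads the field) and uses hypothetical helper names; it is not taken from the spec.

```python
from urllib.parse import urlsplit


def looks_like_uri(value):
    """Heuristic: a string with both a scheme and a network location."""
    parts = urlsplit(value)
    return bool(parts.scheme and parts.netloc)


def script_type_mismatches(execution_domain):
    """List script entries that look like URIs although the type is "text"."""
    if execution_domain.get("script_access_type") != "text":
        return []
    return [s for s in execution_domain.get("script", []) if looks_like_uri(s)]


# The fragment quoted above.
fragment = {
    "script_access_type": "text",
    "script": ["https://example.com/workflows/antiviral_resistance_detection_hive.py"],
}
mismatches = script_type_mismatches(fragment)
```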

The script_driver value "manual" is undefined:

"script_driver": "manual",

The input/output URI examples use the invalid hostname hive.biochemistry.gwu.edudata. These should either be neutral, using http://example.com/, or actually work.

 "input_list": [
                        {
                            "address": "https://hive.biochemistry.gwu.edudata/514769/dnaAccessionBased.csv",
                            "access_time": "2017-01-24T09:40:17-0500"
                        }
                    ],

Some of the Sequence Ontology examples are missing the SO: prefix and thus do not resolve under http://identifiers.org/so/ when expanded according to the external references expansion rules.

 "structured_name": "HCV1a [taxonomy:31646] ledipasvir 
       [pubchem.compound:67505836] resistance SNP 
       [so:0000694] detection",

"name": "Sequence Ontology",
"ids": ["0000048"], 

  "usability_domain": [
        "Identify baseline single nucleotide polymorphisms SNPs [SO:0000694], insertions [so:SO:0000667], and deletions [so:SO:0000045] that correlate with reduced ledipasvir [pubchem.compound:67505836] antiviral drug efficacy in Hepatitis C virus subtype 1 [taxonomy:31646]", 
],
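
To make the inconsistency concrete, here is a small sketch that scans text for bracketed references and flags Sequence Ontology ones whose identifier lacks the SO: part. The expansion rule `http://identifiers.org/{prefix}/{id}`, the case-insensitive treatment of the `so` prefix, and the function names are all assumptions made for illustration.

```python
import re

# Matches [prefix:identifier]; the prefix may not contain a colon.
REFERENCE = re.compile(r"\[([A-Za-z0-9_.]+):([^\]\s]+)\]")


def expand(prefix, identifier):
    """Assumed expansion of a bracketed reference to an identifiers.org URL."""
    return f"http://identifiers.org/{prefix}/{identifier}"


def check_so_references(text):
    """Return Sequence Ontology references whose identifier lacks 'SO:'."""
    problems = []
    for prefix, identifier in REFERENCE.findall(text):
        if prefix.lower() == "so" and not identifier.startswith("SO:"):
            problems.append(f"[{prefix}:{identifier}]")
    return problems


# Shortened from the usability_domain text quoted above.
usability = (
    "SNPs [SO:0000694], insertions [so:SO:0000667], and deletions "
    "[so:SO:0000045] ledipasvir [pubchem.compound:67505836] [taxonomy:31646]"
)
problems = check_so_references(usability)
```

Under these assumptions, `[so:SO:0000667]` expands to a URL ending in `so/SO:0000667`, while `[so:0000694]` from the structured_name would be flagged.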
HadleyKing commented 6 years ago