biocompute-objects / BCO_Documentation

Repository for documentation to support the IEEE 2791-2020 standard. Please see our home page for communications/publications:
http://biocomputeobject.org/
BSD 3-Clause "New" or "Revised" License

Allow relative URI in input_list/output_list "address" #23

Closed: stain closed this issue 5 years ago

stain commented 6 years ago

input_list (and output_list) uses keys address and access_time which are not explained.

The text says:

...expressed as a URN or URL

However, Joseph Nooraga comments:

Needs clarity. Is this indicating that all data being used needs to be addressable via HTTP, and should remain so for the life of the BCO?

@rajamazumder responded:

"or a unique location in a file system"? I forgot what the discussion around this was.

I think we do need to permit relative references here, see for example Dataset_BCO_example that uses relative URIs:

"input_list":[
  "human_protein_position_pmid_id_aminoacid_glytoucan_2018_09_04_07_51_27.txt"
],

To avoid BCO parsers having to second-guess whether h:/file.txt is a URI or a file location, we should say that this must be an absolute URI or a relative URI reference. If that always holds, it means, for instance, that spaces in filenames are always URI-escaped and that path separators are always forward slashes (/):

```json
"input_list":[
   "nested%20folder/file_with_50%25_percent.txt"
],
```
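
As a minimal sketch of the escaping rule proposed here, the following Python (standard library only) turns a local relative file name into a relative URI reference, and shows why an unescaped value such as h:/file.txt is ambiguous. The helper name `to_relative_uri_reference` is hypothetical and not part of any BCO tooling; treating the input as a Windows-style path is an assumption made for illustration.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote, urlsplit


def to_relative_uri_reference(filename: str) -> str:
    """Percent-encode a relative file name (hypothetical helper, not BCO tooling).

    Assumes a Windows-style relative path; backslashes become forward slashes.
    """
    path = PureWindowsPath(filename)
    if path.is_absolute() or path.drive:
        raise ValueError("expected a relative file name")
    # Keep "/" as the separator; percent-encode spaces, "%" and other
    # characters that are not unreserved in a URI.
    return quote(path.as_posix(), safe="/")


encoded = to_relative_uri_reference(r"nested folder\file_with_50%_percent.txt")
print(encoded)

# A generic URI parser reads the drive letter as a URI scheme,
# which is the ambiguity a BCO parser would otherwise have to guess at.
ambiguous_scheme = urlsplit("h:/file.txt").scheme
print(ambiguous_scheme)
```

With this rule, a parser never needs to guess: a value without a scheme is a relative URI reference, and anything with a scheme is an absolute URI.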

It must be made clear that relative URIs are resolved relative to the location of the BCO JSON file, and that file names must be assumed to be case-sensitive.

If this is found in D:\Submissions\bco15\bco.json, then this would mean the file D:\Submissions\bco15\nested folder\file_with_50%_percent.txt, or file:///D:/Submissions/bco15/nested%20folder/file_with_50%25_percent.txt as an absolute (but local) file: URI.
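
The resolution described here can be sketched in Python with the standard library alone. This is an illustration under stated assumptions, not a reference implementation: the base URI is simply the file: form of the example BCO location, and the mapping back to a Windows path is done by hand so the snippet behaves the same on any platform.

```python
from pathlib import PureWindowsPath
from urllib.parse import unquote, urljoin, urlsplit

# file: URI of the BCO JSON file from the example (assumed base for resolution).
bco_uri = "file:///D:/Submissions/bco15/bco.json"

# Resolve the relative reference from input_list against the BCO's own URI.
reference = "nested%20folder/file_with_50%25_percent.txt"
absolute_uri = urljoin(bco_uri, reference)
print(absolute_uri)

# Map the file: URI back to a Windows path: decode the percent-escapes,
# drop the leading "/" before the drive letter, and switch to backslashes.
decoded_path = unquote(urlsplit(absolute_uri).path).lstrip("/")
local_path = str(PureWindowsPath(decoded_path))
print(local_path)
```

Resolving against the BCO's own URI means the same relative reference keeps working when the whole folder is moved or served over HTTP, since only the base changes.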

This issue relates to the packaging and distribution of BCOs, which is currently undefined.

stain commented 6 years ago

BTW, see my arcp paper on how to make absolute and globally unique URIs for paths within an 'archive'.