CODAIT / exchange-metadata-converter

Basic conversion utility for YAML-based metadata descriptors
Apache License 2.0

Determine way to flatten ORSD archive metadata & check to see if any other DAX archives have similarly complex nested structures #3

Open edwardleardi opened 4 years ago

edwardleardi commented 4 years ago

The current proposal is to release a new version of ORSD that no longer has nested archives. We can then use a flat structure like:

content:
  - file_name: data/SPE9-TRIANGLE.Aspect1/test
    ...
  - file_name: data/SPE9-TRIANGLE.Aspect1/train
    ...
  - file_name: data/SPE9-TRIANGLE.Aspect2/json_test
    ...
  - file_name: data/SPE9-TRIANGLE.Aspect2/json_train
    ...
  - file_name: data/SPE9-TRIANGLE.Aspect3.compressed.h5
    ...
  - file_name: data/SPE9-MAX.Aspect1
    ...
  - file_name: data/SPE9-MAX.Aspect2
    ...
  - file_name: data/SPE9-MAX.Aspect3.compressed.h5
    ...

Keep in mind that the archive-level description field for the dataset will then need to describe the content composition, e.g. "...contains two versions of the dataset, SPE9-TRIANGLE which... and SPE9-MAX which..."
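
For illustration only, here is a rough Python sketch of how a nested descriptor could be collapsed into the flat `content` list proposed above. It assumes a hypothetical layout in which a nested archive or directory is an entry carrying its own `content` list; that nesting key, the `flatten_content` helper, and the sample data are assumptions for the sketch, not the converter's actual schema or API.

```python
from posixpath import join


def flatten_content(entries, prefix=""):
    """Return a flat list of entries whose file_name carries the full path.

    Assumption: a nested archive/directory is an entry with its own
    "content" list. This layout is hypothetical, not the real schema.
    """
    flat = []
    for entry in entries:
        path = join(prefix, entry["file_name"])
        children = entry.get("content")
        if children:
            # Container entry: recurse into it and drop the container itself.
            flat.extend(flatten_content(children, path))
        else:
            # Leaf entry: keep every other field, rewrite only file_name.
            flat.append({**entry, "file_name": path})
    return flat


# Hypothetical nested input, loosely modelled on the ORSD names in this issue.
nested = [
    {
        "file_name": "data",
        "content": [
            {
                "file_name": "SPE9-TRIANGLE.Aspect1",
                "content": [{"file_name": "test"}, {"file_name": "train"}],
            },
            {"file_name": "SPE9-MAX.Aspect1"},
        ],
    }
]

flat = flatten_content(nested)
for item in flat:
    print(item["file_name"])
```

Note that container entries are dropped in this sketch, so any description attached to them would be lost. That is one reason the archive-level description has to carry the composition details.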

MLnick commented 4 years ago

I'd agree that an updated version without the nested archives, using subdirectory structures instead, would be best.