Closed by jsheunis 1 month ago
Demo:
Example content of dataset_metadata.jsonl:
{ "type": "dataset", "dataset_id": "1234", "dataset_version": "latest", "name": "Demo", "description": "This is a dataset description", "authors": [ { "name": "Stephan Heunis" }, { "name": "Michael Hanke" } ], "keywords": [ "minimal", "example", "catalog", "from", "metadata" ], "subdatasets": [ { "dataset_id": "5678", "dataset_version": "latest", "dataset_path": "mysubdataset" } ], "top_display": [ { "name": "Storage", "value": "7PB" }, { "name": "Source", "value": "Open" } ] }
{ "type": "dataset", "dataset_id": "5678", "dataset_version": "latest", "name": "Demo subdataset", "description": "This is a SUBdataset description", "authors": [ { "name": "Stephan Heunis 2" }, { "name": "Michael Hanke 2" } ], "keywords": [ "subdubdub"] }
Example content of file_metadata.jsonl:
{ "type": "file", "dataset_id": "1234", "dataset_version": "latest", "path": "myfile.txt", "contentbytesize": 12345, "url": "https://github.com/"}
{ "type": "file", "dataset_id": "1234", "dataset_version": "latest", "path": "subdir/my2ndfile.txt", "contentbytesize": 99345, "url": "https://github.com/"}
{ "type": "file", "dataset_id": "5678", "dataset_version": "latest", "path": "mysubdatasetfile.txt", "contentbytesize": 666666, "url": "https://github.com/"}
{ "type": "file", "dataset_id": "5678", "dataset_version": "latest", "path": "subbydirry/fubar.txt", "contentbytesize": 11111111, "url": "https://github.com/"}
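Both files are JSON Lines: one JSON object per line. For anyone generating them from a script, here is a minimal sketch using only the Python standard library. The helper name write_jsonl and the trimmed records are illustrative assumptions, not part of datalad-catalog.

```python
import json
from pathlib import Path


def write_jsonl(path, records):
    """Write one JSON object per line (JSON Lines)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


# Trimmed versions of the demo records shown above.
dataset_records = [
    {
        "type": "dataset",
        "dataset_id": "1234",
        "dataset_version": "latest",
        "name": "Demo",
        "subdatasets": [
            {
                "dataset_id": "5678",
                "dataset_version": "latest",
                "dataset_path": "mysubdataset",
            }
        ],
    },
    {
        "type": "dataset",
        "dataset_id": "5678",
        "dataset_version": "latest",
        "name": "Demo subdataset",
    },
]

file_records = [
    {
        "type": "file",
        "dataset_id": "1234",
        "dataset_version": "latest",
        "path": "myfile.txt",
        "contentbytesize": 12345,
        "url": "https://github.com/",
    },
]
```

Calling write_jsonl("dataset_metadata.jsonl", dataset_records) and write_jsonl("file_metadata.jsonl", file_records) would produce files in the same shape as the examples above.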
Run the commands:
> datalad catalog create -c Desktop/mycatalog
catalog_create(ok): Desktop/mycatalog [Catalog successfully created at: Desktop/mycatalog]
> datalad catalog add -c Desktop/mycatalog -m Desktop/dataset_metadata.jsonl
catalog_add(ok): Desktop/mycatalog [Metadata items successfully added to catalog]
> datalad catalog add -c Desktop/mycatalog -m Desktop/file_metadata.jsonl
catalog_add(ok): Desktop/mycatalog [Metadata items successfully added to catalog]
> datalad catalog set-super -c Desktop/mycatalog -i 1234 -v latest
catalog_set_super(ok): /Users/jsheunis [Superdataset successfully set for catalog]
> datalad catalog serve -c Desktop/mycatalog
...
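The same sequence could be driven from Python. The sketch below only mirrors the subcommands and flags exactly as they appear in this demo (they may differ in other versions of datalad-catalog); the function names catalog_commands and run_all are hypothetical. The serve step is left out because it keeps running until interrupted.

```python
import subprocess


def catalog_commands(catalog_dir, dataset_meta, file_meta, super_id, super_version):
    """Return the demo's command sequence as argument lists, in order."""
    base = ["datalad", "catalog"]
    return [
        base + ["create", "-c", catalog_dir],
        base + ["add", "-c", catalog_dir, "-m", dataset_meta],
        base + ["add", "-c", catalog_dir, "-m", file_meta],
        base + ["set-super", "-c", catalog_dir, "-i", super_id, "-v", super_version],
    ]


def run_all(commands):
    """Run each command in turn; raise CalledProcessError on the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

For the demo above, the arguments would be "Desktop/mycatalog", "Desktop/dataset_metadata.jsonl", "Desktop/file_metadata.jsonl", "1234" and "latest".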
Resulting catalog:
https://github.com/datalad/datalad-catalog/assets/10141237/5d7df906-8b10-4eb8-a964-372c3e0bfd12
Comments:
path is path of file relative to parent dataset
We should clarify which path convention this has to follow. I assume POSIX.
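If POSIX is indeed the expected convention (an assumption here, not something confirmed in this thread), paths collected on Windows would need normalizing before they go into the file metadata. A minimal sketch with pathlib; the helper name to_posix_relative is hypothetical:

```python
from pathlib import PureWindowsPath


def to_posix_relative(path):
    """Return a dataset-relative path written with forward slashes.

    Assumption: a backslash is a separator, never part of a file name.
    PureWindowsPath accepts both "/" and "\\" as separators, so it can
    parse either style without touching the file system.
    """
    p = PureWindowsPath(path)
    if p.anchor:
        # A drive or a leading slash means the path is not relative.
        raise ValueError(f"expected a relative path, got {path!r}")
    if ".." in p.parts:
        raise ValueError(f"path must stay inside the dataset: {path!r}")
    return p.as_posix()
```

With this sketch, "subdir\\my2ndfile.txt" and "subdir/my2ndfile.txt" both come out as "subdir/my2ndfile.txt", while an absolute path raises ValueError.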
I can confirm that this works for me.
This issue served its purpose a while back already. Compared to the current state in main, the commands are now outdated. Closing.
Create the catalog:
datalad catalog create -c <path-to-catalog-dir>
Dataset metadata fields:
type, dataset_id, dataset_version, name
type is dataset
File metadata fields:
type, dataset_id, dataset_version, path
type is file
path is path of file relative to parent dataset
Add the dataset metadata, then the file metadata:
datalad catalog add -c <path-to-catalog-dir> -m <path-to-dataset-metadata-file>
datalad catalog add -c <path-to-catalog-dir> -m <path-to-file-metadata-file>
Set the superdataset:
datalad catalog set-super -c <path-to-catalog-dir> -i <id-of-super> -v <version-of-super>
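The fields named in this description (type, dataset_id, dataset_version, name for dataset items; type, dataset_id, dataset_version, path for file items) can be checked before running datalad catalog add. The following is a rough pre-flight sketch, not the catalog's own schema validation; the names FIELDS_BY_TYPE, missing_fields and check_jsonl are made up for illustration.

```python
import json

# Fields listed in this issue for each item type.
FIELDS_BY_TYPE = {
    "dataset": ("type", "dataset_id", "dataset_version", "name"),
    "file": ("type", "dataset_id", "dataset_version", "path"),
}


def missing_fields(record):
    """Return the listed fields that are absent from one metadata record."""
    expected = FIELDS_BY_TYPE.get(record.get("type"))
    if expected is None:
        # "type" is missing or is neither "dataset" nor "file".
        return ["type"]
    return [key for key in expected if key not in record]


def check_jsonl(lines):
    """Map 1-based line number to missing fields, for problem lines only."""
    problems = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        missing = missing_fields(json.loads(line))
        if missing:
            problems[number] = missing
    return problems
```

Passing an open file object works as well as a list of strings, for example check_jsonl(open("file_metadata.jsonl", encoding="utf-8")); an empty result means every line carries the listed fields.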