Open fititnt opened 2 years ago
Hmm... Okay. We can create at least one global-level "profile": "data-package-catalog" (maybe also another for the first sub-level) and put as little information in it as possible so the file does not become overly large, since each individual "profile": "tabular-data-package" (the ones that actually describe the fields) is huge, with over 200 columns of data.
The simplest case of the top level can look like this:
{
  "profile": "data-package-catalog",
  "name": "1603",
  "resources": [
    { "format": "json", "name": "1603_1_1", "path": "1603/1/1/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_1_6", "path": "1603/1/6/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_1_7", "path": "1603/1/7/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_1_51", "path": "1603/1/51/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_1_99", "path": "1603/1/99/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_1_101", "path": "1603/1/101/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_1_2020", "path": "1603/1/2020/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_1_8000", "path": "1603/1/8000/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_25_1", "path": "1603/25/1/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_44_86", "path": "1603/44/86/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_44_101", "path": "1603/44/101/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_44_111", "path": "1603/44/111/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_45_1", "path": "1603/45/1/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_45_19", "path": "1603/45/19/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_45_31", "path": "1603/45/31/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_63_101", "path": "1603/63/101/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_64_41", "path": "1603/64/41/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_64_604", "path": "1603/64/604/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_84_1", "path": "1603/84/1/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_99_876", "path": "1603/99/876/datapackage.json", "profile": "tabular-data-package" },
    { "format": "json", "name": "1603_99_987", "path": "1603/99/987/datapackage.json", "profile": "tabular-data-package" }
  ]
}
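Since every entry in the catalog above follows the same pattern (the name with underscores maps to a directory path), a descriptor like this could be generated rather than written by hand. The following is only a minimal sketch using the Python standard library; the helper name `build_catalog` and the idea of passing the codes as a list are assumptions for illustration, not how the repository actually builds the file.

```python
import json


def build_catalog(name, codes):
    """Build a minimal catalog descriptor (hypothetical helper).

    `codes` are resource names such as "1603_1_1"; each one is mapped to
    the path "1603/1/1/datapackage.json", mirroring the example above.
    """
    resources = []
    for code in codes:
        resources.append({
            "format": "json",
            "name": code,
            "path": code.replace("_", "/") + "/datapackage.json",
            "profile": "tabular-data-package",
        })
    return {
        "profile": "data-package-catalog",
        "name": name,
        "resources": resources,
    }


# Example with two of the codes listed above
catalog = build_catalog("1603", ["1603_1_1", "1603_63_101"])
print(json.dumps(catalog, indent=2))
```

Keeping only `format`, `name`, `path` and `profile` per entry matches the goal of putting as little information as possible in the top-level file.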
Perfect!
Now we have an MVP, both to run datapackage on a specific, focused group of dictionaries AND for a "profile": "data-package-catalog".
It's very rudimentary, but the focused ones more or less validate, like this:
fititnt@bravo:/workspace/git/EticaAI/multilingual-lexicography-automation/officinam$ frictionless validate 1603/63/101/datapackage.json
# -----
# valid: 1603_63_101.no1.tm.hxl.csv
# -----
# -----
# valid: 1603_63_101.no11.tm.hxl.csv
# -----
# -----
# valid: 1603_63_101.wikiq.tm.hxl.csv
# -----
However, not surprisingly, the global one fails: it complains that we are referring to datapackages that do not exist on disk (which is true; we need to rebuild the entire library again).
fititnt@bravo:/workspace/git/EticaAI/multilingual-lexicography-automation/officinam$ frictionless validate datapackage.json
# -------
# invalid: datapackage.json
# -------
============ ==================================================================================================================
code         message
============ ==================================================================================================================
scheme-error The data source could not be successfully loaded: [Errno 2] No such file or directory: '1603/1/1/datapackage.json'
============ ==================================================================================================================
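The error above is about resource paths that are not on disk yet. Before running `frictionless validate` on the global file, a quick check like the one below could list which referenced paths are missing. This is a rough sketch with the standard library only; `missing_resource_paths` is a hypothetical helper, and the temporary directory merely simulates a partially built library.

```python
import json
import tempfile
from pathlib import Path


def missing_resource_paths(catalog_path):
    """Return the resource paths of a catalog that do not exist on disk.

    Paths are resolved relative to the directory holding the catalog file.
    (Hypothetical helper, not part of frictionless.)
    """
    catalog_path = Path(catalog_path)
    descriptor = json.loads(catalog_path.read_text(encoding="utf-8"))
    base = catalog_path.parent
    return [
        resource["path"]
        for resource in descriptor.get("resources", [])
        if not (base / resource["path"]).is_file()
    ]


# Simulate a library where only 1603/63/101 has been rebuilt so far
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "1603/63/101").mkdir(parents=True)
    (root / "1603/63/101/datapackage.json").write_text("{}", encoding="utf-8")
    demo_catalog = {
        "profile": "data-package-catalog",
        "name": "1603",
        "resources": [
            {"name": "1603_1_1", "path": "1603/1/1/datapackage.json"},
            {"name": "1603_63_101", "path": "1603/63/101/datapackage.json"},
        ],
    }
    (root / "datapackage.json").write_text(json.dumps(demo_catalog), encoding="utf-8")
    missing = missing_resource_paths(root / "datapackage.json")

print(missing)
```

An empty list would mean every referenced datapackage exists, so any remaining validation failure would have a different cause.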
As the title says, let's do a minimal viable product.