Closed eocarragain closed 5 years ago
That will be an error in the examples - let me know which ones and I'll fix.
On Thu, 17 Jan 2019 at 21:34, Eoghan Ó Carragáin notifications@github.com wrote:
Hi @ptsefton https://github.com/ptsefton Just looking through some of the example datasets and was surprised to see the CATALOG_files in the data directory.
I'd have expected this to be kept outside the payload, like CATALOG.json/html and the /metadata directory. It would be nice if the payload directory represented the data as arranged by the researcher, with all the DataCrate and BagIt artefacts contained in the wrapper/root directory. I'd even go as far as making CATALOG_files a hidden directory, like the .git folder in the root of a git repo. Also, wouldn't processing an updated CATALOG.json end up changing the files in CATALOG_files, which means you can't update the metadata without updating the payload?
I'm probably missing a good reason for this.
Cheers, Eoghan
Seems to be a general problem with all the examples listed in https://github.com/UTS-eResearch/datacrate/blob/master/spec/1.0/data_crate_specification_v1.0.md.
Having downloaded the zips, I see that CATALOG_files is in both the root of the datacrate and in the /data folder.
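For anyone wanting to reproduce the check above on a downloaded zip, here is a minimal sketch using Python's standard `zipfile` module. The function name `catalog_files_locations` is made up for illustration and is not part of any DataCrate tooling; it only reports every place in the archive where a `CATALOG_files` folder occurs.

```python
import zipfile


def catalog_files_locations(zip_path):
    """Return the sorted archive paths of every CATALOG_files folder in the zip."""
    locations = set()
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            if "CATALOG_files" in parts:
                # Keep the path up to and including the CATALOG_files folder.
                idx = parts.index("CATALOG_files")
                locations.add("/".join(parts[: idx + 1]))
    return sorted(locations)
```

A crate with the problem described here would report two locations, one in the crate root and one under `data/`; a clean crate should report only the root one.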
Also, most seem to have some version of the metadata spreadsheet used to generate the crate in the /data folder (e.g. https://data.research.uts.edu.au/examples/v1.0/Glop_Pot/data/CATALOG_Glop_Pot.xlsx). As an artefact of the packaging process, should this be outside the payload too?
I have been through the examples and think it's fixed but will confirm before closing this.
@eocarragain Re the spreadsheets, for the moment they're staying, as they're useful for re-packaging, though we could add an option to Calcyte to drop them on packaging. Note that we've started work on a new tool installable as a "desktop web app", which should be much easier to use than Calcyte, with its clunky spreadsheets.
The zip versions still seem to be broken, e.g.:
Re the spreadsheets, yeah, I see these are more of an artefact of Calcyte, so not really an issue for the DataCrate spec. For Calcyte or similar tools, my preference would be for these intermediary files to be kept outside the /data folder (i.e. in the wrapper/root) or dropped entirely. The desktop web app sounds great. Is that being developed on GitHub?
The CATALOG_ files in the data directory of the zip downloads have now been removed. The problem was that when Calcyte generated the zip files it was only adding new content, not removing stuff that was already there.
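To illustrate the kind of fix described above, here is a minimal sketch in Python. It is not Calcyte's actual code; the function name `rebuild_zip` and its parameters are hypothetical. The idea is to write a fresh archive from the current directory contents and swap it into place, rather than appending to an existing zip, so entries that no longer exist on disk cannot survive.

```python
import os
import zipfile


def rebuild_zip(source_dir, zip_path):
    """Write a fresh archive of source_dir, replacing any existing zip_path.

    Assumes zip_path lives outside source_dir, so the archive never
    ends up containing itself.
    """
    tmp_path = zip_path + ".tmp"
    # Mode "w" starts from an empty archive; mode "a" would keep old entries.
    with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for fname in sorted(files):
                full = os.path.join(root, fname)
                arcname = os.path.relpath(full, source_dir).replace(os.sep, "/")
                zf.write(full, arcname)
    # Swap the new archive in only once it has been written completely.
    os.replace(tmp_path, zip_path)
```

Writing to a temporary file first means a failed run leaves the previous zip untouched.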