datalad / datalad-catalog

Create a user-friendly data catalog from structured metadata
https://datalad-catalog.netlify.app
MIT License

A command to populate/update gh-pages for a dataset #328

Open yarikoptic opened 1 year ago

yarikoptic commented 1 year ago

It might actually be a command that creates or updates a GitHub action, which would in turn update gh-pages for that dataset whenever it is pushed.

jsheunis commented 1 year ago

Could you give an example to illustrate what you mean? I'm having trouble understanding exactly what is being suggested here.

Would this be the case where, for example, a datalad dataset has a corresponding record in an existing catalog, and you want an updated metadata record to be generated and then rendered in the catalog whenever the datalad dataset itself is updated?

yarikoptic commented 12 months ago

Target use case: I have a dataset for which I extracted metadata (or maybe not even that?! It would be cool if the action extracted it automatically). I push the dataset to GitHub, and a GitHub action does everything necessary so that gh-pages for the dataset serves a datalad-catalog website for that dataset. create-sibling-github could (somehow) gain an option --with-gh-pages=datalad-catalog (not unlike create-sibling --with-ui, which creates a website like http://datasets.datalad.org/) to deploy such an action upon creation of the GitHub repo.

jsheunis commented 12 months ago

Thanks for clarifying, this makes a lot of sense. I have done something like this before here: https://github.com/jsheunis/automaticat-demo/blob/main/.github/workflows/generate_cat_entry.yml. This workflow checks the diff for every push to main, and if a new subdataset was added or any files changed, it extracts metadata for those datasets/files, translates the records to the catalog schema, and adds them to the catalog, which lives in the gh-pages branch.
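To make the diff-checking step a bit more concrete, here is a rough Python sketch. It is not the code from the linked workflow, only an illustration under some assumptions: the input is the text output of `git diff --raw` between the previous and the current commit, a subdataset shows up as a gitlink entry (mode 160000), and the names `classify_raw_diff` and `Changes` are made up for this example. Metadata extraction, translation, and adding to the catalog are left out.

```python
from typing import List, NamedTuple

# Git records a submodule (and hence a datalad subdataset) as a "gitlink"
# tree entry with this file mode.
GITLINK_MODE = "160000"


class Changes(NamedTuple):
    """Hypothetical container for what a push changed."""
    new_subdatasets: List[str]
    changed_files: List[str]


def classify_raw_diff(raw_diff: str) -> Changes:
    """Split `git diff --raw` output into new subdatasets and changed files.

    Each raw line looks like:
    :<src mode> <dst mode> <src sha> <dst sha> <status>\t<path>[\t<new path>]
    """
    new_subdatasets: List[str] = []
    changed_files: List[str] = []
    for line in raw_diff.splitlines():
        if not line.startswith(":"):
            continue  # skip anything that is not a raw diff record
        meta, _, paths = line.partition("\t")
        _src_mode, dst_mode, _src_sha, _dst_sha, status = meta[1:].split()
        # For renames/copies two paths are listed; keep the new one.
        path = paths.split("\t")[-1]
        if status.startswith("D"):
            continue  # deleted entries need no new metadata
        if dst_mode == GITLINK_MODE:
            # Only newly added subdatasets are reported in this sketch;
            # updated subdatasets (status "M") are ignored here.
            if status == "A":
                new_subdatasets.append(path)
        else:
            changed_files.append(path)
    return Changes(new_subdatasets, changed_files)
```

In an action, the input could come from something like `git diff --raw HEAD~1 HEAD`, and the two resulting lists would then decide which datasets or files need metadata extracted before the catalog is updated.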

Your use case is slightly different in that it operates from a single dataset which is not assumed to be part of a super-sub-dataset nested structure. But this actually makes the process a bit simpler.

I like the idea of adding the option to create-sibling-github.