vincerubinetti opened 4 days ago
That all sounds great.
> At some point, I'd like to add a few things to this process if appropriate (unless they belong elsewhere)
Not sure what you had planned, but it seemed to me like maybe all the data should be updated at the same time, i.e. as part of the same gh-actions workflow. Unless you want to be able to, say, update the literature independently from the other stuff. That might muddy the waters though, like the literature would be on its own version (if it even has a version) separate from the rest of the data.
> version the json data and link to it, like what I did manually here:
For reference, here's the "versioning" workflow I have for Lab Website Template. `ncipollo/release-action` makes it easy to create tags and releases.
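As a rough sketch, a release job using `ncipollo/release-action` could look something like this. The tag pattern, release name, and trigger are assumptions, not this repo's convention:

```yaml
# Hypothetical workflow: create a GitHub release whenever a version tag
# (e.g. v1.2.3) is pushed. The "v*" tag pattern is an assumption.
name: Release data
on:
  push:
    tags:
      - "v*"
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ncipollo/release-action@v1
        with:
          # Auto-generate release notes from merged PRs since the last tag
          generateReleaseNotes: true
          token: ${{ secrets.GITHUB_TOKEN }}
```

The release then serves as the stable, linkable snapshot of the JSON data at that version.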
> This should also appear at the top of the table and somewhere on the locus pages along with an updated date.
If what I said above is what you decide to do, there'd be one version for all of the data, and that version could be displayed in the header perhaps, with a link to the list of releases / changelog.
Instead of the "record data compile time" step I had in my example `.yaml` file above, it'd actually be better if you made a GitHub CFF citation file for this repo. You probably should have this anyway. But it will also allow me to conveniently get the version and date of the data to display on the website somewhere.
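For illustration, a minimal `CITATION.cff` might look like the following; every field value here (title, author, version, date) is a placeholder to be replaced with this repo's real details:

```yaml
# CITATION.cff (minimal sketch; all values below are placeholders)
cff-version: 1.2.0
message: "If you use this data, please cite it as below."
title: "Example locus data repository"
version: 1.0.0
date-released: 2024-01-01
authors:
  - family-names: Doe
    given-names: Jane
```

GitHub renders this file as a "Cite this repository" box, and since it's plain YAML, the website build can read `version` and `date-released` directly from it.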
> - Run scripts to update locus definitions (probably should generate a PR and be checked)
> - Update literature for existing loci (scheduled, auto PR)
> - Search for new locus literature (scheduled, auto PR)
I'd imagine all these scripts would run in sequence in this same workflow, and then the workflow could be triggered by `workflow_dispatch` (running it with a manual button click in the GitHub web interface), `pull_request` (when you manually open a PR for whatever reason), and `schedule` (perhaps weekly or monthly). Then it would open a PR with `peter-evans/create-pull-request` for you to review and merge. We could have it always open a new PR, or give it a specific branch name such that if one is already open (e.g. you haven't gotten around to merging last week's update PR yet), it just updates that one.
Just a sanity check here: are all the processing scripts and such actually in this repo? I feel like I've run into cases where I `ctrl+F` the whole repo, looking for a bit of Python script that generated/processed some JSON, and I can't find it.
Because all of that code will need to be in this repo on the same branch for the CI process, or else we'll need some complicated workarounds.
I believe this is what our GitHub Actions CI workflow should eventually look like:
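A minimal sketch, where the script names (`update_loci.py`, `update_literature.py`, `find_new_literature.py`), the branch name, and the weekly schedule are all assumptions:

```yaml
# Sketch only: script names, branch name, and schedule are hypothetical.
name: Update data
on:
  workflow_dispatch:   # manual button click in the GitHub web interface
  pull_request:        # runs when a PR is opened manually
  schedule:
    - cron: "0 0 * * 1"  # weekly, Monday 00:00 UTC
jobs:
  update-data:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Run the processing scripts in sequence (hypothetical names)
      - run: python update_loci.py
      - run: python update_literature.py
      - run: python find_new_literature.py
      - uses: peter-evans/create-pull-request@v6
        with:
          # Fixed branch name: if last week's update PR is still open,
          # re-running the workflow updates it instead of opening a new one
          branch: auto-data-update
          title: "Automated data update"
```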
This will allow you to either manually run the workflow and have it open a PR with the updated data, or open a PR manually and have it run the data update automatically on the PR.
As for the site, I'm thinking that we should use Netlify instead of GitHub Pages for the new website. They are both free and easy to use, but Netlify also gives us live deploy previews of PRs, built in. You can of course also set up your Netlify site to use your custom domain. And we'll have it configured such that it just rebuilds and redeploys the site whenever there are any changes in the repo (on main or any PR branch), including the `/data` folder. As such, there's no need to trigger anything related to the site in this gh-actions workflow; it will happen automatically. Yes, this means that the site will be rebuilt when ineffectual things like the readme change, but the cost is minimal; the site only takes a few seconds to build.