NCEAS / sasap-training

Training agendas and materials for open science tools for SASAP
https://nceas.github.io/sasap-training

build training versions off of branches #25

Closed jeanetteclark closed 5 years ago

jeanetteclark commented 5 years ago

currently we build off of tagged releases. The tagged releases are helpful because they tell us what the material was at the time we taught it, but building off of them is problematic because underlying library changes may cause a build to fail later down the line (this has happened once already).

solution is to build off of the head of branches instead, where each branch represents a training. A tag will still represent what the material was when we taught it, but if we need to make minor updates to fix a failing build due to a library change, this enables us to do so.

steps:

- create branches for Juneau (at tag v1.0) and Fairbanks (at head of master)
- rewrite build script
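A minimal sketch of the branch-creation step, in Python only to keep the example self-contained. The branch names `juneau` and `fairbanks` are assumptions (the trainings are named above, the branches are not); `v1.0` and `master` are the start points listed in the steps. The function only builds the `git branch <name> <start-point>` argument lists and runs nothing.

```python
from typing import Dict, List


def branch_commands(start_points: Dict[str, str]) -> List[List[str]]:
    """Build one `git branch <name> <start-point>` command per training."""
    return [["git", "branch", name, ref] for name, ref in start_points.items()]


# Hypothetical branch names; the start points come from the steps above.
TRAININGS = {"juneau": "v1.0", "fairbanks": "master"}

if __name__ == "__main__":
    # Print the commands for review rather than executing them.
    for cmd in branch_commands(TRAININGS):
        print(" ".join(cmd))
```

Each list could be handed to `subprocess.run(cmd, check=True)` inside a clone of the repo once the names are settled.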

Jared and I talked about how in the future it would be nice to clean up the content on each of the branches. The master branch should just have the build script and the non-bookdown webpage content, and the training branches should just have the bookdown training material content. Future trainings could branch off of existing training branches depending on which agenda that future training follows most closely. This way we won't need to merge branches back into master.

@amoeba do you have thoughts on this?

amoeba commented 5 years ago

Yes! I hadn't thought of this problem.

Your idea would totally work. The only thing I don't love is this use of branches where branches are being created that are never intended to be merged back into master. We do this a lot with GitHub Pages sites and it's kinda hacky. I think, in this case, it'd reduce discoverability but making it explicit in the README would help.

What do you think about splitting the website and the materials into multiple repos at some point? The individual Bookdown repos for each set of materials would be a stand-alone project and the main Blogdown repo could include them in the build script or via Git submodules.

jeanetteclark commented 5 years ago

yeah @jkibele and I briefly talked about that as well. would each training get its own repo do you think? On a small scale, I think that would work, but thinking in the larger view, if we run a whole lot of trainings we are going to have a proliferation of repos that are not explicitly connected. I don't think we have a good solution yet for how to effectively build upon training materials both within and between projects (SASAP, ADC).

I hadn't thought about the discoverability issue...if we removed all of the materials from the master branch and just had them reside on the training branches it would for sure prevent us from easily showing the markdown documents to training participants - which we like to be able to do. I guess we could merge training branches back into master when we make the training tags, which keeps the most recent tagged training material in the master branch, but we would still have unmerged branches if we need to fix anything.

Using tags alone is definitely problematic for the reasons outlined above. Branches are kind of hacky, I agree, and I think I've almost talked myself out of them because of the discoverability issue. Using submodules sounds like a good solution for our build issues and gets us around the hacky branches, but I don't think it captures the relatedness of the trainings in the same way that branches do. For example, if I were to hold another SASAP training and we moved to submodules, I'm guessing I would need to copy most of the materials from the previous training and initialize it as a new git repo - thus breaking the version chain between the two trainings. This is basically what we are doing now as we develop trainings between different projects...I just really want there to be a better way with more centralized material.

amoeba commented 5 years ago

> would each training get its own repo do you think?

yeah

> I don't think we have a good solution yet for how to effectively build upon training materials both within and between projects (SASAP, ADC).

yep. I think a solution to this is to stop doing that. Maintaining parallel curricula doesn't scale well. Any chance PIs on the projects might go for it?

> For example, if I were to hold another SASAP training and we moved to submodules, I'm guessing I would need to copy most of the materials from the previous training and initialize it as a new git repo - thus breaking the version chain between the two trainings.

yep, not great practice at all

Altogether, I think we're feeling the stress of how I tied things together to start off with. Do you think we're reaching a point where building the whole thing via Travis CI is still working well? Is it saving headaches or causing them?

jeanetteclark commented 5 years ago

I think we are quickly approaching the point where our current build system isn't working well but I think it has more to do with the number of trainings we are running (which will hopefully increase!) and less with Travis CI. It really doesn't make sense to copy an Rmd from one repo to another when we run a new training, because then we have multiple versions of that Rmd floating around.

I like your suggestion of submodules:

> The individual Bookdown repos for each set of materials would be a stand-alone project and the main Blogdown repo could include them in the build script

What if we flip this idea on its head? Training-event repos with blogdown material and a build script that accesses a primary NCEAS training materials repository as a submodule. Build scripts in the training-event repos pluck out the desired chapters, build the book, and copy the material onto the blogdown site. The primary training materials repo would have all the material we ever use, continually updated and built upon but always rendered elsewhere.
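A rough sketch of the chapter-plucking step, assuming the central materials repo is already checked out as a submodule directory and each lesson is a single file. The function name `select_chapters`, the directory layout, and the file names in the usage note are all hypothetical; rendering the book and copying it onto the blogdown site are left out. Files are copied with a numeric prefix so that a tool that orders chapters by filename (bookdown's default, as far as I know) picks them up in agenda order.

```python
import shutil
from pathlib import Path
from typing import List


def select_chapters(materials_dir: Path, chapters: List[str], book_dir: Path) -> List[Path]:
    """Copy the requested chapter files from the materials checkout into book_dir.

    Each copy gets a two-digit prefix so the files sort in agenda order.
    Raises FileNotFoundError if a requested chapter is not in materials_dir.
    """
    book_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, name in enumerate(chapters, start=1):
        source = materials_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"chapter not found in materials repo: {name}")
        target = book_dir / f"{index:02d}-{name}"
        shutil.copyfile(source, target)
        copied.append(target)
    return copied
```

A training-event repo would call something like `select_chapters(Path("materials"), ["rstudio.Rmd", "git.Rmd"], Path("book"))` before building, with the chapter list acting as that event's agenda.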

There are definitely drawbacks to this...there would probably be a LOT of lessons in that central repository, different durations, project specific styles, etc. but I think it might be more easily maintained and updatable from an instructor perspective.

jeanetteclark commented 5 years ago

I guess this doesn't really resolve our problem of needing to build tagged versions and those versions potentially breaking though...so maybe Travis is the problem

amoeba commented 5 years ago

> ...so maybe Travis is the problem

😂

If we take Travis CI out of the picture and instead do something such as checking the rendered Bookdown output into git, do things get way simpler with respect to tagging so we can easily go back in time, and maintaining trainings so they always "work"? I think the idea of making sure our trainings are reproducible/buildable isn't nearly as strong a concern as making sure they're good and maintainable.

A central training repo is an interesting idea. Then the long-term maintenance (making sure master builds successfully) is centralized in one place.