betterscientificsoftware / bssw.io

Better Scientific Software Homepage
https://bssw.io

Consider automatically rebuilding preview site #711

Open bernhold opened 3 years ago

bernhold commented 3 years ago

Parent Issues:

Description

This is actually @bartlettroscoe 's suggestion:

Poll the repository periodically and rebuild the preview site when there are changes.

bernhold commented 3 years ago

In the past, we've been throttled by the GH API when we rebuild too frequently. Since then, the front-end has been modified to greatly reduce the number of API calls needed, and we haven't seen any throttling in a long time.

Clara made this comment:

We'd still probably get throttled if, say, we rebuilt every minute, but rebuilding 10x more often than we currently are shouldn't pose any concerns.

Though for the most part, we're not building particularly often (every few days). We might need to do a stress test to check.
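Before running a full stress test, one way to put numbers on the throttling question is GitHub's REST API `/rate_limit` endpoint, which reports how much of the hourly request budget remains. The sketch below is only an illustration. `calls_per_build` is a hypothetical figure, since the thread does not say how many API calls a rebuild makes, and it is an assumption that the build draws from the `core` bucket.

```python
import json
import urllib.request
from typing import Optional, Tuple

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def fetch_rate_limit(token: Optional[str] = None) -> dict:
    """Ask GitHub how much of the REST API budget is left."""
    request = urllib.request.Request(RATE_LIMIT_URL)
    request.add_header("Accept", "application/vnd.github+json")
    if token:
        # Authenticated requests get a much larger hourly budget.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def parse_core_remaining(payload: dict) -> Tuple[int, int]:
    """Return (remaining, limit) for the core REST API bucket."""
    core = payload["resources"]["core"]
    return core["remaining"], core["limit"]


def builds_left(remaining: int, calls_per_build: int) -> int:
    """How many more rebuilds fit in the current window.

    calls_per_build is a hypothetical number that would have to be measured.
    """
    return remaining // calls_per_build
```

Comparing `builds_left(...)` before and after a manual rebuild would give a rough measurement of what one build costs, which in turn bounds how often an automatic rebuild could safely run.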

bartlettroscoe commented 3 years ago

In the past, we've been throttled by the GH API when we rebuild too frequently.

@bernhold, what are the details for that? I have never heard of throttling of git fetch with regular GitHub git repos. I have only heard of throttling for GitHub Actions, the GitHub REST API, etc. For instance, I ran a post-push CI server for Trilinos for years that polled the repo every 3 minutes, and I don't think we ever saw a problem with limiting or throttling of access to the GitHub repo for git fetch. And Trilinos is way bigger than 'bssw.io' + 'images', even without using git-lfs.

I can't seem to find a limit on regular git fetch and push with GitHub anywhere. Ironically, all I can find are limits when using git-lfs:

(with 1 GB per month of bandwidth which is not much).

There was this:

from 2015 that said:

If your bandwidth usage significantly exceeds the average bandwidth usage (as determined solely by GitHub) of other GitHub customers, we reserve the right to immediately disable your account or throttle your file hosting until you can reduce your bandwidth consumption.

but I would guess this little bssw.io repo is not so different from other repos, so I would like to see some evidence for "throttling" of git fetch operations.

Hopefully the web contractor is keeping a local clone and not recloning all of the time so large image files will only be downloaded when they are updated in the main repo (and then it seems that we only care about limits when using git-lfs as per #703).

bartlettroscoe commented 3 years ago

Related to my epic SEPW-211

bartlettroscoe commented 2 years ago

CC: @betterscientificsoftware/bssw-editorial-board

FYI: We discussed this in some detail at the BSSw.io Operations Meeting on 1-20-2022.

We really need this to streamline the authorship and review process for adding new content and updating existing content.

bernhold commented 2 years ago

I think part of my concern about this is that commits often come in bursts, and it may be counter-productive to rebuild on the first commit or on a schedule; you'll just have to wait until that rebuild completes and then trigger another to get what you really want to preview onto the site.
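A common way to handle bursts is to debounce: hold off on the rebuild until the branch head has stayed unchanged for some quiet period. The following is a minimal sketch of that idea, not anything that exists in the current setup. `get_head` is a hypothetical callable that returns the current commit id, and the quiet period and poll interval are placeholders that would need tuning.

```python
import time
from typing import Callable


def wait_for_quiet(
    get_head: Callable[[], str],
    quiet_seconds: float,
    poll_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Block until the head commit has been unchanged for quiet_seconds.

    Returns the commit id that survived the quiet period. Every new commit
    seen while waiting restarts the timer, so a burst of pushes results in
    a single rebuild of the final state.
    """
    last_head = get_head()
    last_change = clock()
    while clock() - last_change < quiet_seconds:
        sleep(poll_seconds)
        head = get_head()
        if head != last_head:
            # Another commit arrived: remember it and restart the timer.
            last_head = head
            last_change = clock()
    return last_head
```

The trade-off is latency: every preview is delayed by at least the quiet period, in exchange for not building intermediate commits that nobody wants to look at.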

bartlettroscoe commented 1 year ago

@rinkug and @bernhold, this issue #711 is about automatically rebuilding the preview site, which is what we were discussing at the meeting last week. Issue #856 is about automatically rebuilding the preview branch (which is a more complex operation).

As for automatically rebuilding the preview site (for whatever is in the 'preview' branch), I put together a very simple example in a short bash script in:

A more sophisticated version of a Git fetch looping demon is shown in:

That example uses a Python tool called generic-looping-demon.py that is a little more sophisticated about looping and termination but it is not needed.
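For readers who want the shape of such a loop without following the links, here is a rough Python sketch of the same idea. It is not the bash script or the Python tool referenced above. It assumes a local clone at a hypothetical `repo_dir` with a remote named `origin`, and `rebuild` is a hypothetical callback standing in for whatever actually triggers the preview rebuild.

```python
import subprocess
import time
from typing import Callable, Optional


def fetch_remote_head(repo_dir: str, branch: str = "preview") -> str:
    """Fetch one branch into an existing local clone and return its commit id."""
    subprocess.run(
        ["git", "-C", repo_dir, "fetch", "--quiet", "origin", branch],
        check=True,
    )
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "FETCH_HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def poll_and_rebuild(
    get_head: Callable[[], str],
    rebuild: Callable[[str], None],
    interval_seconds: float,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for a new head commit and rebuild whenever it changes.

    Returns the number of rebuilds triggered. With max_polls=None the loop
    runs forever. The first poll always triggers a rebuild because nothing
    has been built yet.
    """
    last_built = None
    rebuilds = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        head = get_head()
        if head != last_built:
            rebuild(head)
            last_built = head
            rebuilds += 1
        polls += 1
        sleep(interval_seconds)
    return rebuilds
```

Because it fetches into a persistent clone, each poll only transfers new objects, so large image files are downloaded once rather than on every iteration.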

bernhold commented 2 months ago

Thinking about this, it might be something we could implement as a GitHub action. Triggered manually, or perhaps by a push to the preview branch, it could use curl to poke the rebuild button, with the username/pw stored using the GH secrets mechanism.

You would lose the feedback from the rebuild page as to whether or not it is done. Relatedly, the rebuild mechanism will reject new requests while a build is in progress (until a 10 minute timer expires), but you wouldn't be able to see this if it were triggered via an action either.
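As a rough illustration of the "poke the rebuild button" step, the sketch below shows what a script run from an action might look like, written in Python rather than curl. Almost everything in it is an assumption: the environment variable names, the use of a POST with HTTP basic auth, and the idea that a rejected request would come back as an HTTP error status. The real rebuild page may work quite differently.

```python
import base64
import os
import urllib.error
import urllib.request


def build_rebuild_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare an authenticated POST to a (hypothetical) rebuild endpoint."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=b"", method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request


def trigger_rebuild(request: urllib.request.Request, opener=urllib.request.urlopen) -> int:
    """Send the request and return the HTTP status code, even on failure."""
    try:
        with opener(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # Assumption: a request rejected during an in-progress build
        # would surface as an HTTP error status here.
        return error.code


def main() -> int:
    """Entry point for a workflow step; credentials come from the environment."""
    request = build_rebuild_request(
        os.environ["REBUILD_URL"],       # hypothetical variable names, filled
        os.environ["REBUILD_USER"],      # from repository secrets by the
        os.environ["REBUILD_PASSWORD"],  # workflow definition
    )
    status = trigger_rebuild(request)
    print(f"rebuild endpoint answered HTTP {status}")
    return 0 if 200 <= status < 300 else 1
```

Returning a non-zero exit code from `main()` would at least make a rejected or failed trigger visible as a failed job in the Actions tab, which recovers a little of the feedback that is otherwise lost.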

bartlettroscoe commented 1 month ago

FYI: We just discussed this topic again over the email list yesterday (in the context of auto-rebuilds of the production bssw.io site). One of the editorial members was assuming the bssw.io site was being rebuilt automatically.

bartlettroscoe commented 1 month ago

Thinking about this, it might be something we could implement as a GitHub action. Triggered manually, or perhaps by a push to the preview branch, it could use curl to poke the rebuild button, with the username/pw stored using the GH secrets mechanism.

If we could implement this just from a GHA job and make that robust, then that would be ideal.

You would lose the feedback from the rebuild page as to whether or not it is done. Relatedly, the rebuild mechanism will reject new requests while a build is in progress (until a 10 minute timer expires), but you wouldn't be able to see this if it were triggered via an action either.

If you could implement a blocking GHA that waited until the rebuild was complete, you could scrape the output and post a link to the rebuilds page in the GHA output. From that, someone could click that link and see what happened. (It is very rare that the rebuild crashes.)
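The blocking part could be as simple as polling until the rebuild reports completion or a timeout expires. The sketch below shows only that waiting logic. `is_done` is a hypothetical callable that would have to scrape or query the rebuild page, and the 600 second default is borrowed from the 10 minute timer mentioned above rather than from any measured build time.

```python
import time
from typing import Callable


def wait_for_rebuild(
    is_done: Callable[[], bool],
    timeout_seconds: float = 600.0,
    poll_seconds: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_done() until it returns True or the timeout expires.

    Returns True if the rebuild finished in time, False on timeout.
    """
    deadline = clock() + timeout_seconds
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

A job step could print the link to the rebuilds page once this returns, and fail the job when it returns `False` so that a stuck rebuild does not go unnoticed.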