Closed ishitatsuyuki closed 6 years ago
One thing I noticed is that we may be facing endless PR builds filling the queue. Probably we should add --cache-from for other images as well?
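For reference, using a previously pushed image as a build cache looks roughly like this (the image name is hypothetical, borrowed from the discussion below):

```sh
# Pull the last published image so its layers can seed the build cache;
# ignore failure on the very first build, when nothing has been pushed yet.
docker pull janitortechnology/ubuntu-dev:latest || true

# Reuse matching layers from the pulled image instead of rebuilding them.
docker build --cache-from janitortechnology/ubuntu-dev:latest \
  -t janitortechnology/ubuntu-dev ubuntu-dev/
```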
I disagree that we should put the scripts inside the yaml. The whole script is pretty large to be copy-pasted into each job, and unfortunately yaml anchors can't be used either, because it's a list, not a mapping. Editing such a long script inside yaml is a pain: you get excessive indentation, and you lose syntax highlighting for bash. It also makes modifying the script harder; instead of using variables, values are hardcoded and insanely hard to replace.
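For context, the yaml limitation being referred to can be sketched like this (job and step names are hypothetical):

```yaml
# Anchors with merge keys (<<) only work on mappings:
defaults: &defaults
  machine: true

ubuntu-dev:
  <<: *defaults          # ok: merging one mapping into another
  steps:                 # but steps are a *list* of commands,
    - docker build ...   # and yaml has no merge key for splicing
    - docker push ...    # a shared sequence into another list
```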
Also, having a separate deploy step sounds good at first. However, the drawback is that you now need to persist many more images, risking the cache size limit and incurring network overhead.
To sum up, I think putting everything inside a script is not a bad thing. We're not hiding anything, it's just a piece of readable and maintainable code.
Thank you for explaining your concerns. We both want what's best for the project, and I'm confident we can reach a shared understanding of the situation that will lead us to the best solution to enable CI for pull requests.
The size of the whole script is pretty large to be copy pasted for each job
It is indeed very large right now, but I feel that's mostly due to the fact that we attempt to do many things conditionally, while these things could be avoided (the complicated `DOCKERHUB_TAG`) or rolled into each job where they make sense (e.g. instead of writing `if [[ $CIRCLE_JOB = "ubuntu-dev" ]]` inside each job, we just add the relevant instruction to the `ubuntu-dev` job, and not to any other job).
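The conditional pattern in question can be reconstructed as a small sketch (the job name and the echoed command are hypothetical stand-ins):

```shell
#!/usr/bin/env bash
# One shared script that branches on the CircleCI job name,
# instead of the command living in the job's own config.
CIRCLE_JOB="${CIRCLE_JOB:-ubuntu-dev}"

if [[ "$CIRCLE_JOB" = "ubuntu-dev" ]]; then
  echo "running ubuntu-dev specific step"
fi
```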
Editing such long script inside yaml is a pain
Again, I don't think the in-yaml bash commands would be as long as the original script. If you look at my suggested job setup and workflow, it's always 3 or 4 command lines at most per job, and they're extremely easy to read, understand and modify if needed.
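As a sketch of what "3 or 4 command lines at most per job" might look like (all names and paths are hypothetical):

```yaml
ubuntu-dev:
  machine: true
  steps:
    - checkout
    - run: docker build -t janitortechnology/ubuntu-dev ubuntu-dev/
    - run: docker push janitortechnology/ubuntu-dev
```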
Also, having a separate deploy step sounds good at first. However, the drawback is that you now need to persist many more images, risking the cache size limit and incurring network overhead.
That's a worrying concern. I don't know what the cache size limit is, or if there are network usage limits on Circle CI.
However, I still think it's worth trying the simplest solution first, and then react to any problems/limitations if they arise, instead of falling into the trap of premature optimization.
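For context, persisting an image from a build job to a separate deploy job would look roughly like this (paths and names are hypothetical):

```sh
# In the build job: serialize the image into the shared workspace.
docker save janitortechnology/ubuntu-dev > workspace/ubuntu-dev.tar

# In the deploy job: restore the image and push it.
docker load < workspace/ubuntu-dev.tar
docker push janitortechnology/ubuntu-dev
```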
Again, I don't think the in-yaml bash commands would be as long as the original script. If you look at my suggested job setup and workflow, it's always 3 or 4 command lines at most per job, and they're extremely easy to read, understand and modify if needed.
3 or 4 lines is already too long. There are lots of repeated words where you can easily miss one. They're easy to read, but not easy to modify. You get 8x amplification when modifying the script, and 4x amplification when changing or adding a name.
The same applies to a separate deploy step. You need to copy two more blocks for each new image. From a contributor's perspective, this can't really be called simple.
(By the way, as an alternative, CircleCI parallelism or a Travis matrix lets you achieve job generation with more minimal yaml. Parallelism has a fixed count, though, so it may be suboptimal for this pattern.)
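The Travis matrix approach mentioned here would look roughly like this (names are hypothetical): each env entry generates one job from a single shared script.

```yaml
env:
  - IMAGE=ubuntu-dev
  - IMAGE=firefox
script:
  - docker build -t "janitortechnology/$IMAGE" "$IMAGE/"
```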
3 or 4 lines is already too long. There are lots of repeated words where you can easily miss one. They're easy to read, but not easy to modify.
I agree that we shouldn't make mistakes in our Circle CI job commands. That's why we have code review, and it's pretty obvious when problems still slip through, because Circle CI will tell us about it when jobs fail (and it won't break anything in production, so it's not too dramatic).
I argue that 8 blocks of similar 3 or 4 lines are fairly easy to refactor (even though we don't expect to do that very often), e.g. with vim macros, multi-cursor editing, or a similar multi-editing tool.
You get 8x amplification for modifying the script, and 4x amplification for changing/adding a new name.
I agree that a script allows us to factor out the similarity between all the jobs. While we should mind the trade-off between aggressively de-duplicating all the commands and keeping all the code easy to read and understand by anyone, I'm not fundamentally opposed to having a script.
The same applies to a separate deploy step. You need to copy two more blocks for each new image. From a contributor's perspective, this cannot be really called simple.
Actually, people aren't supposed to contribute to this repository directly. Ideally, to add a new project to Janitor, a contributor would write a development Dockerfile based on `ubuntu-dev`, and host it in the project's own repository (that's what Kresus and PeerTube already do, and eventually I'd like all or most of the dockerfiles in this repository to be migrated away into their respective project's repositories).
So the number of projects in our `circle.yml` is supposed to stay constant, or even shrink over time.
Here is a summary of my position on this pull request, as it has changed:

- Conditional blocks like `if [[ $JOB = "particular-job" ]]; then # job-specific commands` make no sense; any instructions in such blocks should live directly in the relevant job's `circle.yml` commands.
- The `DOCKERHUB_TAG` I introduced should be removed, because it's not useful (we should only push `:latest`, or not push at all).
- `firefox.tar` would be much heavier than `ubuntu-dev.tar`, so maybe you're right and we should push directly from the same job that built it.
- Appending `.dockerfile` and pre-pending `janitortechnology/` is too confusing and has too few benefits.

I'm happy to review a future iteration of your script with my nits addressed. Thanks again, a lot, for all your hard work! I really appreciate it. 👍 💯
I think I have addressed the concerns. Please give another review.
Re-triggering a build.
Hmm, the trick didn't work. @jankeromnes can you push the rebuild button, and review afterwards?
@jankeromnes can you push the rebuild button
Sure! Done. Thanks a lot for all your hard work on this! I'll try to find some time to review this soon (I'm very sorry about my review lag).
Addressed the comments. I've swapped out the debloat commits (as they're still pending review), but keep in mind that there were some cases where a too-big `ubuntu-dev` image could fail to be saved.