opendevstack / ods-pipeline

Alternative ODS CI/CD pipeline based on Tekton / OpenShift Pipelines
Apache License 2.0
13 stars 5 forks source link

Feature/build task overhaul #710

Closed henrjk closed 10 months ago

henrjk commented 1 year ago

This changes build tasks to no longer copy build outputs into the docker-context. This is implemented by having Dockerfile's use files from natural build output location(s) and by default have docker-context to be the same as the repository root. This also allows using the same Dockerfile for ods-pipeline as well as local builds.

In addition build task skipping (and the related caching) is enhanced to to allow multiple input and output locations. cache-build.sh and copy-build-if-cached.sh no longer have an output-dir parameter and instead support parameters:

The build tasks themselves are adjusted accordingly. All build tasks now support parameter --cache-sources with a default "" (empty String) as a task parameter, so that build task skipping is now an opt-in feature.

In addition:

Closes #678

Tasks:

henrjk commented 1 year ago

@michaelsauter, @oalyman as well as anyone else interested, this is ready for feedback.

henrjk commented 10 months ago
  • "cached outputs" controls what gets cached so it feels more suitable to control whether caching is on or off

Python and other dynamic languages where there is nothing to build have no outputs to cache but use the build for automated tests and thus build skipping makes still sense. I suppose it makes more sense to think and explain this feature with the notion of build skipping. I myself have sometimes the caching perspective in mind, and this may shine through in the language I used.

Nonetheless looking at it with the sense of build skipping makes much more sense to me and then enabling/disabling with inputs seems okish for me. An alternative could be to bring the build-skipping-enabled flag back and also add the extra-inputs but the way it is proposed now made sense to me as well.

Let me know how this affects your previous comment!

henrjk commented 10 months ago
  • rename "cache-sources" to "build-inputs" or "relevant-inputs" or "relevant-cache-inputs" (leaning towards the latter). Naturally, this should default to the working dir and I would expect it to be changed by users very rarely (e.g. one example we know where this is helpful is for monorepos where one build depends on the other)

To me "sources" implied that these are build inputs. Nonetheless I also don't mind changing a less generic term and use inputs instead.

Although "relevant-inputs", "relevant-cache-inputs" or "relevant-build-inputs" make sense to me, here is another suggestion:

    - name: build-inputs
      description: |
        List of relevant directories (as colon separated string) with files that influence the build.
        These directories are relative to the repository root.
        If these directories do not change the build is skipped and any files that were 
        previously built are restored as well as the prior build reports. 
        The built files to restore are specified with the "build-outputs" parameter. 
        Otherwise (if build-inputs changed) the build starts from scratch.
        The value "!always-rebuild" means that the build is never skipped.
      type: string
      default: $(parameter.working-dir)

One doubt I have is: can a Task parameter's default be a value of another parameter? If not one would have to use some other magic value or reintroduce the "build-extra-inputs"

michaelsauter commented 10 months ago

After some time I have to say I did not come to a conclusion that feels completely right to me.

I tend towards using "extra inputs" as a param, as that captures the intend for most cases, and avoids having to align this param with the working dir param.

I am not sure how I would implement the possibility to disable skipping. One thing that came to my mind is that if "extra inputs" would be set to a path that always changes, it would effectively disable skipping. However, I do not know of any file in Unix that one can compute the md5 of that would always change (/dev/(u)random is close but we'd need some special handling to read only part of it). If we would like to follow this line of thought, one could write the name of the pipeline run, which I believe to be unique enough, into the ODS context (so the content of ./ods/pipeline-run would equal the name of the pipeline run). Then setting the "extra inputs" param to .ods/pipeline-run the build would never be skipped.

In summary, feel free to pick what feels best to you now, and we can always fine-tune later.

henrjk commented 10 months ago

I was pondering about the random suggestion but for now felt that we should instead reintroduce the flag and have extra inputs. Following is the updated draft of the description:

Draft to be merged

This changes build tasks to no longer copy build outputs into the docker-context. This is implemented by having Dockerfile's use files from natural build output location(s) and by default have docker-context to be the same as the repository root. This also allows using the same Dockerfile for ods-pipeline as well as local builds.

In addition build task skipping (and the related caching) is enhanced to to allow multiple input and output locations. cache-build.sh and copy-build-if-cached.sh no longer have an output-dir parameter and instead support parameters:

--build-extra-inputs: List of build source directories (as colon separated string) which in addition to working-dir influence the build. These directories are relative to the repository root. If the contents in these directories change the cache is invalidated so that the build task will rebuild from scratch.

--cached-outputs: List of build output directories (as colon separated string) to be cached. These directories are relative to the working-dir parameter.

The build tasks themselves are adjusted accordingly. All build tasks now support parameter --build-extra-inputs with a default "" (empty String) as a task parameter.

In addition:

npm: no longer copies build outputs to output-dir. The parameters build-dir, copy-node-modules and output-dir which configure that behavior are removed and the build script is simplified. Instead the task now supports the cached-outputs parameter which defaults to dist. When other locations must be cached, the parameter would have to be adjusted. For example to build:node_modules if an associated Dockerfile would copy from these locations.

python: no longer copies build outputs. The parameter output-dir was removed. In addition as for Python the actual build is meant to happen during image building no cached-outputs are needed and the task always passes an empty string here. Build tasks are still skipped if the working-dir is unchanged.

go does not support cached-outputs on the task level and instead sets the scripts cached-outputs parameter to output-dir.

gradle supports the cached-outputs parameter with a default being 'docker'.