NiklasRosenstein / flux-ci

Flux is your own private & lightweight CI server.
MIT License

Add CiFile - parser and converter for a streamlined CI pipeline/job config file #61

Closed: gsantner closed this 5 years ago

gsantner commented 6 years ago

This adds the base for resolving issues #24, #50 and #55. It sports the often-mentioned CiFile parser I have been working on for many nights. I have tested it with simple and complex CI files in the GitLab CI format, which on top is FOSS, has an open spec, and is one of the most powerful formats currently available.

The goal of this PR is not to integrate it into Flux CI yet; that is a bigger task, for which I lack in-depth knowledge of Flux CI. You can see an example at the very bottom and can test things out (it is also included in the file under __main__).

  file_to_load = "user/project/.ci.yml"
  cifile = CiFile(variables={"BUILD_COOL_FEATURE": "1"}).read_cifile(fyaml=file_to_load)
  cifile.prepare_run_with("master", "trigger", "Very cool\nSome more descriptive commit text",
                          "commitsha5ar55f56s87ng8z98z4ß9z1t98gr3", ci_commit_tag=None)

  # Example usage for CIs
  matrix = cifile.convert_to_matrix()
  for i, stagename in enumerate(matrix[MATRIXKEY_STAGES]):
    for jobname, job in matrix[MATRIXKEY_JOBS][i].items():
      # We now have the details of a job; its script is a list of commands
      # to execute one after the other in a shell / cmd
      for cmd in job[JOBKEY_SCRIPT]:
        exitcode = shell_execute(cmd)  # placeholder executor, sketched below
        if exitcode != 0:
          break  # stop this job; don't start anything else once one command failed
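A minimal sketch of what the shell_execute placeholder above could look like, using subprocess (the helper name and behavior are assumptions, not part of this PR):

  import subprocess

  def shell_execute(cmd):
    # Hypothetical stand-in for the CI's real executor: run one script line
    # through the shell and return its exit code. Note that each line gets its
    # own shell here; keeping state like the cwd across lines is discussed
    # further down in this thread.
    return subprocess.run(cmd, shell=True).returncode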

This is basically what we have to do:

Overall preparation step:

Run preparation:

Run:

One big note: the CiFile logic already implements job conditional filtering internally, so this doesn't need to be implemented separately.
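For reference, the filtering meant here behaves roughly like GitLab CI's only/except keys; a minimal sketch (CiFile's internal variant may differ, and real GitLab CI also accepts regexes and keywords):

  def job_matches_ref(job, ref):
    # 'only' whitelists refs (branches/tags), 'except' blacklists them.
    only = job.get("only")
    if only is not None and ref not in only:
      return False
    if ref in job.get("except", []):
      return False
    return True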

-- License note: I host this class over at https://github.com/gsantner/opoc/blob/master/python/services/cifile.py, but hereby grant this project the right to use it freely, fully re-licensed under the Flux project's MIT license.

gsantner commented 6 years ago

BTW, I added many comments so this is understandable for everybody. Take a look at those too - they cover most of the PR message as well :)

NiklasRosenstein commented 6 years ago

Thanks a lot for your work @gsantner. This will take some time to check out and review. I hope to have that time next week or the week after!

gsantner commented 6 years ago

ehm, can I expect this to get merged and worked on? It took quite a lot of time to get it into this working state :D

NiklasRosenstein commented 6 years ago

With the example CI file that you linked (https://gitlab.com/gsantner/kimai-android/raw/master/.gitlab-ci.yml) on branch release you get these stages and jobs:

Stage: before_script
  Job: before:script
    bash -c '(while [ 1 ]; do sleep 5; echo y; done) | ${ANDROID_HOME}/tools/android update sdk -u -a -t "build-tools-${ANDROID_BUILD_TOOLS},android-${ANDROID_TARGET_SDK}" ; exit 0'
Stage: build
  Job: debug
    ./gradlew assembleDebug
  Job: release
    ./gradlew assembleRelease

Just so I understand this right, we concatenate the jobs in every stage with every other job to get a single pipeline?

So the above would give you two jobs running in Flux CI, running the following Stages' Jobs

If we had another Job in the before_script stage (which we usually wouldn't have, but just assuming that there could be another Job; it could be in any other stage as well), e.g. before:script2, we would get

?

gsantner commented 6 years ago

before_script and after_script are basically two special things - to run before anything else and after everything else. What we are doing here is reshaping them into jobs just like everything else (so you can iterate over them and put them into a common hierarchy).

> Just so I understand this right, we concatenate the jobs in every stage with every other job to get a single pipeline?

We put all the job information on the same level and add global information (like CI_JOB_ID, submitted from Flux CI) to all jobs. We don't put one job into another. The other thing is the hierarchy (matrix view) pipeline -> stage -> jobs, which is how a CI run is supposed to execute and be looked at.

All of the above is the matrix view of the data; at the core it's just an array of jobs that all have the same properties. That was the main point of this CiFile work: everything is streamlined.
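For the example CI file above, the matrix would look roughly like this (a sketch inferred from the usage example in the PR description; the long before:script command is abbreviated):

  matrix = {
    MATRIXKEY_STAGES: ["before_script", "build"],
    MATRIXKEY_JOBS: [
      {"before:script": {JOBKEY_SCRIPT: ["bash -c '...'"]}},
      {"debug": {JOBKEY_SCRIPT: ["./gradlew assembleDebug"]},
       "release": {JOBKEY_SCRIPT: ["./gradlew assembleRelease"]}},
    ],
  }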

> So the above would give you two jobs running in Flux CI, running the following Stages' Jobs

No, it will be a stage called before_script, with one job before:script, and after that (when successful) a stage build with the debug and release jobs (which theoretically could be executed in parallel).
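In code, roughly (a sketch; run_job is an assumed helper returning True on success, and the parallelism is optional):

  from concurrent.futures import ThreadPoolExecutor

  def run_pipeline(matrix, run_job):
    # Stages run strictly in order; jobs inside one stage may run in parallel.
    for i, stagename in enumerate(matrix[MATRIXKEY_STAGES]):
      with ThreadPoolExecutor() as pool:
        results = list(pool.map(run_job, matrix[MATRIXKEY_JOBS][i].values()))
      if not all(results):
        return False  # one failed job blocks all later stages
    return True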

> If we had another Job in the before_script stage (which we usually wouldn't have, but just assuming that there could be another Job; it could be in any other stage as well), e.g. before:script2, we would get

No, it would be

Happy to tell you more if something is unclear.

NiklasRosenstein commented 6 years ago

> before_script and after_script are basically two special things - to run before anything else and after everything else. What we are doing here is reshaping them into jobs just like everything else (so you can iterate over them and put them into a common hierarchy).

I understand that they are special because they are actually executed around every other job, and not independent jobs. But then I'm not sure why they are put into their own stages and jobs -- wouldn't it make more sense to put them into a Job directly, as in

before_script:
- echo "before_script"

after_script:
- echo "after_script"

build:
  script:
  - echo "build:script"

The before_script and after_script would then basically be inserted into the build Job.

build:
  script:
  - echo "before_script"
  - echo "build:script"
  - echo "after_script"

Maybe I also misunderstand the way you were planning on accessing and executing the before and after scripts per job.

gsantner commented 6 years ago

> I understand that they are special because they are actually executed around every other job, and not independent jobs. But then I'm not sure why they are put into their own stages and jobs -- wouldn't it make more sense to put them into a Job directly, as in

Yes! But other CIs usually have these two special cases, and we are being clever by just converting them to be like normal jobs :D (as in, making existing CI files easily reusable without big modifications).

Why their own stages? Because of this: >Only when all jobs in a stage complete successfully is the next stage allowed to start<. So they are guaranteed to run before/after everything else, and nothing else in the same stage gets executed.

Currently there would be one special case, where you manually write stage: before_script, but well, in that case I think the user really, really wants to have it like this :'D

> The before_script and after_script would then basically be inserted into the build Job.

I don't quite understand this one. A job can have as many script lines as it wants, but this way we get a concatenated job with no differentiation of what has to run before and after:

before:script:
  stage: before_script
  script:
    - bash "cool_script.sh"
    - rm -rf log.txt
    - mail -s "everything succeeded"

How would we then know that bash, rm belong at the start and mail at the very end of all jobs?

Sorry, maybe I didn't describe that: a script section is an ordered list of shell executions that all have to succeed for the job to go green. (Here's a slightly better example than mine: https://docs.gitlab.com/ee/ci/examples/test-and-deploy-ruby-application-to-heroku.html)

NiklasRosenstein commented 6 years ago

> Why their own stages? Because of this: >Only when all jobs in a stage complete successfully is the next stage allowed to start<. So they are guaranteed to run before/after everything else, and nothing else in the same stage gets executed.

This only makes sense to me if you assume that all stages are executed in the same folder and without resetting or flushing the folder between jobs. Does that apply?

> How would we then know that bash, rm belong at the start and mail at the very end of all jobs?

Well we know it from the original CI input file -- the global before_script would be a shortcut to adding the commands before the actual commands of a job.

Once we have a full description of the commands for a job, it is no longer important whether a command was part of the before_script or after_script section, as they have been placed in the semantically correct place in the Job's scripts.
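A sketch of that merge, assuming the parsed YAML as plain dicts and lists (names illustrative):

  def effective_script(ci_config, job):
    # The global before_script/after_script wrap each job's own commands.
    return (ci_config.get("before_script", [])
            + job["script"]
            + ci_config.get("after_script", []))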

gsantner commented 6 years ago

> This only makes sense to me if you assume that all stages are executed in the same folder and without resetting or flushing the folder between jobs. Does that apply?

Exactly. When a job starts, the pwd should be the git root folder; when calling - cd docs and then - bash makedocumentation.sh, it will be in docs for the second script line (because of the earlier chdir). But all other jobs should still start in the git root folder independently (otherwise it's impossible to use jobs :D).
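One way to get exactly that behavior is to feed a job's whole script into a single shell invocation, so cd persists between lines while each job still starts fresh in the git root (a sketch, assuming bash is available on the runner):

  import subprocess

  def run_job_script(script_lines, git_root):
    # One shell per job: 'set -e' aborts on the first failing line, and state
    # such as the working directory carries across lines but not across jobs.
    script = "set -e\n" + "\n".join(script_lines)
    return subprocess.run(["bash", "-c", script], cwd=git_root).returncode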

At least until jobs etc. are working, that's exactly what I thought. When we have script/jobs/stages/pipeline working, we could then go on to respect the dependencies: and artifacts: tags, which control what gets shared between jobs. But that might get quite tricky, so I would not think about that in the current state.


> Well we know it from the original CI input file -- the global before_script would be a shortcut to adding the commands before the actual commands of a job.

Aah, I think we are both talking about different things! :D Oh I see, actually GitLab CI seems to do that differently than I thought. In GitLab CI it works like you said, putting all the lines before and after the job script (I haven't used that recently :D). Sorry! Forget what I said before about before/after script! :D

I see now. We do want before/after script around every job instead of "before/after pipeline" (which is what it currently does), right? I can change that.


NiklasRosenstein commented 6 years ago

All jobs should be run independently from each other with no leftovers from a previous job (so basically a fresh checkout from the repository). That is why I thought it was a good idea to have the before_script and after_script prepended/appended to the actual job's script. And that is also why it didn't make sense to me to put the before_script and after_script into their own stages.

> At least until jobs etc. are working, that's exactly what I thought. When we have script/jobs/stages/pipeline working, we could then go on to respect the dependencies: and artifacts: tags, which control what gets shared between jobs. But that might get quite tricky, so I would not think about that in the current state.

I think sharing between jobs is an interesting feature, but is that actually a feature in the GitLab pipelines? I can't seem to find info about that anywhere. Artifacts are usually what you can download from the CI after the Job has run.

gsantner commented 5 years ago

Closing, as it looks like the project isn't really moving forward. Even though this part is mostly complete, there is still work left to actually integrate it into the whole, which I assume will not happen.