Closed wrussell1999 closed 1 month ago
This appears to also impact io.kestra.plugin.git.Push
as well. Running a flow multiple times at the same time (modifying different files simultaneously) will only generate a single commit.
A simple retry loop with a git pull; git push
solves this in other CICD style jobs on other platforms.
After investigating the issue I have noticed that there is a task race between all the PushFlows. When the tasks begin they each have the same commit ref for the main
branch. Therefore the first flow to finish will be able to push and update the main
branch to the new pushed commit but the others tasks won't be able to do that. The commits are pushed indeed but not to the main
branch as we pull
the branch only once before pushing.
A possible solutions would be to enhance the pushflows method to accept a list of namespace flows and then create a chain of commits and push them all at once. @anna-geller
At the moment a workaround would be to use sequential flow like this one but of course the flow duration increases :
id: push_to_git_sequential
namespace: test.git
tasks:
- id: fetch_namespaces
type: io.kestra.plugin.core.http.Request
uri: http://localhost:8080/api/v1/flows/distinct-namespaces
- id: loop_namespaces
type: io.kestra.plugin.core.flow.EachSequential
value: "{{outputs.fetch_namespaces.body}}"
tasks:
- id: commit_and_push
type: io.kestra.plugin.git.PushFlows
username: mgabelle
password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
url: https://github.com/mgabelle/test-git-push-flow
branch: testSequential
sourceNamespace: "{{ taskrun.value }}"
targetNamespace: "{{ taskrun.value }}"
gitDirectory: "flows/{{ taskrun.value }}"
commitMessage: "sequential: changes to kestra flows for {{ taskrun.value }}"
After discussing with Mat, sounds like the "responsibility" of Git: even outside of Kestra, Git cannot guarantee the order in which the pushes will be received by the remote repository. If two users push changes at the same time, Git might reject one the push or show the warning shared by Will in issue description above.
I'm willing to close this issue to open another one to add a reference in the doc of the task to mention that it's not recommended to run such pattern due to the potential for conflicts and data loss. Preferring Sequential workflows or pushing to separated branches.
WDTY @anna-geller ?
That is understandable, sounds like a solid plan! Thanks for handling it Ben!
close in favor of #94
Expected Behavior
When using
EachParallel
to bulk commit multiple namespaces at once, all commits should appear in GitHub. If this is not possible due to the nature of committing multiple times to the same repository, it should fail or mentioned that it's not possible.Looks like Git PushFlows is rewriting the history, rather than pushing a new commit on top. If that's the intention, we should clear that up in the documentation as that prevents parallel use of PushFlows. This commit was accessed from the Kestra logs:
Actual Behaviour
When using
EachParallel
to bulk commit multiple namespaces at once, it comes back as successful but not all of the commits are visible.Only one of the commits was successful:
Steps To Reproduce
Environment Information
Example flow
No response