apache / camel-k

Apache Camel K is a lightweight integration platform, born on Kubernetes, with serverless superpowers
https://camel.apache.org/camel-k
Apache License 2.0

Possible deadlock between integration builds #5755

Closed lsergio closed 1 month ago

lsergio commented 1 month ago

What happened?

After creating 4 integrations at the same time, 3 of the builds are hanging for minutes after the first build has finished.

Steps to reproduce

This will create 4 Builds:

NAME                       PHASE        AGE   STARTED   DURATION   ATTEMPTS
kit-cqqft44lsn2s73ed6ufg   Running      55s   55s                  
kit-cqqft44lsn2s73ed6ug0   Scheduling   55s                        
kit-cqqft4clsn2s73ed6ugg   Scheduling   54s                        
kit-cqqft4clsn2s73ed6uh0   Scheduling   54s                        

Checking the operator log I see:

{"level":"info","ts":"2024-08-08T17:24:23Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft44lsn2s73ed6ug0) to finish in order to use incremental image builds - the build (kit-cqqft4clsn2s73ed6uh0) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-fc949ea7-e5ea-4d59-8fed-83471b9350e0-323","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft4clsn2s73ed6uh0"}
{"level":"info","ts":"2024-08-08T17:24:27Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft4clsn2s73ed6ugg) to finish in order to use incremental image builds - the build (kit-cqqft44lsn2s73ed6ug0) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-a41357e7-14c0-4a64-9933-80044f023ad3-323","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft44lsn2s73ed6ug0"}
{"level":"info","ts":"2024-08-08T17:24:27Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft44lsn2s73ed6ug0) to finish in order to use incremental image builds - the build (kit-cqqft4clsn2s73ed6ugg) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-fc949ea7-e5ea-4d59-8fed-83471b9350e0-381","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft4clsn2s73ed6ugg"}
{"level":"info","ts":"2024-08-08T17:24:28Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft44lsn2s73ed6ug0) to finish in order to use incremental image builds - the build (kit-cqqft4clsn2s73ed6uh0) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-fc949ea7-e5ea-4d59-8fed-83471b9350e0-323","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft4clsn2s73ed6uh0"}
{"level":"info","ts":"2024-08-08T17:24:32Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft4clsn2s73ed6ugg) to finish in order to use incremental image builds - the build (kit-cqqft44lsn2s73ed6ug0) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-a41357e7-14c0-4a64-9933-80044f023ad3-323","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft44lsn2s73ed6ug0"}
{"level":"info","ts":"2024-08-08T17:24:32Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft44lsn2s73ed6ug0) to finish in order to use incremental image builds - the build (kit-cqqft4clsn2s73ed6ugg) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-fc949ea7-e5ea-4d59-8fed-83471b9350e0-381","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft4clsn2s73ed6ugg"}
{"level":"info","ts":"2024-08-08T17:24:33Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft44lsn2s73ed6ug0) to finish in order to use incremental image builds - the build (kit-cqqft4clsn2s73ed6uh0) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-fc949ea7-e5ea-4d59-8fed-83471b9350e0-323","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft4clsn2s73ed6uh0"}
{"level":"info","ts":"2024-08-08T17:24:37Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft4clsn2s73ed6ugg) to finish in order to use incremental image builds - the build (kit-cqqft44lsn2s73ed6ug0) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-a41357e7-14c0-4a64-9933-80044f023ad3-323","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft44lsn2s73ed6ug0"}
{"level":"info","ts":"2024-08-08T17:24:37Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft44lsn2s73ed6ug0) to finish in order to use incremental image builds - the build (kit-cqqft4clsn2s73ed6ugg) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-fc949ea7-e5ea-4d59-8fed-83471b9350e0-381","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft4clsn2s73ed6ugg"}
{"level":"info","ts":"2024-08-08T17:24:38Z","logger":"camel-k.controller.build","msg":"Waiting for build (kit-cqqft44lsn2s73ed6ug0) to finish in order to use incremental image builds - the build (kit-cqqft4clsn2s73ed6uh0) gets enqueued","request-namespace":"a1b2c3d4e5","request-name":"deploy-fc949ea7-e5ea-4d59-8fed-83471b9350e0-323","order-strategy":"dependencies","api-version":"camel.apache.org/v1","kind":"Build","ns":"a1b2c3d4e5","name":"kit-cqqft4clsn2s73ed6uh0"}

It seems like each build is waiting for another one to finish.
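The circular wait can be read directly off the log lines above. The following is a minimal sketch, not operator code: `findCycle` is a hypothetical helper, and the map holds the "waits for" edges copied by hand from the log messages.

```go
package main

import "fmt"

// findCycle follows the "waits for" edges starting at start and returns the
// builds that form a cycle, or nil if the chain ends at a build that is not
// waiting on anything. Hypothetical helper, for illustration only.
func findCycle(waitsFor map[string]string, start string) []string {
	seen := map[string]int{} // build name -> position in path
	var path []string
	for cur := start; ; {
		if i, ok := seen[cur]; ok {
			return path[i:] // cur was visited before: the cycle starts there
		}
		seen[cur] = len(path)
		path = append(path, cur)
		next, ok := waitsFor[cur]
		if !ok {
			return nil
		}
		cur = next
	}
}

func main() {
	// Edges taken from the operator log: the key waits for the value.
	waitsFor := map[string]string{
		"kit-cqqft4clsn2s73ed6uh0": "kit-cqqft44lsn2s73ed6ug0",
		"kit-cqqft44lsn2s73ed6ug0": "kit-cqqft4clsn2s73ed6ugg",
		"kit-cqqft4clsn2s73ed6ugg": "kit-cqqft44lsn2s73ed6ug0",
	}
	fmt.Println(findCycle(waitsFor, "kit-cqqft4clsn2s73ed6uh0"))
}
```

With these edges, the builds ending in `ug0` and `ugg` wait on each other, and the build ending in `uh0` is stuck behind them.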

Relevant log output

No response

Camel K version

2.4.0

lsergio commented 1 month ago

Adding the trait:

    builder:
      incrementalImageBuild: false

to each Integration or to the IntegrationPlatform didn't change the results.

lsergio commented 1 month ago

The same scenario with Camel K 2.3.3 worked as expected. The builds ran sequentially.

lsergio commented 1 month ago

Trying to understand the code, I guess this is related to how this method is implemented.

BuilderDependencies() will load the dependencies from the builder task in each build. Taking one of my builds as an example, I see the list:

      dependencies:
      - camel:quartz
      - mvn:org.apache.camel.k:camel-k-cron
      - mvn:org.apache.camel.k:camel-k-runtime
      - mvn:org.apache.camel.quarkus:camel-quarkus-yaml-dsl

These strings do not take the jar version into account. If two builds use the same Camel components but different camel-k-runtime versions, HasMatchingBuild will behave as if the dependencies were the same, even though they are not.
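To illustrate the concern in this comment, here is a small sketch under stated assumptions. The `build` struct and `sameDependencies` function are hypothetical stand-ins, not the actual Camel K types or the real matching logic, and the runtime labels are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// build is a hypothetical, trimmed-down stand-in for a Camel K Build.
type build struct {
	name           string
	runtimeVersion string // made-up label, not read by sameDependencies
	dependencies   []string
}

// sameDependencies compares only the dependency strings, ignoring order.
// The runtime version is deliberately not part of the comparison.
func sameDependencies(a, b build) bool {
	if len(a.dependencies) != len(b.dependencies) {
		return false
	}
	x := append([]string(nil), a.dependencies...)
	y := append([]string(nil), b.dependencies...)
	sort.Strings(x)
	sort.Strings(y)
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

func main() {
	deps := []string{
		"camel:quartz",
		"mvn:org.apache.camel.k:camel-k-cron",
		"mvn:org.apache.camel.k:camel-k-runtime",
		"mvn:org.apache.camel.quarkus:camel-quarkus-yaml-dsl",
	}
	a := build{name: "build-a", runtimeVersion: "runtime-A", dependencies: deps}
	b := build{name: "build-b", runtimeVersion: "runtime-B", dependencies: deps}
	// Reports a match although the two builds target different runtimes.
	fmt.Println(a.runtimeVersion, b.runtimeVersion, sameDependencies(a, b))
}
```

Because the strings carry no version, a comparison like this cannot tell the two builds apart.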

lsergio commented 1 month ago

This difference between 2.3.x and 2.4.0 is due to https://github.com/apache/camel-k/pull/5669. Indeed, changing the build order strategy to sequential in 2.4.0 fixes the issue and replicates the 2.3.x behavior.

squakez commented 1 month ago

Thanks for reporting. Yes, apparently the dependencies strategy has a problem. We need to fix it, but in the meantime, the workaround is to revert the build order strategy.

squakez commented 1 month ago

BTW, feel free to work on its resolution if you have the availability to.

lsergio commented 1 month ago

After more troubleshooting, I figured out that the problem happens if 2 builds have exactly the same dependencies and have been created within the same second. This condition was supposed to give priority to whatever build was created first, but as the creationTimestamp precision is down to the second only, we may have equal timestamps and no build will be prioritized.

I think that in this case we might prioritize based on another criterion, like the build name, which is unique.

@squakez I'll try to come up with a solution and submit a PR. If this gets more complicated than I anticipated, I'll let you know so we can unassign the issue.
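A minimal sketch of that idea, assuming a simplified `build` struct and a hypothetical `hasPriority` helper (neither is the operator's actual code, and the timestamp is made up for illustration):

```go
package main

import (
	"fmt"
	"time"
)

// build is a hypothetical stand-in holding only the fields used for ordering.
type build struct {
	name    string
	created time.Time // has second precision when it comes from creationTimestamp
}

// hasPriority reports whether a should be scheduled before b.
// The older build wins; on equal timestamps the unique name breaks the tie,
// so exactly one of two distinct builds always has priority.
func hasPriority(a, b build) bool {
	if !a.created.Equal(b.created) {
		return a.created.Before(b.created)
	}
	return a.name < b.name
}

func main() {
	// Made-up timestamp shared by both builds, mimicking creation in the same second.
	ts := time.Date(2024, 8, 8, 17, 23, 28, 0, time.UTC)
	x := build{name: "kit-cqqft44lsn2s73ed6ug0", created: ts}
	y := build{name: "kit-cqqft4clsn2s73ed6ugg", created: ts}
	fmt.Println(hasPriority(x, y), hasPriority(y, x))
}
```

Since the comparison is a strict total order over distinct names, two builds can no longer each conclude that the other one goes first.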

squakez commented 1 month ago

Excellent, thanks. Sure, just let me know how it goes.

lsergio commented 1 month ago

The scenario was a bit trickier. I opened a draft PR for review, but I still need to build the operator and test it on a K8s cluster. I'll do it in the next few days.