keycloak / keycloak


Build of 24.0.4 fails on NFS when deleting lib/quarkus #29758

Open natechols opened 2 months ago

natechols commented 2 months ago

Area

dist/quarkus

Describe the bug

I am trying to run bin/kc.sh build in Keycloak 24.0.4 on various NFS volumes on Linux, and it consistently fails trying to delete the directory lib/quarkus, due to the presence of .nfsXXXX... files that usually indicate another process using the same directory. This was reproducible on multiple nodes and filesystems including in fresh directories - there's definitely nothing else using that directory, so we suspect there's a race condition somewhere in the build logic. Once this error occurs, subsequent attempts to run the build also fail due to the missing files in that directory, since the build already wiped them out without a backup. (I have observed the latter failure before, starting with Keycloak 17, but in that case the root cause was a system glitch - it still broke the Keycloak install though.)
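For context, `.nfsXXXX` files are how the Linux NFS client handles a file that is unlinked while some process still holds it open: the file is renamed to a hidden placeholder instead of being removed, and the directory cannot be emptied until the handle is closed. The following is a minimal shell sketch of that mechanism and of a check for leftover placeholders. Both function names are hypothetical, and nothing here is taken from the Keycloak or Quarkus code.

```shell
# Hypothetical helpers for illustration; not part of Keycloak or Quarkus.

# Print any NFS placeholder files (.nfs*) sitting directly inside a directory.
list_nfs_placeholders() {
    find "$1" -maxdepth 1 -name '.nfs*' -print
}

# Delete a file while a handle on it is still open, then list what is left.
# Pass a directory on the filesystem you want to test (default: current dir).
demo_delete_while_open() {
    base=${1:-.}
    dir=$(mktemp -d "$base/nfsdemo.XXXXXX") || return 1
    echo data > "$dir/file"
    exec 3< "$dir/file"      # hold a read handle open on fd 3
    rm "$dir/file"           # unlink while the handle is still open
    ls -A "$dir"             # NFS: typically shows a .nfsXXXX name; local disks: nothing
    exec 3<&-                # close the handle so the client can drop the placeholder
    # The placeholder may take a moment to disappear, so tolerate a failed rmdir.
    rmdir "$dir" 2>/dev/null || echo "could not remove $dir yet" >&2
}
```

Running `demo_delete_while_open /nfs/some/dir` on an NFS mount and on a local disk should show the difference. `list_nfs_placeholders keycloak-24.0.4/lib/quarkus` can be run right after a failed build to see whether a placeholder is still present.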

Version

24.0.4

Regression

Expected behavior

  1. Build command succeeds on NFS volumes
  2. Failure of build command does not leave installation in an unusable state
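Until the build itself is made safer, the second expectation can be approximated from outside. The sketch below snapshots `lib/quarkus` before calling `kc.sh build` and copies the snapshot back if the build exits with an error. The wrapper name `build_with_backup` is hypothetical, and the sketch assumes the snapshot directory created by `mktemp` is on a local filesystem.

```shell
# Hypothetical wrapper; not part of the Keycloak distribution.
# Usage: build_with_backup /path/to/keycloak-24.0.4
build_with_backup() {
    kc_home=$1
    backup=$(mktemp -d) || return 1
    # Snapshot the current contents of lib/quarkus before the build touches them.
    cp -a "$kc_home/lib/quarkus/." "$backup/" || return 1
    if "$kc_home/bin/kc.sh" build; then
        rm -rf "$backup"
        return 0
    fi
    echo "build failed; restoring lib/quarkus from $backup" >&2
    # Removal may itself fail on NFS if a .nfs* placeholder remains, so ignore
    # that error and copy the snapshot back over whatever is left.
    rm -rf "$kc_home/lib/quarkus" 2>/dev/null
    mkdir -p "$kc_home/lib/quarkus"
    cp -a "$backup/." "$kc_home/lib/quarkus/"
    return 1
}
```

The snapshot is deliberately kept after a failed build so that it can be inspected or restored again by hand.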

Actual behavior

First error output (with sanitized paths):

ERROR: io.quarkus.builder.BuildException: Build failure: Build failed due to errors
        [error]: Build step io.quarkus.deployment.pkg.steps.JarResultBuildStep#buildRunnerJar threw an exception: java.lang.IllegalStateException: java.nio.file.FileSystemException: /nfs/home/nat/keycloak/keycloak-24.0.4/lib/quarkus/.nfs804a7a64040362c3000c81f2: Device or resource busy

After re-running:

Exception in thread "main" java.nio.file.NoSuchFileException: /nfs/home/nat/keycloak/keycloak-24.0.4/lib/quarkus/quarkus-application.dat

How to Reproduce?

Aside from setting up the Java 17 environment from Oracle, the only command I'm running is this:

rm -rf keycloak-24.0.4* && wget https://github.com/keycloak/keycloak/releases/download/24.0.4/keycloak-24.0.4.tar.gz && tar zxf keycloak-24.0.4.tar.gz && ./keycloak-24.0.4/bin/kc.sh build

This works on local filesystems, but has failed on every NFS mount that I've tried.

Anything else?

I tried building on a local volume and relocating the result to NFS, but it looks like the change in path is enough to trigger a rebuild.

shawkins commented 2 months ago

> This works on local filesystems, but has failed on every NFS mount that I've tried.

It doesn't seem like anything Keycloak-specific is coming into play here. Can you reproduce this behavior with a simple Quarkus app?

cc @geoand

> Failure of build command does not leave installation in an unusable state

Does using "./kc.sh -Dquarkus.launch.rebuild=true --help" get it back to a usable state?

natechols commented 2 months ago

> Can you reproduce this behavior with a simple quarkus app?

Is there a downloadable example I can use somewhere or do I need to follow the tutorial (https://quarkus.io/guides/getting-started)?

> Does using "./kc.sh -Dquarkus.launch.rebuild=true --help" get it back to a usable state?

Nope:

./bin/kc.sh -Dquarkus.launch.rebuild=true --help
The DelayedHandler was closed before any children handlers were configured. Messages will be written to stderr.
2024-06-05 10:08:27,317 TRACE [java.io.serialization] (main) Builtin factory: null -> new: null

2024-06-05 10:08:28,305 TRACE [java.io.serialization] (main) Builtin factory: null -> new: null

Exception in thread "main" java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at io.quarkus.bootstrap.runner.QuarkusEntryPoint.doReaugment(QuarkusEntryPoint.java:90)
        at io.quarkus.bootstrap.runner.QuarkusEntryPoint.doRun(QuarkusEntryPoint.java:49)
        at io.quarkus.bootstrap.runner.QuarkusEntryPoint.main(QuarkusEntryPoint.java:33)
Caused by: java.nio.file.NoSuchFileException: /home/nat/bugs/keycloak-24.0.4/lib/quarkus/build-system.properties
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
        at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)
        at java.base/java.nio.file.Files.newByteChannel(Files.java:380)
        at java.base/java.nio.file.Files.newByteChannel(Files.java:432)
        at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:422)
        at java.base/java.nio.file.Files.newInputStream(Files.java:160)
        at io.quarkus.deployment.mutability.ReaugmentTask.main(ReaugmentTask.java:35)
        ... 7 more

shawkins commented 2 months ago

> Is there a downloadable example I can use somewhere or do I need to follow the tutorial

Following one of the getting-started examples should be fine; it should implicitly augment on start. To fully mimic what Keycloak is doing, you would use https://quarkus.io/guides/reaugmentation. You just need to include the MySQL extension in whatever example you use, in order to use the prompts provided.

natechols commented 1 month ago

I am not a Java expert and am working on this in parallel with other tasks, so it's going to take me a while to debug any further. Related question: is there any way to deploy Keycloak in a pre-built state that will skip this step entirely? What conditions force a rebuild?

shawkins commented 1 month ago

@natechols if you are failing at the build command, then you are unfortunately stuck. That is a prerequisite to running start --optimized, which skips the build. Almost anything else you do will implicitly check whether a build is needed and perform one.
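The relationship between the two commands can be sketched as a small wrapper: run `kc.sh build` once, and only if it succeeds hand over to `kc.sh start --optimized`. The function name `start_optimized` is hypothetical. The sketch does not work around the NFS failure; it only makes the ordering explicit.

```shell
# Hypothetical wrapper; not part of the Keycloak distribution.
# Usage: start_optimized /path/to/keycloak-24.0.4 [extra start options...]
start_optimized() {
    kc_home=$1
    shift
    # The explicit build is the prerequisite for --optimized.
    if ! "$kc_home/bin/kc.sh" build; then
        echo "kc.sh build failed; not starting" >&2
        return 1
    fi
    # --optimized tells Keycloak to skip the build check on startup.
    "$kc_home/bin/kc.sh" start --optimized "$@"
}
```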

natechols commented 1 month ago

How do I view the DEBUG-level logging from Quarkus when running kc.sh build? It doesn't support the --log-level argument, and setting KEYCLOAK_LOGLEVEL=DEBUG (as suggested elsewhere) didn't have any effect.

shawkins commented 1 month ago

> How do I view the DEBUG-level logging from Quarkus when running kc.sh build? It doesn't support the --log-level argument, and setting KEYCLOAK_LOGLEVEL=DEBUG (as suggested elsewhere) didn't have any effect.

Sorry for the delay in replying. This is a nuance of how runtime options are excluded for the build command; however, log-level should be considered applicable. A workaround is to use an environment variable:

KC_LOG_LEVEL=debug ./kc.sh build
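
Debug output from a build can be long, so it may help to capture it to a file and pull out only the lines that mention the affected directory. A minimal sketch, assuming the `KC_LOG_LEVEL` workaround above; the function name and the search patterns are hypothetical choices, not anything Keycloak prescribes:

```shell
# Hypothetical helper; not part of the Keycloak distribution.
# Usage: build_with_debug_log /path/to/keycloak-24.0.4 build-debug.log
build_with_debug_log() {
    kc_home=$1
    log_file=$2
    # Capture stdout and stderr of a debug-level build.
    KC_LOG_LEVEL=debug "$kc_home/bin/kc.sh" build > "$log_file" 2>&1
    status=$?
    # Show only the lines that mention NFS placeholders or the lib/quarkus directory.
    grep -n -e '\.nfs' -e 'lib/quarkus' "$log_file" || true
    # Report the build's own exit status, not grep's.
    return "$status"
}
```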

shawkins commented 1 month ago

~priority-low

keycloak-github-bot[bot] commented 1 month ago

Due to the amount of issues reported by the community we are not able to prioritise resolving this issue at the moment.

If you are affected by this issue, upvote it by adding a :thumbsup: to the description. We would also welcome a contribution to fix the issue.