Open Sanne opened 1 month ago
Makes sense to me, though I wasn't present for these "long-ago problems". In any case +1 for consistency between native and JVM.
We would still be able to revert the behaviour, which has been useful to verify consistency at build time.
So essentially, flip the default in general, but override it in some integration tests that are known to not need it, just to be sure? Seems nice.
So essentially, flip the default in general, but override it in some integration tests that are known to not need it, just to be sure?
Exactly: some integration tests or when doing manual investigations. GraalVM would also consider to make the switch eventually.
currently the use of this driver triggers the use of
--allow-incomplete-classpath
but this has global impact,
Related upstream issue https://github.com/oracle/graal/issues/3929 asking for more granular control.
I've discussed this pending problem with @christianwimmer @jerboaa @maxandersen today and we came to the conclusion that perhaps native-image should default to the same semantics as hotspot: to have the --allow-incomplete-classpath set by default.
FWIW, --allow-incomplete-classpath
is already the default in GraalVM since GraalVM/Mandrel 22.1 (~2 years ago) and we countered it by passing --link-at-build-time
see https://github.com/quarkusio/quarkus/pull/24213
That said, in order to
Set --allow-incomplete-classpath by default in when Quarkus runs a native compilation
You essentially need to avoid passing --link-at-build-time
, which can be achieved through the NativeImageAllowIncompleteClasspathBuildItem
.
This would resolve the immediate issues with those JDBC drivers: we wouldn't need substitutions in MSSQL,
Why are these substitutions bad? They help GraalVM eliminate unused code, while still providing a nice error message to the users. I understand there is maintenance overhead, but I don't think they are bad.
the Oracle driver wouldn't be setting an untested global flag
The Oracle driver should already not be setting that flag :)
and possibly other libraries would benefit as well as it brings us semantics closer to hotspot.
That's not necessarily a good thing.
I would suggest we drop the --link-at-build-time
requirement for user code only (similar to build-time-initialization as discussed in https://github.com/quarkusio/quarkus/discussions/42682). Keeping the requirement for our extensions is a good thing IMHO.
Being notified that a class won't be reachable at run-time allows us to detect potential targets of improvement, e.g. we can most probably substitute code to avoid the broken dependency while improving dead code elimination.
Update: Another related issue https://github.com/quarkusio/quarkus/issues/2101
This would resolve the immediate issues with those JDBC drivers: we wouldn't need substitutions in MSSQL,
Why are these substitutions bad? They help GraalVM eliminate unused code, while still providing a nice error message to the users. I understand there is maintenance overhead, but I don't think they are bad.
Certain substitutions are problematic since those optional dependencies solve problems for users who actually do need them. For example some feature of a DB driver that has been cut out (with substitutions). In the case of allowing incomplete classpath, the feature would be available if the optional dependencies are there at build time. On the other hand, the feature wouldn't be there even if the optional deps existed at build time (status quo).
On the other hand, the feature wouldn't be there even if the optional deps existed at build time (status quo).
That's a valid point. We could make such substitutions optional (only when the dependencies are absent) using onlyWith
. Of course this will only make sense if we observe some notable difference that justifies the substitutions in the first place.
Related upstream issue https://github.com/oracle/graal/issues/3929 asking for more granular control.
More granular control was the first thing that came to my mind when I read this issue
I would suggest we drop the --link-at-build-time requirement for user code only (similar to build-time-initialization as discussed in https://github.com/quarkusio/quarkus/discussions/42682). Keeping the requirement for our extensions is a good thing IMHO.
+1
On the other hand, the feature wouldn't be there even if the optional deps existed at build time (status quo).
That's a valid point. We could make such substitutions optional (only when the dependencies are absent) using
onlyWith
. Of course this will only make sense if we observe some notable difference that justifies the substitutions in the first place.
There is an even more advanced way of handling cases like this: We can generate a substitution class at Quarkus build time :)
Description
The GraalVM
native-image
tool can take an option--allow-incomplete-classpath
which changes how it deals with unresolved symbols found during reachability analisys. Normally, when running an application on a regular JVM, code can refer to optional dependencies and the application can still boot; if the code is triggered at runtime and refers to a class which can't be loaded, a CNFE is thrown - but in native-image, when this flag is not set, resolution is validated at build time.In the past we did like the native-image approach, for several reasons:
--allow-incomplete-classpath
was having some additional odd side-effects, such as making reachability reports more confusing (this was most likely a bug).In hotspot it wouldn't have been reasonable to have a similar validation, as it needs to support non-closed-world use cases. Essentially I (personally) considered it a "nice feature" of native-image, however it comes with problems: some libraries actually do rely on optional dependencies and don't use a clean decoupling mechanism like
ServiceLoader
. I have been hoping that the native-image approach would encourage library maintainers to use theServiceLoader
mechanism, but it's been some years and we still have problems with at least two notable dependencies:Both of these libraries have several optional dependencies; in the case of the MSSQL driver we patch this with subsitutions but this causes problems for people who actually want to use the optional capabilities. In the case of the Oracle driver we can't do much as it's closed source - currently the use of this driver triggers the use of
--allow-incomplete-classpath
but this has global impact, so it raises quality concerns as this mode is less so tested in combination with other extensions.I've discussed this pending problem with @christianwimmer @jerboaa @maxandersen today and we came to the conclusion that perhaps native-image should default to the same semantics as hotspot: to have the
--allow-incomplete-classpath
set by default. The "odd side effects" that we had observed in early days are very likely no longer applicable, as it's been years since that time.The proposal here is to set this flag by default in Quarkus, to verify that it doesn't break any other extension and has no other surprises; based on our feedback, native-image might also flip its default.
This would resolve the immediate issues with those JDBC drivers: we wouldn't need substitutions in MSSQL, the Oracle driver wouldn't be setting an untested global flag, and possibly other libraries would benefit as well as it brings us semantics closer to hotspot.
We would still be able to revert the behaviour, which has been useful to verify consistency at build time.
Implementation ideas
--allow-incomplete-classpath
by default in when Quarkus runs a native compilation