eclipse-jdt / eclipse.jdt.core

Eclipse Public License 2.0
149 stars 114 forks source link

ecj spams the console with irrelevant zip exceptions #1578

Open stephan-herrmann opened 7 months ago

stephan-herrmann commented 7 months ago

Caused by this change from #239:

When compiling on JRE 1.8, the compiler spits out tons of exceptions like this:

Failed to init Classpath for jar file /somepath/j2sdk/linux/1.8.0_111/jre/lib/fontconfig.SuSE.10.properties.src
java.util.zip.ZipException: zip END header not found
    at
java.base/java.util.zip.ZipFile$Source.findEND(ZipFile.java:1469)
    at
java.base/java.util.zip.ZipFile$Source.initCEN(ZipFile.java:1477)
    at
java.base/java.util.zip.ZipFile$Source.<init>(ZipFile.java:1315)
    at
java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1277)
    at
java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:243)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:172)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:186)
    at
org.eclipse.jdt.internal.compiler.batch.ClasspathJar.initialize(ClasspathJar.java:204)
    at
org.eclipse.jdt.internal.compiler.batch.FileSystem.<init>(FileSystem.java:235)
    at
org.eclipse.jdt.internal.compiler.batch.Main.getLibraryAccess(Main.java:3504)
    at
org.eclipse.jdt.internal.compiler.batch.Main.performCompilation(Main.java:4750)
    at
org.eclipse.jdt.internal.compiler.batch.Main.compile(Main.java:1802)
    at
org.eclipse.tycho.compiler.jdt.JDTCompiler.compileInProcess(JDTCompiler.java:370)

It does so for every file in that jre/lib directory.

I admit, that the strategy to detect zip files is a bit crude, but it looks like no reliable positive specification of zip file names was available when that code was written.

Hence the ZipException is perfectly legal in this situation, and silently catching it with an // ignore comment was by full intention.

Can we please get rid of this particular console logging?

laeubi commented 7 months ago

Should one not better fix that a step before and look if the file has a zip magic header at all ?

stephan-herrmann commented 7 months ago

Should one not better fix that a step before and look if the file has a zip magic header at all ?

If that can be done without performance impact (accessing the file twice?), be my guest! :)

Just the combination of expected exception and console logging is wrong, to be resolved one way or other.

laeubi commented 7 months ago

Maybe you can set a breakpoint in ClasspathJar constructor to see where this originates, but usually this is printed if a jar has a classpath manifest header that is invalid/not found so it is not that useless at all, so I assume it comes from org.eclipse.jdt.internal.compiler.batch.ClasspathJar.fetchLinkedJars(ClasspathSectionProblemReporter) where it already checked if it is an existing file.

The problem is that here "something" seems to reference resource files as class-path, so a cheap check might be to check if it is actually a file ending in jar and do further investigation if it is at least a zip file, not sure if reading two bytes from a file is assumed a large performance problem here.

stephan-herrmann commented 7 months ago

My link in the description already points to the exact code location in FileSystem.

I think I saw the list of files collected in Main.getLibraryFiles(), where a preliminary check by file extension is already performed. That's the location that I commented above saying

I admit, that the strategy to detect zip files is a bit crude, but it looks like no reliable positive specification of zip file names was available when that code was written.

laeubi commented 7 months ago

Hm... then I'm maybe on the wrong track here but it then seems a bit strange that this is reported as a module...

stephan-herrmann commented 7 months ago

... but it then seems a bit strange that this is reported as a module...

Who speaks of module?

laeubi commented 7 months ago

Is it possible that the stacktrace is not from the latest ecj compiler?

stephan-herrmann commented 7 months ago

Is it possible that the stacktrace is not from the latest ecj compiler?

Yes, line number may likely differ from HEAD.

jukzi commented 7 months ago

@iloveeclipse i suggest to

  1. do not log stacktrace for ZipException - only log filename
  2. be totally silent if the filename does not end with ".jar"
iloveeclipse commented 7 months ago

When compiling on JRE 1.8, the compiler spits out tons of exceptions It does so for every file in that jre/lib directory

Stephan, could you please explain what do you mean by "compiling on JRE 1.8"? I believe we require Java 17 minimum in ecj? Do you mean by specifying 1.8 JRE as dependency maybe? How exactly do you specify that? Or are you running older ecj version on 1.8? In that case, updating to new ecj should fix the problem without any change, as 1.8 is not supported platform to run ecj on anymore.

be totally silent if the filename does not end with ".jar"

I'm not a friend of hiding exceptions, and assuming only jar files are "valid" libraries is also not right thing to do, there are multiple file formats that can be used as "libraries".

do not log stacktrace for ZipException - only log filename

I would be fine with that, assuming we produce error messages distinct enough to identify the exact place in compiler code that produces it.

jukzi commented 7 months ago

I'm not a friend of hiding exceptions, and assuming only jar files are "valid" libraries is also not right thing to do, there are multiple file formats that can be used as "libraries".

May be it's more safe to assume that a "*.properties.src" file is not meant to contain class files? Then it could be excluded from logging. @stephan-herrmann is it only that file name that makes a problem?

laeubi commented 7 months ago

I agree with @iloveeclipse that hiding the exception does not help much, Exceptions are .. exceptional and indicate there is a problem in the handling of classpathes, they key to solve this would be

  1. latest ECJ so one can see where the exception exactly happens (code already recently changed/improved over time)
  2. Give a reference to the JVM to one can check why this file is on the classpath and what it contains (maybe ECJ should indeed handle the file somehow instead of ignore it)
stephan-herrmann commented 7 months ago

Sorry for bogus line numbers and imprecise description of the problem.

Here's a simple reproducer using ecj-4.30RC1.jar:

$ echo "public class X {}" > X.java
$ java -jar ecj-4.30RC1.jar -1.8 -bootclasspath /home/java/jdk1.8.0/jre/lib/rt.jar -endorseddirs  /home/java/jdk1.8.0/jre/lib X.java
openjdk 20 2023-03-21
OpenJDK Runtime Environment (build 20+36-2344)
OpenJDK 64-Bit Server VM (build 20+36-2344, mixed mode, sharing)
Failed to init Classpath for jar file /home/java/jdk1.8.0/jre/lib/fontconfig.RedHat.5.properties.src
java.util.zip.ZipException: zip END header not found
        at java.base/java.util.zip.ZipFile$Source.findEND(ZipFile.java:1489)
        at java.base/java.util.zip.ZipFile$Source.initCEN(ZipFile.java:1497)
        at java.base/java.util.zip.ZipFile$Source.<init>(ZipFile.java:1335)
        at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1298)
        at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:716)
        at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:243)
        at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:172)
        at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:186)
        at org.eclipse.jdt.internal.compiler.batch.ClasspathJar.initialize(ClasspathJar.java:204)
        at org.eclipse.jdt.internal.compiler.batch.FileSystem.<init>(FileSystem.java:235)
        at org.eclipse.jdt.internal.compiler.batch.Main.getLibraryAccess(Main.java:3504)
        at org.eclipse.jdt.internal.compiler.batch.Main.performCompilation(Main.java:4750)
        at org.eclipse.jdt.internal.compiler.batch.Main.compile(Main.java:1802)
        at org.eclipse.jdt.internal.compiler.batch.Main.main(Main.java:1521)
Failed to init Classpath for jar file /home/java/jdk1.8.0/jre/lib/hijrah-config-umalqura.properties
java.util.zip.ZipException: zip END header not found
...
many more of those
...

In the original case I believe it was tycho adding either -endorseddirs or -extdirs. That perhaps was a result of using toolchains, but that shouldn't matter.

File extensions encountered: .src, .properties, .bfc, .data, .dat, .txt, .ja.

stephan-herrmann commented 6 months ago

Exceptions are .. exceptional and indicate there is a problem in the handling of classpathes

These exceptions are expected. Catching those is necessary until someone comes up with a reliable replacement that detects zipfiles without trying to access them. At least for -endorseddirs and -extdirs there is no use in logging those exceptions, pretty please.

laeubi commented 6 months ago

Exceptions are .. exceptional and indicate there is a problem in the handling of classpathes At least for -endorseddirs and -extdirs there is no use in logging those exceptions, pretty please.

I'm not sure if it is feasible to assume that these directories contain anything that is not named .jar but is a jar for the classpath ... so I think the simplest solution here would be to filter all files not ending with .jar this also would prevent to access these files at all.

If one things that not enough, one could have a "blacklist" for -endorseddirs and -extdirs that e.g. do not try the knwon not jar extensions you listed above...

stephan-herrmann commented 6 months ago

I'm not sure if it is feasible to assume ...

The compiler should generally not "assume" anything other than what's specified in JLS. Unfortunately I'm not aware of a specification that tells us that all archives of .class files on the classpath must end with .jar.

Other than that it's impossible to know if any users out in the huge Java universe rely on the fact that ecj is able to read classes from an oddly named archive. Even if such use cases doesn't make sense to you and me, I don't want to take responsibility for causing potential breakage out there for little benefit on our end.

laeubi commented 6 months ago

Of course not all jars must end with jar, that is correct,If I look for example at javac option for extdirs it says:

Each JAR file in the specified directories is searched for class files. All JAR files found become part of the class path.

So it seems mandatory to check each file in the directory if it is a jar file before passing it further down as classpath entry to any internal code in the compiler.

Of course not all jars must end with jar, that is correct, so either the part that parses the -endorseddirs and -extdirs need to enhanced or one might think about making the classpath code more "smart" e.g. lets assume I'm that "power user" and have a jar that is corrupted and I name it my.lib now I put it on the classpath and get no message at all but of course something does not work, so I would want the compiler to tell me something is wrong with it (that's what the exception currently offers, a way to tell: Hey there might be something wrong but I don't know for sure) because otherwise I might be frustrated and think its a bug in JDT.

If we now want to extend this to that any file (not only jars) can be passed as classpath and this is assumed to be valid, then the I see the following options:

  1. assume the best - try reading as a jar and if that fails read the first bytes of the file and if they indicate a zip/jar header report it as an error, otherwise ignore the exception
  2. assume the worst - never try to read it as jar if we not before have read the first bytes and they indicate a zip/jar header

but always assume (and you said compiler should never assume anything) that any exception means we can ignore that and not report anything to the user just to silence such cases where non jar/class files are passed as classpath seems wrong.

jukzi commented 5 months ago

Did anybody try how javac treats JARs that have non normal file names - i.e. not named ".zip" or ".jar"?

laeubi commented 5 months ago

Did anybody try how javac treats JARs that have non normal file names - i.e. not named ".zip" or ".jar"?

As mentioned above javac documents on the extdir:

Each JAR file in the specified directories is searched for class files. All JAR files found become part of the class path

And Jar Specification says

A JAR file is essentially a zip file that contains an optional META-INF directory. [...]. There is no restriction on the name of a JAR file, it can be any legal file name on a particular platform.

So if one combines those two it can be any name but must be a zip and maybe contains a manifest...

jukzi commented 5 months ago

search for jar/zip only [1]

https://github.com/openjdk/jdk/blob/f2a4ed680b54e644ae83f8898a4e66a0c45c9cf4/test/langtools/tools/javac/Paths/MineField.java#L49

javac -endorseddirs "C:\Program Files\Java\jdk1.8.0_202\jre\lib" -bootclasspath c:\ -source 8 -target 8 A.java does not output any warning, see also BatchCompilerTest.test017b

wilx commented 3 months ago

Exceptions are .. exceptional and indicate there is a problem in the handling of classpathes

These exceptions are expected. Catching those is necessary until someone comes up with a reliable replacement that detects zipfiles without trying to access them. At least for -endorseddirs and -extdirs there is no use in logging those exceptions, pretty please.

Can't we use the magic numbers? https://en.wikipedia.org/wiki/ZIP_(file_format) All we need is first 4 bytes.