codehaus-plexus / plexus-archiver

https://codehaus-plexus.github.io/plexus-archiver/
Apache License 2.0
44 stars 48 forks source link

JarToolModularJarArchiver uses the manifest's main class when, in some cases, perhaps it shouldn't #310

Open ljnelson opened 11 months ago

ljnelson commented 11 months ago

For reference: https://github.com/codehaus-plexus/plexus-archiver/blob/plexus-archiver-4.9.0/src/main/java/org/codehaus/plexus/archiver/jar/JarToolModularJarArchiver.java#L210

Scenario: we had a Maven build of a non-modular project whose pom.xml features a <mainClass> element in the <manifest> portion of the archive configuration. Let's say the main class in question is com.foo.other.Main.

Now we add a module-info.java to the project, thus making it a modular project. Let's say this project exports com.foo. (Clearly it cannot export or contain com.foo.other because then you'd have a package split.)

Even if the project housing com.foo.other.Main is modular (in the, say, com.foo.other module), and even if the new module-info.java requires com.foo.other, when the Plexus archiver goes to update the jar file into a modular jar file, the jar --update operation will fail:

jar: Package com.foo.other missing from ModulePackages class file attribute

As far as I can tell, this is because the manifest's main class is used as if it were this module's main class, i.e. as if it were "owned" by this module.

My guess is the jar tool's update operation is adding in the ModulePackages classfile attribute into the existing module-info.class, as it should, computed from the known packages of the current project's module-info.class. You can see that here: https://github.com/openjdk/jdk/blob/a1e7a302c8a3d7a1069659653042476b20becabe/src/java.base/share/classes/jdk/internal/module/ModuleInfo.java#L306-L334

This low-level jar error is the only indication that the main class specified in the pom.xml is incorrect in some way. As I mentioned above, I think that it is incorrect because, I guess, a main class whose name will be stored in a module-info.class file, must (again, a guess) be "from" the module in question. In this example, it is not.

Perhaps—and I'm not an expert here—if the JarToolModularJarArchiver realizes that there is no module main class set, it should not fall back on the manifest main class, or at least should see if the manifest main class is from an "OK" package.

I believe it is incorrect for a module descriptor's binary format to encode a main class that does not belong to the module, but as I said I'm not really an expert here.

Thanks for all the work you do to support Maven and the open source Java community. ❤️

plamentotev commented 11 months ago

Hi,

Would you share more details about the specific use case? Is my understanding correct that you want to have mainClass set in the manifest, but not in the module descriptor?

If I recall correctly (but some time has passed so I could be wrong), the decision to use the mainClass from the manifest was based on that it works out of the box and that both the manifest and the descriptor would point to the same main class.

ljnelson commented 11 months ago

The "use case" is an example project: you, the user, receive an (executable-jar-producing) example project that is not modular. You discover that some of its dependencies are modular, so you think, ah, I will go Full Java Module System™! It is The Future™! So you add a module-info.java, and now you can't execute your project anymore.

(That's the user experience.)

So yes, right before this user adds a module-info.java, she already has a Main-Class set in the manifest, and her project works. After she adds a module-info.java, her project no longer works.

Again, this is because the Main-Class identifies a class from another jar. In such a case, you cannot embed its name in the ModuleMainClass attribute in the module-info.class class file, because the class in question does not belong to the module identified by the module descriptor.

plamentotev commented 11 months ago

Thanks for the details, makes sense.

I wonder if we can do such check without reading the module-info.class file. My concern with reading file is that they add maintenance cost as we need to update plexus-archiver anytime the class format is updated.