marschall / memoryfilesystem

An in memory implementation of a JSR-203 file system
282 stars 36 forks source link

Support URLs #142

Closed ascopes closed 1 year ago

ascopes commented 1 year ago

Hi!

Noticed the library doesn't currently support URL handling. I am working on a project where this library would be ideal, but unfortunately I need to be able to use URLs for classloading (since it will be providing a framework to test compilation of code via javac with annotation processors, etc, and some annotation processors like manifold unfortunately make heavy use of URL class loaders internally to work correctly).

I've seen that the URL support is listed as being "unsupported" in the README (although the actual link to the documentation explaining why this is unsupported is dead, so I can't read through that).

If I was willing to work on implementing this, would it be a feature you'd consider merging? If so, I am happy to try and find time to work on this and test it. Other libs like JIMFS (which I currently use for this project) support this feature, so it would be a case of studying how the default file manager in Java, and libraries like JIMFS achieve integrating with URL providers to work out how to deal with this for this project.

Thanks :-)

marschall commented 1 year ago

Internet Archive to the rescue: https://web.archive.org/web/20120424174653/http://www.unicon.net/node/776

In short you need to either:

both of these affect all code running in the same JVM. Are either of these an option for you?

Can you go into a bit more detail into what you're ultimately trying to achieve? If it's just a class loader have you had a look at path-classloader? If you're looking for in-memory compilation have you had a look at path-file-managet?

ascopes commented 1 year ago

Hey, thanks for the reply. Only just seen this, sorry.

So my project (https://github.com/ascopes/java-compiler-testing) provides a JSR-199 JavaFileManager implementation internally that likely covers a lot of similar stuff to your project. The main difference is that the implementation exposes other details like the file tree hierarchy as to enable performing assertions and other bits and pieces on the results of Javac compilations. This defaults to running on a file system in memory to enable integration testing code for stuff like annotation processors, compiler plugins, code generation, etc.

I had previously made a ClassLoader as part of the file manager impl that purely handled class loading based upon the Path API, but came across issues in acceptance testing where some annotation processor libraries make assumptions about the classloader in use during compilation being a subclass of URLClassLoader. This shouldn't really be an assumption people are making, but the classloader itself is a URLClassLoader by default when running javac or ecj, so I can see why this assumption was made.

The issue with this is, of course, that to support using a URLClassLoader to more closely mimic how javac works normally, I need to be able to use URLs on the in memory file system.

Right now, I am using jimfs for this, as it provides URL bindings to the FileSystem SPI they define. My main problem is that jimfs currently is using a version of guava that has CVEs associated with it, and jimfs hasn't had a release for over a year now... which is why I am reassessing whether a different lib can provide the same functionality I need.

That being said... I am trying to work out where JIMFS actually provides this functionality. It seems to be that the only thing actually provided is the scheme binding itself on the FileSystem provider. Now starting to wonder if this is not an issue for me at all. JIMFS definitely is resolving URLs correctly for me here: https://github.com/ascopes/java-compiler-testing/blob/main/java-compiler-testing/src/main/java/io/github/ascopes/jct/containers/impl/PackageContainerGroupUrlClassLoader.java#L48. If needed, I guess I could provide the URL stream handler within my lib using the second option like you mentioned, if not. First option probably wouldn't be ideal as tests are run from junit-jupiter, so it isn't something I could provide transparently.

Edit: found it. Looks like https://sourcegraph.com/github.com/google/jimfs/-/blob/jimfs/src/main/java/com/google/common/jimfs/Handler.java?L56 provides this mechanism, and it does what you described in the first option you gave. Seems this works dynamically at runtime though by just appending the system property. Does this sound like something that would cause issues if it were added by default? The only real issue I can think of from doing this is that OSGi-based applications might encounter some issues with visibility, but that would be the case anyway I would have thought since the FileSystem objects that are registered appear to be global to the running JVM as well.

What do you think?

marschall commented 1 year ago

I'll look into it and report back.

marschall commented 1 year ago

I released 2.6.0 which adds URL support and did some testing. The changes are here https://github.com/ascopes/java-compiler-testing/pull/450. It's unfortunately not a drop-in replacement. There are some subtle semantic differences.

ascopes commented 1 year ago

Brilliant, thanks for doing all of that @marschall, greatly appreciated!

Will take a look on my side as well and get back to you.

ascopes commented 1 year ago

@marschall on my side, most of it is working now. Got one issue which seems to be that ClassLoaders are getting a bit confused with the URLs being provided. By the looks, URLs that represent a directory are being expected to be passed back with a trailing backslash to denote that they are directories. It doesn't look like MemoryFileSystem is doing that implicitly though, which is leading the JDK classloader logic to consider the URLs being provided as being for a JAR rather than a file system tree.

Code that seems to be triggering the issue on the JDK side is at https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/share/classes/jdk/internal/loader/URLClassPath.java#L471.

The result on the ClassLoader level appears to be

java.lang.ClassNotFoundException: org.example.User
    at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:588)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
    at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321)

Is this something you might be able to double check at some point? I don't understand the full ins and outs of how this works internally in MemoryFileSystem unfortunately.

A build that reproduces this issue can be found at https://github.com/ascopes/java-compiler-testing/pull/451 if that is any additional help. Other than this small issue, and the fix I put up at #144 , I think everything is working. Thanks for the work that you put into this!