dirs-dev / directories-jvm

a tiny library that provides config/cache/data paths, following the respective conventions on Linux, macOS, BSD and Windows
https://dirs.dev
Mozilla Public License 2.0

Deal with Windows support being an ongoing shit show #49

Open soc opened 3 years ago

soc commented 3 years ago

Many contributors have spent heroic efforts to keep Windows support working, for which I'm greatly thankful.

Though it appears to me that Windows support keeps breaking time and time again – perhaps it's time to think whether a different approach could be less painful and more respectful to the time & efforts of contributors?

I'm short on actual ideas though:

I'm open to suggestions, thoughts, ideas, etc. – what do people (@alexarchambault, @eatkins, @fthomas, @phongngtuan, ...) think?

phongngtuan commented 3 years ago

pardon my ignorance but what is the issue with System.getenv("APPDATA") and System.getenv("LOCALAPPDATA") ?

alexarchambault commented 3 years ago

I think embedding a tiny JNI library could work too, as we're only calling simple system calls. I had started toying with that some time ago. IIRC, the main hurdle I ran into was charset conversions, from what SHGetKnownFolderPath gives us, to something JNI accepts as a string.

alexarchambault commented 3 years ago

@phongngtuan It's discussed below https://github.com/dirs-dev/directories-jvm/issues/26#issuecomment-570288952, these environment variables are not always in sync with the system call, which returns the true values.

alexarchambault commented 3 years ago

About the charset issue I mentioned above, maybe wcstombs and NewString would just work...

eatkins commented 3 years ago

I feel strongly that precompiled native code with fallback to shelling out is the way to go but I also understand why @soc has skepticism. My experience is that while JNI is definitely a pain and does come with risks, it can be incorporated successfully. For sbt, I wrote a library that speeds up recursive directory listing by bundling pre-compiled code for some platforms. If the library is unable to load the native code for any reason it falls back to jvm built-ins. This code shipped with sbt 1.3.0 and there has only been one issue that came up in pre-release with the windows code: https://github.com/sbt/sbt/issues/4690. Perhaps unsurprisingly given @alexarchambault's comment above, it was also related to Windows' use of wide characters by default: https://github.com/swoval/swoval/commit/e425e4c867ce24247da6a53365073849aa44d8d7 and was basically resolved by switching from NewStringUTF to NewString.
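A minimal sketch of the "load native code, otherwise fall back" pattern described above. The class name, method names, and the library name `dirs_native_example` are hypothetical and not taken from swoval or sbt; only `System.loadLibrary` and the exceptions it throws are real JDK API.

```java
import java.util.function.Supplier;

/** Hypothetical helper: prefer a native implementation, fall back to a pure-JVM one. */
final class NativeOrFallback {

    /** Returns true if the named library could be loaded, false on a link or security failure. */
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // Missing library, wrong architecture, or loading forbidden: report failure.
            return false;
        }
    }

    /** Uses the native supplier only when loading succeeded, otherwise the JVM fallback. */
    static <T> T resolve(String libraryName, Supplier<T> nativeImpl, Supplier<T> jvmFallback) {
        return tryLoad(libraryName) ? nativeImpl.get() : jvmFallback.get();
    }

    public static void main(String[] args) {
        // "dirs_native_example" is a made-up name, so the fallback path is the likely outcome.
        String source = resolve("dirs_native_example", () -> "native", () -> "fallback");
        System.out.println("implementation in use: " + source);
    }
}
```

The point of the design is that a failed load is treated as an ordinary outcome rather than an error, so callers never see the native layer unless it works.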

Loading a small dll and running it is generally much faster than shelling out to a process. There are certainly safety issues with using c but I think for this specific use case they are minimal because naively it seems as though the JNI code would just make a single system call and return a java value back to the jvm. This means that it wouldn't need to use the heap, mitigating one of the biggest worries with native code.

I am not personally interested in working on this but I would be happy to share my advice and review any changes. I have spent a lot of time incorporating native code into jvm libraries. In addition to swoval, I also added jni support for unix domain sockets and windows named pipes so that we could use them in a graalvm native image for the sbt thin client (JNA uses reflection in a way that the graalvm could not handle): https://github.com/sbt/ipcsocket/pull/8. There are a few avoidable gotchas if the project decides to go down that road.

alexarchambault commented 3 years ago

So I have this working. I also manually checked that it works with non-ASCII characters.

I think I'm going to try to have it built and published from GitHub actions, and have ProjectDirectories.fromPath accept a custom getWinDirs method as input, so that I can have it use this library, and see how it works.

alexarchambault commented 3 years ago

For context, I think the JNI / Powershell approach is slightly better than just reading env vars, but more importantly, it should be useful in other places in coursier (terminal-related stuff, Windows env var stuff), which is why I'm trying to stick to it.

soc commented 3 years ago

@eatkins My main concern is that I'd really like to avoid having yet-another layer on top of the already existing layers that can fail (or not apply due to obscure reasons/on obscure platforms). I'd like to have something that reduces the existing complexity while increasing the reliability.

I'll be migrating this library to the new Java FFI as soon as Project Panama ships, but in the meantime I think a stop-gap solution that is less brittle than what we currently have is direly needed.

matejsarlija commented 3 years ago

https://i.imgflip.com/5refek.jpg (had to do this, this issue pops up on every Scala tool).

soc commented 2 years ago

@RayKoopa welcome!

RayKoopa commented 2 years ago

Hey there, crossposting my thoughts on this. The scenario here seems to be to remove the PowerShell invocation as it is causing trouble. These options seem viable, which you've mentioned in part above:

I don't know security implications of JNI as I'm not a Java developer, but since you effectively call a C WinAPI method, the call itself wouldn't be more or less secure than doing it in .NET.

If you have considered using .NET's base library method System.Environment.GetFolderPath, I have to disappoint you:

Given this, I'd personally recommend just using JNI / C for this. It seems to be the natural thing Java does for native OS-specific API calls anyway, I presume?

If you still want to P/Invoke in .NET, you may like my CodeProject article on that. As mentioned in the dotnet issue, your call in the PS script seems effectively correct, though you can / should use automatic string marshaling to not have to deal with manually converting the returned buffer to a .NET string, and freeing it in all cases afterwards.

Also, a side note on the "Public" folder: On Windows, this folder is per-system, not per-user. It also has subfolders for Documents, Downloads, Pictures, etc., but these subfolders can be redirected, even outside of the parent "Public" folder. If a user wants to store a "public document" and expects it to be in Public > Documents, they would have to explicitly query the "PublicDocuments" path, since appending "\Documents" to the public path is out of the question due to the redirection scenario.

soc commented 2 years ago

@RayKoopa Thank you! My concern with JNI/C is that I have to maintain/keep/compile a piece of code for every platform Windows runs on. I own a Windows license for exactly zero platforms, and cross-compiling with C seems to be painful as well.

Which brings us to using C (compilers), which I'd rather just not. "It can be written safely" is probably true for this use-case, but too many people making this verdict in too many situations is exactly what brought computing into disrepute.

I'm starting to wonder whether the registry values are generally reliable. So Fonts may not exist, but on the other hand even Downloads is in there. I wonder ...

RayKoopa commented 2 years ago

Hmm I see. I hoped there would be some kind of interoperability offered by JNI that would not require to compile raw C and introduce lots of complexity to your project; kinda like P/Invoke in .NET. EDIT: I just found out about JNA, example usage. This looks like somebody already wrapped it for you quite nicely. /EDIT

Maybe you can find another way; effectively you "just" need to do two standard C calls made by something preinstalled playing together better with Java, and convert the returned memory buffer storing a 0-terminated UTF16 LE string to a Java String. 🤔
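The string-conversion half of that can be sketched in plain Java, assuming the native buffer has already been copied into a `byte[]` by whatever mechanism is chosen. The class and method names are hypothetical; the sketch only shows the decoding step, not the system call.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for turning a Windows wide-string buffer into a Java String. */
final class WideStrings {

    /** Decodes a UTF-16LE buffer up to (not including) the first 16-bit NUL terminator. */
    static String fromNulTerminatedUtf16Le(byte[] buffer) {
        // Without a terminator, decode all complete 2-byte units and ignore a trailing odd byte.
        int end = buffer.length - (buffer.length % 2);
        for (int i = 0; i + 1 < buffer.length; i += 2) {
            if (buffer[i] == 0 && buffer[i + 1] == 0) {
                end = i;
                break;
            }
        }
        return new String(buffer, 0, end, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Simulated buffer: a path with a non-ASCII character, a NUL, then leftover bytes.
        byte[] raw = "C:\\Users\\Zo\u00eb\\AppData\\Local\0leftover".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(fromNulTerminatedUtf16Le(raw));
    }
}
```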

As you already know, the registry method is unsupported. In practice, in my experience, on very weird occasions the keys do not exist or can be outdated / wrong. Required reading would be https://devblogs.microsoft.com/oldnewthing/20031103-00/?p=41973 and https://devblogs.microsoft.com/oldnewthing/20110322-00/?p=11163 . It may however be "better" to risk "possibly" incorrect results rather than getting no results due to PS invocation failing (which I know nothing about sadly). Your mileage may vary. I just checked the registry key on my system: It does not list the "Public" folder you are seeking support for, nor for the subfolders included in it. So the registry solution is not only unsupported, it is also insufficient.

eatkins commented 2 years ago

@soc your concerns about maintainability are quite well warranted and I am sympathetic. I can't remember if this came up on other threads but is there a reason why JNA is not considered an option here? I personally kind of hate JNA and would rather just write a JNI library but I think it may alleviate the cross compilation problem. Adding a dependency would be unfortunate but sbt, which I'm guessing is the biggest consumer of this library, already depends on the JNA anyway.

Either way, I will reiterate that I don't think the JNI solution is as bad as it may seem. Microsoft for all its many flaws religiously preserves backward compatibility so you just would need a system for compiling the binary once and then checking it in (though I will acknowledge that checking binaries in to a git repo is icky). I have had success cross-compiling jni libraries with mingw, which is available for mac and linux, for x86_64 though I have never tried it for an arm platform. It also would be possible to compile on a CI platform like appveyor (I have built and distributed binaries using appveyor) or github actions (I haven't actually tried this but would assume that github actions would have a mechanism for exporting artifacts). It would indeed be unreasonable for you to be expected to maintain code for platforms that you cannot easily test so you could reasonably draw a line in the sand that any supported windows platform must have freely available cloud vms to build any needed binaries.

I am very sympathetic to the desire to avoid JNI or any windows specific building. I don't have a windows license either. The way I handled it was to install windows on virtualbox, which does not require a paid license. It was never pleasant but it worked. I also agree with you ideologically that c is terrible but if the only api the OS provides is a c api, I'm not sure what option there is but to submit to working directly with that api.

eatkins commented 2 years ago

I am going to unsubscribe from this thread. Every time there is a comment, it is like a mini ddos attack on my brain. It is frustrating that from my perspective the only viable solution is being rejected for ideological reasons. This is a disservice to the downstream users of the library who are affected. If someone needs help implementing a jni solution, I have experience and would be happy to offer assistance. Please email me directly and don't @ me here.

soc commented 2 years ago

is there a reason why JNA is not considered an option here

Sadly JNA is rather big, it would turn this < 10kB library into a 3.7MB one. :-/ (see https://github.com/dirs-dev/directories-jvm/issues/16)

alexarchambault commented 2 years ago

If ever people stumble upon this issue, this is addressed in coursier, via JNI, by:

coursier/directories-jvm isn't published on Maven Central. coursier uses it as a source dependency, then shades it. So if you depend on coursier, you can access it via coursier.cache.shaded.dirs.ProjectDirectories, and get the from method accepting a fourth parameter.

I didn't test it on Windows ARM64, so I have no idea how it works there (but coursier has a fallback to former powershell stuff in that case).

soc commented 2 years ago

@alexarchambault good work! Do you know how large the binaries turned out?

I'm experimenting with an approach in dirs-cli and end up with 30KiB (16KiB with upx). Though I still need to replace the Unix-only File::from_raw_fd(1) with WriteConsole for Windows.

Another thought is building an x86 binary and using that on both x86-64 and x86. (Not sure what's the situation on ARM...)

DavidGregory084 commented 1 year ago

@soc If it helps, you can implement JNI functions in Rust without too much trouble (see e.g. this code here) so you could call through to your directories-sys-rs crate rather than implementing something in C, and it would then support the same platforms as your Rust code does? AFAIK the approach that folks usually use to deploy these is to bundle the dylib for every platform into the resources of the JAR file, then extract the one for the current platform into a temp directory and load it using System.load at static initialization time, although I have never implemented anything like this myself. I can't think of a library example off the top of my head but protoc-jar does something similar.
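A rough sketch of the "bundle per-platform libraries in the JAR, extract one, then `System.load` it" approach described above. The resource layout (`/native/<os>-<arch>/...`), the class name, and the base name `dirs` are assumptions made for illustration; this is not how protoc-jar or any particular library lays things out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

/** Hypothetical loader for a native library shipped inside the JAR's resources. */
final class BundledLibrary {

    /** Builds a resource path such as "/native/windows-amd64/dirs.dll" (assumed layout). */
    static String resourcePath(String osName, String osArch, String baseName) {
        String os = osName.toLowerCase(Locale.ROOT);
        String platform;
        String fileName;
        if (os.startsWith("windows")) {
            platform = "windows";
            fileName = baseName + ".dll";
        } else if (os.startsWith("mac")) {
            platform = "macos";
            fileName = "lib" + baseName + ".dylib";
        } else {
            platform = "linux";
            fileName = "lib" + baseName + ".so";
        }
        return "/native/" + platform + "-" + osArch.toLowerCase(Locale.ROOT) + "/" + fileName;
    }

    /** Copies the stream into a fresh temp file and returns its path. */
    static Path extractToTemp(InputStream in, String suffix) throws IOException {
        Path target = Files.createTempFile("dirs-native-", suffix);
        target.toFile().deleteOnExit(); // best effort; a loaded DLL may stay locked on Windows
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    /** Extracts the library for the running platform and loads it; fails if none is bundled. */
    static void loadForCurrentPlatform(String baseName) throws IOException {
        String resource = resourcePath(
                System.getProperty("os.name"), System.getProperty("os.arch"), baseName);
        try (InputStream in = BundledLibrary.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("no bundled library at " + resource);
            }
            String suffix = resource.substring(resource.lastIndexOf('.'));
            System.load(extractToTemp(in, suffix).toAbsolutePath().toString());
        }
    }

    public static void main(String[] args) {
        System.out.println(resourcePath(
                System.getProperty("os.name"), System.getProperty("os.arch"), "dirs"));
    }
}
```

`System.load` needs an absolute file path, which is why the resource has to be copied out of the JAR first.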

soc commented 1 year ago

@DavidGregory084 Interesting, thanks!

DavidGregory084 commented 1 year ago

@soc this library might help with the loading native libs part.

I found that ZeroMQ uses this library to load a bunch of libraries in its JNI bindings: https://github.com/zeromq/czmq/tree/master/bindings/jni/czmq-jni.

That project uses gradle to create the final JAR file in the structure expected by the native-lib-loader.

soc commented 10 months ago

@soc this library might help with the loading native libs part.

Thanks @DavidGregory084, I'll have a look!

soc commented 10 months ago

I have some experimental code using the new FFI API of Java 22.

That sadly doesn't help all those who are going to be stuck on Java < 22 for the next few years, but perhaps it can serve as a "known good" implementation.

brcolow commented 7 months ago

I posted this in another PR but this may be a more appropriate place:

I made a proof-of-concept using Java 22 Foreign Function & Memory API of how to extract a LocalAppData (as an example) known folder id, in case it is helpful for you.

https://gist.github.com/brcolow/e6c2e59a3aa29d32d3332bcf10313031

soc commented 7 months ago

Hi @brcolow, thanks for the code! That looks way more fleshed out than my efforts. I'll give it a go next week.

soc commented 3 months ago

Hey everyone, does the output on Windows here look sensible?

<snip>
UserDirectories (Windows Server 2022):
  homeDir     = 'C:\Users\runneradmin'
  audioDir    = 'C:\Users\runneradmin\Music'
  fontDir     = 'null'
  desktopDir  = 'C:\Users\runneradmin\Desktop'
  documentDir = 'C:\Users\runneradmin\Documents'
  downloadDir = 'C:\Users\runneradmin\Downloads'
  pictureDir  = 'C:\Users\runneradmin\Pictures'
  publicDir   = 'C:\Users\Public'
  templateDir = 'C:\Users\runneradmin\AppData\Roaming\Microsoft\Windows\Templates'
  videoDir    = 'C:\Users\runneradmin\Videos'

BaseDirectories (Windows Server 2022):
  homeDir       = 'C:\Users\runneradmin'
  cacheDir      = 'C:\Users\runneradmin\AppData\Local'
  configDir     = 'C:\Users\runneradmin\AppData\Roaming'
  dataDir       = 'C:\Users\runneradmin\AppData\Roaming'
  dataLocalDir  = 'C:\Users\runneradmin\AppData\Local'
  executableDir = 'null'
  preferenceDir = 'C:\Users\runneradmin\AppData\Roaming'
  runtimeDir    = 'null'

ProjectDirectories (Windows Server 2022):
  projectPath   = 'Baz Corp\Foo Bar-App'
  cacheDir      = 'C:\Users\runneradmin\AppData\Local\Baz Corp\Foo Bar-App\cache'
  configDir     = 'C:\Users\runneradmin\AppData\Roaming\Baz Corp\Foo Bar-App\config'
  dataDir       = 'C:\Users\runneradmin\AppData\Roaming\Baz Corp\Foo Bar-App\data'
  dataLocalDir  = 'C:\Users\runneradmin\AppData\Local\Baz Corp\Foo Bar-App\data'
  preferenceDir = 'C:\Users\runneradmin\AppData\Roaming\Baz Corp\Foo Bar-App\config'
  runtimeDir    = 'null'
<snip>
brcolow commented 3 months ago

Looks good to me.

soc commented 3 months ago

Anyone interested in reviewing the code, it is here: https://github.com/dirs-dev/directories-jvm/pull/61

tliechti commented 2 months ago

Anyone interested in reviewing the code, it is here: #61

Really appreciate your work. Sadly, I can not use Java 22. So I did a quick hack to omit powershell.exe and call pwsh.exe only, while omitting the -v -2 params. That worked for me with pwsh >7.4.3.
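A hedged sketch of trying one shell executable and falling through to the next, which is the general shape of such a hack. This is not the library's actual invocation code: the class name is hypothetical, and the arguments in `main` (`-NoProfile`, `-Command`) are only a harmless demonstration, not the script the library runs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper that runs the first candidate executable that works. */
final class ShellFallback {

    /** Runs each candidate in order; returns the trimmed output of the first one that exits with 0. */
    static Optional<String> runFirstAvailable(List<String> executables, List<String> arguments) {
        for (String executable : executables) {
            List<String> command = new ArrayList<>();
            command.add(executable);
            command.addAll(arguments);
            try {
                Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
                process.getOutputStream().close(); // the child gets no stdin
                String output = new String(
                        process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
                if (process.waitFor() == 0) {
                    return Optional.of(output.trim());
                }
            } catch (IOException e) {
                // Executable missing or not startable: try the next candidate.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Optional.empty();
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> version = runFirstAvailable(
                Arrays.asList("pwsh.exe", "powershell.exe"),
                Arrays.asList("-NoProfile", "-Command", "$PSVersionTable.PSVersion.ToString()"));
        System.out.println(version.orElse("no PowerShell available"));
    }
}
```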

soc commented 2 months ago

That generally might or might not work depending on various Windows Defender settings.

karianna commented 2 weeks ago

Hi folks, I run the Java Engineering and Golang Group at Microsoft, but I'll stress that I'm here in a personal capacity :-) (this isn't strictly related to the Microsoft Build of OpenJDK).

I suspect I'm missing a ton of nuance, but it sounds like this project is unable to use System.getenv("APPDATA") and System.getenv("LOCALAPPDATA") because those environment variables are not consistently available (in part due to changes of how you're allowed to read from the registry) across all versions of Windows, or if you invoke a Java program on Windows in a particular way (e.g., Run As)?

JNA and JNI aren't desirable because you're looking for a pure Java solution?

Lastly, using FFI in Java 22+ does work but that requires users to have Java 22+ installed, which not all users can do. (Note with Java 22, you can jlink / jpackage the Java runtime modules with the application modules, but that would create a large bundle which I suspect is also not desirable).

Have I got that all correct?

brcolow commented 2 weeks ago

@karianna That is a good summary in my opinion. I am the one who made the prototype for using FFI in Java 22 as I thought it was the best long-term solution. Of course it has drawbacks for backwards compatibility :).

soloturn commented 2 weeks ago

would JNA, to give some example: https://github.com/harawata/appdirs/blob/master/src/main/java/net/harawata/appdirs/impl/ShellFolderResolver.java , be an appropriate 2nd option for the time being, @karianna @brcolow , as the problem mentioned above with JNA was with graalvm - which anyway compiles it native - so no problem with ffi? is it not also a security challenge to have FFI code, as users do not know really what is inside, therefore the warning ?

WARNING: A restricted method in java.lang.foreign.AddressLayout has been called
WARNING: java.lang.foreign.AddressLayout::withTargetLayout has been called by dev.dirs.impl.Windows in an unnamed module
WARNING: Use --enable-native-access=ALL-UNNAMED to avoid a warning for callers in this module
WARNING: Restricted methods will be blocked in a future release unless native access is enabled

maybe some background why i bumped into this ticket. when packaging terasology for windows in chocolatey the app failed to start. as healing i made pull request https://github.com/MovingBlocks/Terasology/pull/5284 using dir-env to get the correct folders on windows. the folders we could then overwrite with environment variable XDG* when testing with multiple clients on one box, and ideally properties (#62) so it is easier for unit tests. as since java-17 changing environment variables out of a running java process is not obvious. in the code review @benjaminamos pointed out that he prefers JNA as more robust than what dir-env currently does.

soc commented 2 weeks ago

Hi @karianna,

expanding on @brcolow a bit:

this project is unable to use System.getenv("APPDATA") and System.getenv("LOCALAPPDATA") because those environment variables are not consistently available

the problem is more that these env variables (as well as the registry keys) have no meaning.

Same for System.getenv("PROFILE") or System.getenv("HOME") that some people want to use.

Using those would mean inventing some proprietary handshake that would work if everyone used this library, but would be completely separate from the actual reality – that is solely defined by the KnownFolders API.

JNA and JNI aren't desirable because you're looking for a pure Java solution?

Lastly, using FFI in Java 22+ does work but that requires users to have Java 22+ installed, which not all users can do.

Yep, though that problem gets smaller by the minute. And those who don't upgrade their machine could also be fine with the unholy PowerShell mess that previously powered this library.

Hope that helped!

karianna commented 2 weeks ago

So I think the longer term solution is definitely FFI. @brcolow - I know the folks in OpenJDK would love to hear of any feedback on your experience there so far (openjdk.org has a project panama mailing list).

I can chat to my .NET colleagues at Microsoft and see what they've done in .NET core to resolve this type of thing, there might be an alternative way to approach this. However, I'm going to guess that they'll suggest going down the JNI path since getting the Windows team to add a new registry entry / way of accessing these folders in a consistent manner across all versions of Windows would be extremely challenging.

So shorter term, your best bet is JNI. I appreciate that causes some build platform pain, which brings me to asking how are the builds produced today, through GitHub Actions or something else?

soloturn commented 2 weeks ago

JNA and JNI aren't desirable because you're looking for a pure Java solution?

* JNA would be a 30x size increase of this library.

to get windows support stable, that would be perfectly fine for a user of the library my guess would be. for plantuml, uml drawing software, we solved the size challenge by offering two binaries, one small, one big which included the pdf generation, if you search for pdf or pdfJar in the gradle build file. would such an approach be thinkable here as well @soc ?

sideeffffect commented 2 weeks ago

I, personally, would prefer something which is

Does the JNA approach satisfy these criteria? If yes, I wouldn't mind the increase in size at all.

soc commented 2 weeks ago

Does the JNA approach satisfy these criteria?

No. It's native code, it's "prepackaged JNI".

soc commented 2 weeks ago

I can chat to my .NET colleagues at Microsoft and see what they've done in .NET core to resolve this type of thing, there might be an alternative way to approach this.

No need to ask: They use P/Invoke, and their implementation has been missing folders for years, which is the reason we did the PowerShell → .NET → P/Invoke dance in the first place.

karianna commented 2 weeks ago

I think JNI could be a middle ground here (should keep the package small). If it's just platform testing then I might be able to help secure some test platforms

soc commented 2 weeks ago

@karianna Thanks, though I don't think I want to consider JNI at this point.

It could have been useful 5 years ago, or before suffering the PowerShell mess; not now when we already suffered through another rewrite (the Java FFI one) a few weeks ago.

Not actually sure why this discussion popped up again – this issue is solved from my point of view. The only work items I thought I have left is merging https://github.com/dirs-dev/directories-jvm/pull/61 and publishing a new version on Maven Central.

sideeffffect commented 2 weeks ago

this issue is solved from my point of view. The only work items I thought I have left is merging https://github.com/dirs-dev/directories-jvm/pull/61 and publishing a new version on Maven Central.

The problem then is that it requires JVM >= 22, which may be a problem for many of the current users of the library. Ideally the base JVM would be 8, or at least 11, to cover a wider user base. Other approaches (like JNA) might be better from this perspective. Or at least have a multi-release JAR, which would have a fallback for JVM < 22.
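To illustrate the fallback idea: a multi-release JAR makes this choice through its class layout (version-specific classes under `META-INF/versions/22/`), not through code. The sketch below only shows the equivalent decision rule as a runtime check. The class, enum, and method names are hypothetical and not part of the library.

```java
/** Hypothetical selector showing the "FFI on 22+, older mechanism otherwise" rule. */
final class ImplementationChoice {

    enum Strategy { FOREIGN_FUNCTION_API, LEGACY_FALLBACK }

    /** Java 22 is the first release where the Foreign Function & Memory API is final. */
    static Strategy forFeatureVersion(int featureVersion) {
        return featureVersion >= 22 ? Strategy.FOREIGN_FUNCTION_API : Strategy.LEGACY_FALLBACK;
    }

    /** Parses "java.specification.version", which is "1.8" on Java 8 and "9", "10", ... later. */
    static int currentFeatureVersion() {
        String spec = System.getProperty("java.specification.version");
        return Integer.parseInt(spec.startsWith("1.") ? spec.substring(2) : spec);
    }

    public static void main(String[] args) {
        int version = currentFeatureVersion();
        System.out.println("Java " + version + " -> " + forFeatureVersion(version));
    }
}
```

Reading the system property rather than calling `Runtime.version()` keeps the check itself usable on Java 8, which matters if the base JVM is meant to be that old.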

soc commented 2 weeks ago

@sideeffffect As mentioned, using the version that fits your JVM version requirement is fine.

I'm not seeing how spending more time on this topic will lead to the discovery of a magic solution that has eluded us all for the last 5 years.

soloturn commented 2 weeks ago

Not actually sure why this discussion popped up again – this issue is solved from my point of view. The only work items I thought I have left is merging #61 and publishing a new version on Maven Central.

can i help doing the release as you are sure it is best approach? how do you propose changing such variables in unit tests @soc because changing environment variables is a little tricky?
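One common way to sidestep mutating the process environment in tests is to pass the lookup in as a function. The sketch below is a general pattern, not an API of this library: `EnvLookup` and its methods are hypothetical, and the variable name and paths in `main` are example values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical wrapper that lets tests substitute a map for the real environment. */
final class EnvLookup {

    private final Function<String, String> lookup;

    EnvLookup(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    /** Production instance backed by the real process environment. */
    static EnvLookup system() {
        return new EnvLookup(System::getenv);
    }

    /** Returns the variable's value, or the default when it is unset or empty. */
    String getOrDefault(String name, String defaultValue) {
        String value = lookup.apply(name);
        return (value == null || value.isEmpty()) ? defaultValue : value;
    }

    public static void main(String[] args) {
        // In a unit test, a plain map stands in for the environment.
        Map<String, String> fake = new HashMap<>();
        fake.put("XDG_CONFIG_HOME", "/tmp/client-1/config");
        EnvLookup testEnv = new EnvLookup(fake::get);
        System.out.println(testEnv.getOrDefault("XDG_CONFIG_HOME", "/home/user/.config"));
        System.out.println(testEnv.getOrDefault("XDG_CACHE_HOME", "/home/user/.cache"));
    }
}
```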

soloturn commented 1 week ago

@karianna would you be able to share a HelloWorld ? is "loadLibrary" broken or we tried to use it wrong?

public class HelloWorld {
    static {
        System.loadLibrary("combase");
    }

    public native void sayHello();

    public static void main(String[] args) {
        HelloWorld helloWorld = new HelloWorld();
        helloWorld.sayHello();
    }
}
$ java --version
openjdk 23.0.1 2024-10-15
OpenJDK Runtime Environment Zulu23.30+13-CA (build 23.0.1+11)
$ javac HelloWorld.java
$ java HelloWorld
Exception in thread "main" java.lang.UnsatisfiedLinkError: 'void HelloWorld.sayHello()'
        at HelloWorld.sayHello(Native Method)
        at HelloWorld.main(HelloWorld.java:12)

@FriendSeeker tested for sbt https://github.com/sbt/sbt/issues/7833 and got it to work by hard-coding the path, which is built from an environment variable. the whole exercise in this ticket was to avoid getting an environment variable as it was not stable.

      String sysdir = System.getenv("WINDIR") + "/system32/";
      System.load(sysdir + "combase.dll");
      System.load(sysdir + "ole32.dll");
      System.load(sysdir + "shell32.dll");