JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Permissions issues with artifacts on Windows in 1.10-rc1 (and betas) - Access is denied #52272

Open nilshg opened 10 months ago

nilshg commented 10 months ago

This is a somewhat speculative issue without a reproducer - I'm filing this as I've run into this a few times now and have had sporadic interactions with people on Discourse/Slack seeing this as well, so hopefully this consolidates things by giving those who experience this issue something to search for.

What prompted me to file today was seeing this in one of my environments:

  64 dependencies successfully precompiled in 360 seconds. 266 already precompiled.
  1 dependency had output during precompilation:
┌ GR
│  ERROR: LoadError: InitError: could not load library "C:\Users\ngudat\.julia\artifacts\52bbefbea6a9098fa5c57208812d3868fac90841\bin\libxml2-2.dll"
│  Access is denied.
│  Stacktrace:
│    [1] dlopen(s::String, flags::UInt32; throw_error::Bool)
│      @ Base.Libc.Libdl .\libdl.jl:117

I had previously posted on Discourse about this here:

https://discourse.julialang.org/t/windows-artifact-issue-access-denied-when-installing-plots-makie/103734/4

with others chiming in reporting the same issue. There's also an issue reported with Yggdrasil here:

https://github.com/JuliaPackaging/Yggdrasil/issues/7625

All of this is a bit of a throwback to this old issue from 1.5 days:

https://github.com/JuliaLang/julia/issues/38411

(As with what I've seen on 1.10, starting Julia as admin worked, but Julia then always had to be run as admin to use any package that loads artifacts.)

If anything changed in how artifacts get handled on Windows it might be worth looking into. Equally if no one else chimes in here reporting any issues I might just be seeing a hangover effect from artifacts installed earlier and the problem is solved on rc1 already.

nilshg commented 7 months ago

This seems to have become more widespread?

https://discourse.julialang.org/t/access-denied-for-plotting-artifacts/110570/2

https://discourse.julialang.org/t/given-up-installing-any-plotting-packages/110272/24

IanButterworth commented 7 months ago

I believe this is the corresponding Pkg issue https://github.com/JuliaLang/Pkg.jl/issues/3269

giordano commented 7 months ago

I believe this is the corresponding Pkg issue https://github.com/JuliaLang/Pkg.jl/issues/3269

I'm not sure, the error reported here has been observed only on Windows and only on Julia v1.10+, the Pkg issue you linked is on Linux and it started already in Julia v1.8.

I feel like this may be related (again) to https://github.com/JuliaLang/Pkg.jl/pull/3349

nilshg commented 7 months ago

Is there anything I can do to help with this? I'm currently not affected by the issue but - at risk of sounding annoying - this is the sort of issue that really puts off the typical corporate engineering/data science user that is usually on Windows and checking out Julia, only to find that they can't use any package that has an artifact somewhere in its dependency chain (which is almost surely at least one package for a new user).

I appreciate that the Venn diagram between "can contribute to Julia internals" and "uses Windows" is pretty much empty, but I'm happy to try out stuff if there is someone in the first part of the Venn diagram who has ideas about how this could be tackled.

StefanKarpinski commented 7 months ago

In the linked issue, @staticfloat wrote that this was happening because Pkg is installing things read-only. Do dlls need to be writeable on Windows or something?

staticfloat commented 7 months ago

Do dlls need to be writeable on Windows

I don't believe that's the case. Without some kind of reproducer it's going to be very difficult to track down what's happening here.

ViralBShah commented 7 months ago

Do any of these reproduce? They seem to have specific instances of failing packages.

https://github.com/JuliaPackaging/Yggdrasil/issues/7625 https://github.com/JuliaLang/julia/issues/53139

IanButterworth commented 7 months ago

Can we add a stat of the dll to the dlopen error message?
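For illustration only, here is a minimal sketch of what such an enriched error could look like from user code. The helper name `dlopen_with_stat` is hypothetical (it is not part of Base or Libdl); it only relies on `Libdl.dlopen`, `stat`, and `Sys.isexecutable`.

```julia
using Libdl

# Hypothetical helper, not part of Base or Libdl: try to load a library and,
# if that fails, rethrow with the file's stat mode and executability appended.
function dlopen_with_stat(path::AbstractString)
    try
        return Libdl.dlopen(path)
    catch err
        mode = string(stat(path).mode, base=8)   # 0 if the path does not exist
        error(sprint(showerror, err) *
              "\n  exists: $(ispath(path)), mode: 0o$mode, " *
              "executable: $(Sys.isexecutable(path))")
    end
end
```

Calling `dlopen_with_stat(path)` on one of the failing DLLs would then show the mode and the `Sys.isexecutable` result next to the original "Access is denied." message.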

visr commented 7 months ago

EDIT: hiding this comment since there is no difference between the artifacts, because I copied the artifact dir over without NTFS permissions, leading in the wrong direction.

I have no reproducer because they happen rarely, but I did post the stat of such a DLL in https://github.com/JuliaPackaging/Yggdrasil/pull/7412#issuecomment-1731613331. When I posted that stat I thought the read-only mode was the problem because if I removed the artifact and reinstalled it somehow wasn't read-only anymore. But now with 1.10.1 I see all my artifacts are read-only and they usually work fine.

I did save the bad artifact dir of the ICU_jll@69.1 in question at the time. I put a good version I just downloaded next to it and uploaded it temporarily here. Can anyone spot any differences? To me it looks like the stat.mode and binary contents are identical, but if I put the bad artifact in .julia/artifacts I always get this on using ICU_jll:

ERROR: InitError: could not load library "C:\Users\visser_mn\.julia\artifacts\2a29863b092214f3e985df7ccf7601ae41d8f406\bin\icuin69.dll"
Access is denied.
Stacktrace:
  [1] dlopen(s::String, flags::UInt32; throw_error::Bool)
    @ Base.Libc.Libdl .\libdl.jl:117

And if I put the good one, everything works fine.

disberd commented 7 months ago

Not necessarily a reproducer, as this problem does not seem to appear on all Windows machines. But on my machine there is an issue that I have always encountered with PlotlyKaleido since moving to Julia 1.10, and it stems from the artifact folder of Kaleido_jll being read-only.

This means that on Windows I can't even cd into that folder:

julia> using Kaleido_jll

julia> cd(Kaleido_jll.artifact_dir)
ERROR: IOError: cd("C:\\Users\\Alberto.Mengali\\.julia\\artifacts\\7914a56da888d6a06d00c87f97e873c60e97acc7"): permission denied (EACCES)
Stacktrace:
 [1] uv_error
   @ .\libuv.jl:100 [inlined]
 [2] cd(dir::String)
   @ Base.Filesystem .\file.jl:91
 [3] top-level scope
   @ REPL[4]:1

This is likely what causes the problems with running the kaleido library, as on Windows this relies on calling a script called kaleido.cmd located in the artifact folder.

The contents of this script are quite simple:

@echo off
setlocal
chdir /d "%~dp0"
.\bin\kaleido.exe %*

And when trying to execute that, the following error messages are thrown:

julia> let path = joinpath(Kaleido_jll.artifact_dir, "kaleido.cmd")
           run(`$path`)
       end
Access is denied.
The system cannot find the path specified.
ERROR: failed process: Process(`'C:\Users\Alberto.Mengali\.julia\artifacts\7914a56da888d6a06d00c87f97e873c60e97acc7\kaleido.cmd'`, ProcessExited(1)) [1]

Stacktrace:
 [1] pipeline_error
   @ .\process.jl:565 [inlined]
 [2] run(::Cmd; wait::Bool)
   @ Base .\process.jl:480
 [3] run(::Cmd)
   @ Base .\process.jl:477
 [4] top-level scope
   @ REPL[5]:2

The first error, Access is denied, is triggered by the chdir /d "%~dp0" line in the script above, which fails to change directory to the artifact dir. After that failure, the .\bin\kaleido.exe %* line fails to find the executable, because the path is relative and the working directory is wrong.

Edit: Adding output of versioninfo()

julia> versioninfo()
Julia Version 1.10.1
Commit 7790d6f064 (2024-02-13 20:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 20 × 12th Gen Intel(R) Core(TM) i7-12800H
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, alderlake)
Threads: 20 default, 0 interactive, 10 GC (on 20 virtual cores)
Environment:
  JULIA_PKG_USE_CLI_GIT = true

StefanKarpinski commented 7 months ago

Why wouldn't it be possible to cd into a read-only directory?

giordano commented 7 months ago

Maybe it isn't executable? If that's the case and also the library isn't executable, that'd explain the issue

visr commented 7 months ago

I also cannot cd to an artifact_dir but I'm not sure that's the issue. By default we get RX read and execute access which almost always seems to work fine. If I remove RX by resetting the ACLs manually we get exactly the issue seen here.

using Pkg

# Using isoband since it has only one tiny DLL
Pkg.add(name="isoband_jll", version="0.2.3")
using isoband_jll: libisoband_path  # loads the library

stat(libisoband_path)  # mode: 0o100444 (-r--r--r--)
Sys.isexecutable(libisoband_path)  # true

# RX - Read and execute access
run(`icacls $libisoband_path`)  # Everyone:(RX,WA)

# Reset the ACL to lose the RX
run(`icacls $libisoband_path /reset`)
run(`icacls $libisoband_path`)  # Everyone:(I)(R,W,D,DC)

stat(libisoband_path)  # mode: 0o100444 (-r--r--r--)
Sys.isexecutable(libisoband_path)  # false

# <restart julia>
using isoband_jll  # Access is denied.

@giordano already linked to https://github.com/JuliaLang/Pkg.jl/pull/3349 above. If I call Pkg.set_readonly(libisoband_path) this retains the executable bits as desired, but perhaps it sometimes doesn't?

staticfloat commented 7 months ago

Can someone icacls a directory that is not working?

visr commented 7 months ago

I copied an artifact_dir that wasn't working for experimentation, not realizing that this by default does not copy NTFS permissions, so I destroyed the evidence. Will be on the lookout.

disberd commented 7 months ago

Maybe it isn't executable? If that's the case and also the library isn't executable, that'd explain the issue

I have to say that in the case of Kaleido, running the executable directly using its full path works without access denied.

The problem there is that the kaleido.exe binary calls some dependencies with relative paths, assuming that the program was called from the artifact directory. That is why the included script (kaleido.cmd) tries to set the working directory to the artifact dir (and fails because of access denied).

visr commented 7 months ago

Kaleido seems to be a special case with a kaleido.cmd in the artifact dir root, that is why I think it is unrelated to this issue.

This is how you typically run a JLL ExecutableProduct:

using Kaleido_jll
run(`$(kaleido()) $arguments`)

This uses bin/kaleido.exe rather than ./kaleido.cmd.

disberd commented 7 months ago

Yeah, I agree with you @visr that Kaleido is weird, and I also wish the library itself was built in a more standard way without requiring the current directory to be the artifact dir to work :(.

Trying to call kaleido the standard way you suggested throws an error due to this relative path issue:

julia> using Kaleido_jll

julia> run(`$(kaleido()) plotly`)
[0229/082015.195:ERROR:registration_protocol_win.cc(131)] TransactNamedPipe: The pipe has been ended. (0x6D)
[0229/082015.217:WARNING:resource_bundle.cc(435)] locale_file_path.empty() for locale
{"code": 0, "message": "Success", "result": null, "version": ""}
[0229/082015.642:ERROR:registration_protocol_win.cc(103)] CreateFile: The system cannot find the file specified. (0x2)
[0229/082016.956:ERROR:kaleido.cc(155)] Failed to find, or open, local file at ./js/kaleido_scopes.js with working directory C:\Users\Alberto.Mengali\Repos\github\others\PlotlyKaleido

I agree that kaleido specifically might be unrelated to the issue at hand. I just found it weird that I cannot cd into a folder anymore, and the kaleido library no longer working seems to be caused by a change in the default artifact permissions in Julia 1.10. It is a pity for kaleido specifically, because that is the only way to export plots for any of the plotly-based libraries :(.

disberd commented 7 months ago

It would seem that the issue with Kaleido specifically can be solved simply by explicitly setting the directory in which the program is run, using the dir keyword of Cmd:

cmd = Cmd(`$(kaleido()) plotly`; dir = Kaleido_jll.artifact_dir)

This fixed the tests failing on https://github.com/JuliaPlots/PlotlyKaleido.jl/pull/17 even with the read-only permissions of 1.10 on Windows. (Interesting, though, that explicitly doing cd throws an error while passing the directory to Cmd does not.)
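As a portable illustration of the `dir` keyword (a sketch, not the PlotlyKaleido code): the child process below is just another Julia instance that prints its working directory, and `child_pwd` is a made-up helper name. It shows that `dir` sets the working directory of the child only, while the parent process never calls cd.

```julia
# Hypothetical helper: start a child Julia process in `dir` and return what it
# reports as its working directory. The parent process never calls `cd`.
child_pwd(dir::AbstractString) =
    read(Cmd(`$(Base.julia_cmd()) --startup-file=no -e 'print(pwd())'`; dir=dir), String)

workdir = mktempdir()
before = pwd()
println(child_pwd(workdir))   # prints the child's working directory
@assert pwd() == before       # the parent's working directory is unchanged
```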

visr commented 7 months ago

Can someone icacls a directory that is not working?

Caught a few today on our CI servers. Here is one for libxml2 just like in the top post:

.\bin\libxml2-2.dll BUILTIN\Administrators:(R,WA)
                    DIRECTORY\Domain Users:(R,WA)
                    Everyone:(R,WA)
                    DIRECTORY\svc-teamcity-ansible:(R,WA)
                    NT AUTHORITY\SYSTEM:(F)

So indeed R read-only access, not RX read and execute access. I ran icacls on the whole directory with:

icacls aff35ec37f0361b8cfd28284a677e135a00d5f81 /t > icacls.txt

Results here: icacls.txt. So it looks like all files get (R,WA) and all directories (R,W,D,DC).

We worked around the issue with a manual icacls .julia\artifacts /reset /t.

ig-or commented 6 months ago

I see read-only access for QT5 DLL files on Windows, not read-and-execute, so those libraries cannot be opened (by QWT/Marble/MathGL/qwtw_jll/QWTWPlot in my case).

WardBrian commented 6 months ago

I am seeing this crop up in a package I maintain where we actually do want write access to our artifact (as an aside, is this very bad practice on our part? I'm not that familiar with Julia packaging practices, the feature was contributed by someone else originally)

giordano commented 6 months ago

as an aside, is this very bad practice on our part?

Yes, these artifacts are content-addressable storage, if you want to edit them the content changes and they lose their purpose. Consider using https://github.com/JuliaPackaging/Scratch.jl instead (depending on exactly what you want to do, but this is going off topic, consider asking for further help in https://discourse.julialang.org/)

StefanKarpinski commented 5 months ago

I think we can ignore the bit about wanting to modify files in artifacts—we don't support that and don't want to. But what about the problems with loading libraries? Is the issue that the dll files need to have executable permissions in order to be loaded and lack them or that the directories need to have executable permissions and lack them? Is it clear to anyone what the actual problem here is?

visr commented 5 months ago

I think the example in https://github.com/JuliaLang/julia/issues/52272#issuecomment-1969910902 shows that the issue is that the DLL files need to have executable permissions in order to be loaded, and for unknown reasons sporadically end up lacking them. https://github.com/JuliaLang/julia/issues/52272#issuecomment-1976855327 shows that when this permission change happens, the permissions for the entire artifact are modified, not just the DLL.

I'm not sure, but I suspect the permission change only ever happens when installing an artifact, not when loading an installed artifact. I tried to reproduce with removing and installing small artifacts in a loop, but couldn't trigger it.

davidanthoff commented 5 months ago

I still maintain that Pkg.jl should never modify permissions on any files that it puts into a julia depot, permissions should always just be inherited. Instead it should modify the read-only attribute and leave the permissions entirely alone.

The thing is that once you modify any permission on a file, it no longer inherits any permissions from parent folders, and I'm sure that screws things up here. Just generally not a good idea to mess with permissions inside a user profile on Windows...

The set_readonly function here is just not what should be done on Windows by the package manager... On Windows that should just modify the file attribute and do nothing else.

davidanthoff commented 5 months ago

Related https://github.com/JuliaLang/Pkg.jl/issues/2677.

StefanKarpinski commented 5 months ago

But we have to mess with permissions to some extent because some files need to be executable while some files need to be not executable. We don't have to do set_readonly; it could just not be done on Windows, but it's good to ensure that packages and artifacts don't get accidentally modified and error instead if someone tries that. So what is the correct way to create files with various execute/read/write permissions on Windows?

giordano commented 5 months ago

But we have to mess with permissions to some extent because some files need to be executable while some files need to be not executable.

For what it's worth, we already automatically set the executable bit in BinaryBuilder for shared libraries, so Pkg doesn't have to change anything. And this worked until Julia v1.9.

StefanKarpinski commented 5 months ago

However BinaryBuilder may set the permissions when constructing an artifact tarball, it's Tar that has to extract them on the client system. When installing a tarball, Tar has to set the executable bit. What I'm pointing out is that "never modify permissions on any files that it puts into a julia depot, permissions should always just be inherited" is not helpful guidance. If the more specific guidance is "don't try to make things read only on Windows", fine, we can change that, but that's a different thing. The whole Windows permissions thing is so nuts. I just want someone who actually understands Windows to tell us what the right way to do this is and then we can do it. But "don't ever change permissions" is not a workable answer.

davidanthoff commented 5 months ago

But we have to mess with permissions to some extent because some files need to be executable while some files need to be not executable.

No, you don't :) The whole idea that you change executable file permissions on individual files is a Unix thing; that is not how it is done on Windows. The entire user profile folder on Windows has the executable permission for the user that owns it. All that every user-profile installer does (including all of the MS ones) is put whatever it installs into some subfolder of the user profile and not change any permissions, and then the right permissions are simply inherited. For system installs, no installer ever changes individual file permissions either: the various Program Files folders all have the executable permission for all users on a system, so again an installer just puts everything into a subfolder and keeps permissions inherited for everything.

We don't have to do set_readonly; it could just not be done on Windows, but it's good to ensure that packages and artifacts don't get accidentally modified and error instead if someone tries that.

The right way to handle this is to use the read-only attribute, and not use permissions for that. Those are distinct on Windows.

When installing a tarball, Tar has to set the executable bit.

No, that is not how it should be done on Windows. Heck, the tar that ships with Windows doesn't do that.

If you need the info whether that flag was set on disc somehow, so that you can for example compute a checksum or something like that, then the right way to handle that is to use the same approach that MS used for WSL, which I pointed out at https://github.com/JuliaLang/Pkg.jl/issues/2677.

But "don't ever change permissions" is not a workable answer.

That is how every other installer and package manager on Windows does it; that is the right answer. As far as I can tell there is absolutely nothing that would be problematic or not work if we just followed that route.

StefanKarpinski commented 5 months ago

So just everything is either executable or not executable on Windows? Wild. And there's a whole other thing from permissions called attributes? Are they, like, enforced? Or just like a sticky note that says "please don't edit"?

ig-or commented 5 months ago

While making DLL artifacts for Windows 11 (for some local registry), I have to specifically (1) add the "Everyone" user and (2) set the read & execute flag for this user, then copy all the DLLs to a Linux server and make a tarball. Without steps (1) and (2), there were issues (sometimes!) when other people tried to use those artifacts, such as "no permission to open/run file".

davidanthoff commented 5 months ago

So just everything is either executable or not executable on Windows?

I think the right way to think about it is that permissions in Windows really are only used to isolate/protect/control how different users access things, not to signal to the system what kind of thing a certain file is. So, one would never use them along the Linux way of doing things, where a user might set the execution permission on a file they own to signal to the system "this is an executable file". The Windows logic is that if a user already owns the file, removing the executable permission is kind of pointless, after all that user could just add it again. I think the best way to think about this is that the mechanism that signals to the system whether a file is an executable or not is the file extension.

So, the net effect of this is that on Windows typically permissions are set on some folder close to the root, exactly at the point where different users end up getting different permissions. So, for user profiles, they don't inherit permissions, but each user profile folder gets a set of permissions, and then everything below that inherits that, because that is the boundary of what a given user controls. And with user profiles in particular, for a given user they just have all the permissions for that folder, and anything within it. They are after all the owner of that folder in any case and can modify all permissions of things inside.

For software setup scenarios I think the lesson really is just: leave permissions alone, instead make sure you install into the right place where things belong and then the correct permissions will be inherited.

For Julia there are really just two scenarios: if someone installs into their user profile, then definitely just not doing anything about permissions is the right way to go. If someone wants to put a depot into some shared location where multiple users can access it, then that admin should set the correct permissions of the top level shared folder (namely read and executable) and then everything in that top level folder (which might be the depot itself) would inherit those permissions. Especially for this shared scenario, starting to set individual file level permissions is just going to lead to lots of problems: what if the admin later wants to give another user access to that shared location? Now they have two options: they can change the permissions on the parent folder and not replace all the child permissions. In that case everything will be broken because Julia turned off inheritance on some child files, and so they will now not get access permissions for this new user. Or the admin decides to overwrite all permissions of the children, in which case the entire thing Julia did with granular permissions will just be gone... Neither of those options lines up in any way with how things are done on Windows, what an admin would expect is that Julia leaves permissions entirely alone, and they set permissions on the root shared folder in whatever way they want, and then everything below inherits those.

Now, one could use the Windows permission system in a very different way; it is after all an incredibly flexible system. But no one does, not Microsoft nor any other software I'm aware of...

And there's a whole other thing from permissions called attributes?

Yes

Are they, like, enforced? Or just like a sticky note that says "please don't edit"?

Not entirely sure, but editors etc all seem to work well with them.

StefanKarpinski commented 5 months ago

Thank you for writing that, @davidanthoff, this is the first time Windows permissions have made any lick of sense to me. So it seems like this is the correct approach:

Question about attributes: are they standard and structured? It sounds like the read-only one is. Is the executable one similar or are we making up our own metadata there that only we understand?

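Immediate action to fix this on 1.10 and 1.11:

  • Just stop trying to make things read-only on Windows, i.e. make Pkg.set_readonly a no-op on Windows or just don't call it.
  • Anything else? Is that sufficient to fix this problem?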

Medium term actions for 1.12:

Longer term potential actions:

The longer term actions may end up superseding the medium term actions if they're done soon enough.

davidanthoff commented 5 months ago

Question about attributes: are they standard and structured? It sounds like the read-only one is.

Yes, read-only is. SetFileAttributes (or one of its permutations) is the way to modify those, and the relevant constant there is FILE_ATTRIBUTE_READONLY.
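A rough sketch of what toggling that attribute from Julia could look like, using the Win32 calls GetFileAttributesW and SetFileAttributesW through ccall. The helper name `set_readonly_attribute` is hypothetical and this is not what Pkg currently does; the non-Windows branch is only an assumed fallback (clearing or restoring write bits with chmod) so that the sketch runs everywhere.

```julia
const FILE_ATTRIBUTE_READONLY = 0x00000001
const INVALID_FILE_ATTRIBUTES = 0xffffffff

# Hypothetical helper, not what Pkg does: toggle the read-only *attribute* of a
# file on Windows without touching its ACLs. On other systems, fall back to
# clearing the write bits (or restoring the owner write bit) with chmod.
function set_readonly_attribute(path::AbstractString, readonly::Bool=true)
    @static if Sys.iswindows()
        attrs = ccall((:GetFileAttributesW, "kernel32"), UInt32, (Cwstring,), path)
        attrs == INVALID_FILE_ATTRIBUTES && error("cannot read attributes of $path")
        attrs = readonly ? (attrs | FILE_ATTRIBUTE_READONLY) :
                           (attrs & ~FILE_ATTRIBUTE_READONLY)
        ok = ccall((:SetFileAttributesW, "kernel32"), Cint, (Cwstring, UInt32), path, attrs)
        ok == 0 && error("cannot set attributes of $path")
    else
        mode = filemode(path)
        chmod(path, readonly ? (mode & 0o7555) : ((mode | 0o200) & 0o7777))
    end
    return path
end
```

Because only the attribute is changed on Windows, the file keeps inheriting its permissions from the parent folder.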

Is the executable one similar or are we making up our own metadata there that only we understand?

So, Windows itself doesn't have that, but when MS introduced the Windows Subsystem for Linux (WSL) they faced the same problem and they ended up standardizing a set of extended file attributes for Linux metadata on files for WSL. So, when a user for example extracts a tar file from within WSL into a mounted Windows file system, then WSL will write all the metadata as extended attributes. That metadata does nothing in a Windows environment, but it is there and present for the Linux environment in WSL. I think we should just use the same strategy and store all this metadata in exactly the same way. It wouldn't functionally do anything in the Windows environment, other than allow us to compute things like proper checksums of the files on disc that take Unix file permissions into account. The MS docs are at https://learn.microsoft.com/en-us/windows/wsl/file-permissions#wsl-metadata-on-windows-files, and I opened an issue three years ago recommending this approach :) https://github.com/JuliaLang/Pkg.jl/issues/2677

Now, if we do use these WSL extended attributes to store stuff, I guess we might as well use all of them and store all the metadata that is in the tar ball, right? Not just the executable bit. Again, it won't do anything, but it does allow us to store this metadata on the extracted files.

Immediate action to fix this on 1.10 and 1.11:

  • Just stop trying to make things read-only on Windows, i.e. make Pkg.set_readonly a no-op on Windows or just don't call it.
  • Anything else? Is that sufficient to fix this problem?

My best guess is that should do it. But of course this strategy only works if the package manager does not need to compute checksums of things on disc to compare with things in the tar files that depend on this kind of metadata, is that so? Also, there will be lots of user systems out there that have existing artifacts installed with all these individual permissions, but trying to fix that is probably too much?

StefanKarpinski commented 5 months ago

Now, if we do use these WSL extended attributes to store stuff, I guess we might as well use all of them and store all the metadata that is in the tar ball, right? Not just the executable bit.

The executable bit is actually the only thing that Tar.extract sets—we just ignore everything else. The classic tar programs try to set the user ID, group ID, exact permission modes, etc. but that's because they're archival tools for backing up and restoring systems. Whether a file is executable by its owner is the only bit of metadata that git cares about and we follow suit. The philosophy is spelled out in the README and the code is here:

https://github.com/JuliaIO/Tar.jl/blob/152d12e30441876c2aa61ab1aa57e7f1fc2b78b1/src/extract.jl#L108-L121

The one complication there is that we don't have a way to correctly allow the umask to moderate the group and world executable bits, so we just copy the read bits for those when the file is executable.
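To make that rule concrete, here is a small sketch of the mode computation as described above. It is a paraphrase for illustration, not a copy of the Tar.jl source, and the function name `extracted_mode` is made up. `header_mode` stands for the mode stored in the tarball and `current_mode` for the mode the freshly written file received from the OS (that is, after the umask was applied).

```julia
# Sketch of the rule described above (a paraphrase, not the Tar.jl source).
function extracted_mode(header_mode::Integer, current_mode::Integer)
    mode = UInt16(header_mode & current_mode & 0o7777)
    if (header_mode & 0o100) == 0
        # Not executable by its owner in the tarball: drop all execute bits.
        mode &= 0o666
    else
        # Executable: owner execute is always on; group and world execute
        # bits are copied from the corresponding read bits.
        mode |= 0o100 | ((mode & 0o444) >> 2)
    end
    return mode
end
```

For example, a file stored as 0o755 and created under a umask that yields 0o644 ends up as 0o755, while the same file created with mode 0o600 ends up as 0o700.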

StefanKarpinski commented 5 months ago

My best guess is that should do it. But of course this strategy only works if the package manager does not need to compute checksums of things on disc to compare with things in the tar files that depend on this kind of metadata, is that so?

Tar.extract will still set executable permission, which it was previously doing, which should allow checksumming to continue working. This is not the behavior we want here, but it should fix the immediate problem.

Also, there will be lots of user systems out there that have existing artifacts installed with all these individual permissions, but trying to fix that is probably too much?

I definitely don't think we should do that for now, but I think we could later introduce a tool that scans all installed packages and artifacts and tries to fix them up. We could build that into Pkg.

davidanthoff commented 5 months ago

Just for reference later, I think icacls C:\Users\USERNAME\.julia /q /c /t /reset would just crawl an entire Julia depot, remove all custom permissions and reset everything to inherit all permissions from the parent object. So maybe a really easy way to add support to package for cleaning all of this up would just be a command that essentially runs that command. Seems way easier than trying to figure things out on a per file basis ourselves...
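A minimal sketch of wrapping that command from Julia, assuming nothing beyond the icacls flags quoted above (/q quiet, /c continue on errors, /t recurse, /reset restore inherited ACLs). The function name `reset_depot_acls` and its `dry_run` keyword are hypothetical; by default the command is only built and returned so it can be inspected before anything is changed.

```julia
# Hypothetical helper: build (and optionally run) the icacls reset quoted above.
# With `dry_run=true` (the default) the command is returned without being run.
function reset_depot_acls(depot::AbstractString=first(DEPOT_PATH); dry_run::Bool=true)
    cmd = `icacls $depot /q /c /t /reset`
    dry_run && return cmd
    Sys.iswindows() || error("icacls is only available on Windows")
    run(cmd)
    return cmd
end
```

Calling `reset_depot_acls()` shows the command for the first depot in `DEPOT_PATH`; `reset_depot_acls(dry_run=false)` would actually run it on Windows.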

StefanKarpinski commented 5 months ago

I think what we'd want that tool to do is more subtle:

This would leave all packages and artifacts correctly installed with attributes set and permissions not set.

davidanthoff commented 5 months ago

Ah, yes... That is painful, but probably the right way to do it...

StefanKarpinski commented 5 months ago

It's not too bad. We've been meaning to have a tool for checking and fixing installs anyway across platforms, this would just be the Windows logic.