ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

zig-cache gets into non-recoverable state during "zig build" #10865

Closed marler8997 closed 2 years ago

marler8997 commented 2 years ago

Zig Version

0.10.0-dev.620+9981b3fd2

Steps to Reproduce

It's unclear how the zig-cache got into this state. Possibly by running "zig build" many times with changes to the project, or by cleaning the local cache between runs on an old, slow machine.

Expected Behavior

"zig build" should run the build runner executable without an error.

Actual Behavior

zig failed to execute the build runner executable because it doesn't exist. Zig thinks it exists because the cache has an entry for it, along with "build.pdb" and "build.obj" files (this is on Windows), but the "build.exe" file that should be there is missing.

Insight

I'm not sure if the build runner works differently, but I would have expected Zig to build the build runner into a tmp directory first, then at the last step rename that directory into zig-cache\o. This way if Zig fails/crashes while trying to populate that directory, then we won't get into an invalid state like this. If Zig is already doing this for the build runner then we'll have to investigate how this happened further.
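The stage-then-rename idea described above can be sketched generically as follows. This is only an illustration in Python, not Zig's actual cache code: the function name `build_into_cache`, the `produce` callback, and the `tmp` and `o` subdirectory layout are all assumptions made for the example.

```python
import os
import shutil
import tempfile


def build_into_cache(cache_root, digest, produce):
    """Populate cache_root/o/<digest> via a staging directory and one rename.

    `produce(dir_path)` is a hypothetical callback that writes the artifacts
    (for example build.exe, build.obj, build.pdb) into dir_path.
    """
    final_dir = os.path.join(cache_root, "o", digest)
    if os.path.isdir(final_dir):
        return final_dir  # already published by an earlier run

    tmp_root = os.path.join(cache_root, "tmp")
    os.makedirs(tmp_root, exist_ok=True)
    os.makedirs(os.path.dirname(final_dir), exist_ok=True)

    # Stage on the same filesystem as the final location so that the
    # rename is a single directory move rather than a copy.
    staging = tempfile.mkdtemp(dir=tmp_root)
    try:
        produce(staging)               # may fail part-way through
        os.rename(staging, final_dir)  # publish only once everything exists
    except BaseException:
        # A failed build leaves nothing behind under o/<digest>.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return final_dir
```

With this shape, a crash during `produce` can leave a stray directory under `tmp`, but never a half-populated `o/<digest>` entry that a later run would mistake for a complete one.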

joukoltmk commented 2 years ago

I ran into a similar (same?) issue while experimenting with hot reloading. It seems to be a zig build-cache problem specific to Windows (I only have a Windows machine, so I can't comment on the behavior on other operating systems).

Zig version

zig-windows-x86_64-0.10.0-dev.290+3901b6fb0

Expected Behavior

Run "zig build run" without issues

Actual Behavior

If an executable/library with exactly the same contents/hash is created, the command fails with "error: AccessDenied".

Steps to Reproduce

Run "zig init-exe". Run "zig build run". Then modify the code in some way and build. Then revert the modifications and build it again in its original state. You will now get the error.

Steps to fix?

In Process Monitor, one can observe that the build failed on SetRenameInformationFile with ReplaceIfExists set to true. From the documentation:

Even if ReplaceIfExists is set to TRUE, the rename operation will still fail if a file with the same name already exists and is a directory, a read-only file, or a currently executing file

So the issue here is most likely that if you create a build artifact with the same hash/content as an existing one, then on Windows the cache directory rename fails.
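One possible way to tolerate this situation, sketched in Python purely as an illustration: if the rename fails and the destination directory already exists, treat the existing directory as the result and discard the newly built copy. This assumes that equal hashes imply equal contents. The `publish` helper is hypothetical, and this is not a description of how Zig handles the case.

```python
import os
import shutil


def publish(staging_dir, final_dir):
    """Move staging_dir to final_dir; if final_dir already exists, keep it.

    Returns True if staging_dir was published, False if an existing
    directory was reused instead.
    """
    try:
        os.rename(staging_dir, final_dir)
        return True
    except OSError:
        # Renaming onto an existing directory fails on Windows, and on
        # POSIX when the destination is not empty. Any other failure
        # (destination absent) is re-raised unchanged.
        if not os.path.isdir(final_dir):
            raise
        shutil.rmtree(staging_dir, ignore_errors=True)
        return False
```

The sketch does not help when the existing artifact is stale or incomplete, so it would only be safe alongside some check that the directory already in the cache is valid.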

marler8997 commented 2 years ago

@joukoltmk you ran into this issue: https://github.com/ziglang/zig/issues/8362

Very similar to this one but I think might have a different root cause.

joukoltmk commented 2 years ago

Ah yes, thanks. That issue seems to have the same problem I have.

andrewrk commented 2 years ago

Please try to add more clues to this issue if you find them. As is, I'm not sure I would be able to work on this bug without anything more to go on.

nektro commented 2 years ago

marler's issue could be related to #6452, but no log was posted, so I'm unsure

nektro commented 2 years ago

but I would have expected Zig to build the build runner into a tmp directory first, then at the last step rename that directory into zig-cache\o

not possible due to #6364 still being open

marler8997 commented 2 years ago

I have been doing close to zero development on Windows in the last few months so I haven't seen this. Since it's unclear how to reproduce this right now and I haven't been able to see this I can't provide any more details. Given this, there's nothing actionable until someone else sees this again and is able to provide more details or a way to reproduce so I'll close it until we get more information.