Open KirillOsenkov opened 1 month ago
I think we shouldn't be caching failed restores at all
No-op should not be caching failed restores: https://github.com/NuGet/NuGet.Client/blob/dev/src/NuGet.Core/NuGet.Commands/RestoreCommand/RestoreCommand.cs#L806C21-L806C30, which leads to https://github.com/NuGet/NuGet.Client/blob/38ab39a6691e6e5d54e29055d9e258b8b5d57220/src/NuGet.Core/NuGet.ProjectModel/CacheFile.cs#L43, so I'm guessing there's probably something else going.
How did you conclude it was a no-op caching issue?
I debugged and saw that it considers everything up-to-date and doesn't attempt to rerun the restore
When you say noop should be caching failed restores, that's what I think is happening, but this is wrong behavior. If the restore failed, we should delete the cache.
When you say noop should be caching failed restores, that's what I think is happening, but this is wrong behavior. If the restore failed, we should delete the cache.
My bad, typo. It should not. The cache file has a status in it, which tells us not to cache it.
I debugged and saw that it considers everything up-to-date and doesn't attempt to rerun the restore
Might need help getting to this ourselves, since I'm wondering why the code I linked wasn't working.
Try injecting a network error (like 401 access denied to an azdo feed with a wrong PAT), then restore again. It will play back the 401 but won't even attempt to hit the network.
cc @zivkan in case anything rings a bell.
I'm wondering if there's something we're messing up on the audit front, since Audit is not supposed to fail restore by default (hence NU1900), but I don't think we're doing anything custom. it's originally a warning getting logged.
we may have WarningsAsErrors set
Audit and check for vulnerabilities does ring a bell, I think I saw something while debugging
And I did see the NU1900 warning being converted to error after it's been played back, so maybe the restore is considered successful because it doesn't know the warning will later be converted to an error by MSBuild? This would explain what we're seeing
If you have a restore that failed with NU1900: Error occurred while getting package vulnerability data: Unable to load the service index for source...
You can then make a change to the NuGet Credential Provider, add the proper token, and re-run restore, it will still fail with the same error NU1900.
This is because the cache file is valid (since nothing changed on disk), so it doesn't attempt to re-restore.
This was an absolute nightmare to investigate (because when NuGet prints 401 you still think the problem is with auth, but really it doesn't even attempt to re-auth because it just plays back the no-op restore).
We also have warnings as errors, so the warning is logged as an error and fails the build.