JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org
Other
621 stars 260 forks source link

prevent package downgrade when pkg server is out-of-date #2215

Open felipenoris opened 3 years ago

felipenoris commented 3 years ago

I was having some conectivity issues with the registry today. When it finally worked, it downgraded a few packages. Most likely the mirror I'm connecting to is out-of-date. I would suggest to fail (or implement an update argument command to control behavior) instead of downgrade if the local registry copy is ahead of the mirror's copy

(@v1.5) pkg> up
   Updating registry at `~/.julia/registries/General`
######################################################################## 100.0%
No Changes to `~/.julia/environments/v1.5/Project.toml`
Updating `~/.julia/environments/v1.5/Manifest.toml`
  [69de0a69] ↓ Parsers v1.0.12 ⇒ v1.0.11
KristofferC commented 3 years ago

How about something like this: we send the current tree hash to the server, if it is newer than the tree hash the server would serve us, we don't update (downgrade) the local registry.

felipenoris commented 3 years ago

That would also have the benefit of preventing from downloading unnecessary data, like an ETag.

KristofferC commented 3 years ago

Cc @StefanKarpinski, @staticfloat, any opinions about this?

staticfloat commented 3 years ago

How do we know if a tree hash is newer?

felipenoris commented 3 years ago

Somewhere someone would need to know the history of the tree hashes, like what git does with commit hashes. The server could do it.

In the error case, maybe the client data is corrupted. An update command argument could control if we let the server update the client (potential downgrade) or not.

I don't know how it is done today, but it turns out that this setting could potentially reduce the pressure on your servers.

StefanKarpinski commented 3 years ago

That's way too much work for the server to be doing. Pkg servers need to be as simple and reliable as possible and this is far too complicated. They don't know anything about git and they don't keep git history around. All they do is serve content-addressed tarballs that they get from storage servers. The client could do it, but that's also a lot of work for the client and we haven't had great experiences in the past with clients doing these kinds of fiddly git history computations as part of the normal workflow. Alternatively, the client could send their current hash and if the server doesn't know about it, we could assume that it's newer and refuse to downgrade or ask the user what to do.

DilumAluthge commented 3 years ago

Alternatively, the client could send their current hash and if the server doesn't know about it, we could assume that it's newer and refuse to downgrade or ask the user what to do.

Probably this is the best approach.

StefanKarpinski commented 3 years ago

Actually, it doesn't really make sense for the server to determine this. The client already has the registry: if the current hash is not in the registry, the client can assume that the server doesn't know about it. Unclear what to do at that point, but if the installed copy has a project file with a version entry, then we can look at that and decide what to do based on that information.

staticfloat commented 3 years ago

I think we're talking about the registry itself here though.

StefanKarpinski commented 3 years ago

In that case we can either try doing a head request on the current hash first or send it along as a header.

felipenoris commented 3 years ago

Based on the ETag approach, it could be anything you want. You could generate a sequential number. Each number gets mapped to a hash. If the client gets the current number along with the hash tree in its local registry then the server would compare this number with the current to decide on sending a new copy of the registry to the client, consuming CPU on the server side to compare two integers instead of looking for a tree hash in a list. This could be layered on top of the current tree hash comparison approach, making it optional and easier to deploy to clients. As with most optimizations, usually you make things more complicated though.

But, hey, these are just ideas. Just preventing the downgrade thing would be awesome, even based on other approaches.

DilumAluthge commented 3 years ago

In that case we can either try doing a head request on the current hash first

I think this approach makes the most sense.

DilumAluthge commented 3 years ago
  1. Pkg client sends a head request on the current registry hash.
  2. If the Pkg server knows about the registry hash that the Pkg client currently has, then we proceed to update the registry as normal.
  3. If the Pkg server does not know about the registry hash that the Pkg client currently has, then we do NOT update the registry. We keep the registry copy that the Pkg client currently has, and we proceed with all other Pkg operations (except for registry updating) as normal.
DilumAluthge commented 3 years ago

@StefanKarpinski Regarding the other approach

or send it along as a header.

How would this work? The Pkg client sends its current registry hash as a header to the Pkg server. The Pkg server receives this and... does what with the information?

If the Pkg server knows about the registry hash that it received in the header, then it can proceed as normal.

But what should the Pkg server do if it does not know about the registry hash that it received in the header? Just kick back a 404 or something to the Pkg client?

DilumAluthge commented 3 years ago

I like the "current registry hash in a header" idea because it means we can implement this fix on the Pkg server side, without needing to patch Pkg clients. But I'm not sure what that implementation actually looks like.

EDIT: This is incorrect. The "current registry hash in a header" would require adding new code to both the Pkg client and the Pkg server.

StefanKarpinski commented 3 years ago

I think the HEAD approach is better. The only thing we could do for the other approach is that if the server doesn't know about the version in the header, it could return a bogus 404 or something, which doesn't seem ideal.

DilumAluthge commented 3 years ago

There is however one big advantage of using the header approach with the bogus 404: we can implement the fix in PkgServer.jl and deploy it as soon as we want, without needing to wait for Julia 1.6 to be available. Also, the fix would be available for all Julia versions that use the Pkg server, not just Julia 1.6.

DilumAluthge commented 3 years ago

~There is however one big advantage of using the header approach with the bogus 404: we can implement the fix in PkgServer.jl and deploy it as soon as we want, without needing to wait for Julia 1.6 to be available. Also, the fix would be available for all Julia versions that use the Pkg server, not just Julia 1.6.~

Whoops, I was entirely incorrect about this. If we do the header approach, we need to add code to both the Pkg client and the Pkg server. So the header approach still requires that we add some code to Pkg.jl.

DilumAluthge commented 3 years ago

@StefanKarpinski Here's a possible problem with the HEAD approach.

If I make a HEAD request to a registry hash that exists, I get a 200.

But, if I make a HEAD request to a registry hash that does NOT exist, it hangs indefinitely.

fredrikekre commented 3 years ago

I don't think this is something that should be fixed on the client side. This is also something you only notice if you switch Pkg servers, which, under normal conditions, you never do. Just because a Pkg Server doesn't know about my current hash doesn't mean my current hash is newer. In a perfect world (with functioning servers) I think that the server should be the truth. It might just as well be that the admin has rolled back something.

felipenoris commented 3 years ago

Unfortunately, it happens quite often, about once a week for my region. I would suggest to avoid some corner cases by considering a change in the registry something immutable: if you wanna rollback something, publish a new version instead of going back a version (like a git revert HEAD).

This is right now:

fnoro@dbecceed1666:~/tmp/docker-builder$ julia updeps.jl                                                                                    
   Updating registry at `~/.julia/registries/General`                                                                                           
######################################################################################################################################### 100.0%
Updating `~/tmp/docker-builder/external_deps/Project.toml`                                                                                  
  [a93c6f00] ↓ DataFrames v0.22.1 ⇒ v0.21.8                                                                                                     
  [31c24e10] ↓ Distributions v0.24.4 ⇒ v0.23.12                                                                                                 
  [e30172f5] ↓ Documenter v0.25.5 ⇒ v0.25.3                                                                                                     
  [0097028c] ↓ FinancialDSL v0.3.2 ⇒ v0.3.1                                                                                                     
  [cd3eb016] ↓ HTTP v0.9.0 ⇒ v0.8.19                                                                                                            
  [0f8b85d8] ↓ JSON3 v1.5.1 ⇒ v1.4.0                                                                                                            
  [9b8beb19] ↓ JSONWebTokens v0.3.3 ⇒ v0.3.2                                                                                                    
  [69de0a69] ↓ Parsers v1.0.12 ⇒ v1.0.11                                                                                                        
Updating `~/tmp/docker-builder/external_deps/Manifest.toml`                                                                                 
  [324d7699] ↓ CategoricalArrays v0.9.0 ⇒ v0.8.3                                                                                                
  [a8cc5b0e] - Crayons v4.0.4                                                                                                                   
  [a93c6f00] ↓ DataFrames v0.22.1 ⇒ v0.21.8                                                                                                     
  [b552c78f] ↓ DiffRules v1.0.2 ⇒ v1.0.1                                                                                                        
  [31c24e10] ↓ Distributions v0.24.4 ⇒ v0.23.12                                                                                                 
  [e30172f5] ↓ Documenter v0.25.5 ⇒ v0.25.3                                                                                                     
  [1a297f60] ↓ FillArrays v0.10.1 ⇒ v0.9.7                                                                                                      
  [0097028c] ↓ FinancialDSL v0.3.2 ⇒ v0.3.1                                                                                                     
  [59287772] - Formatting v0.4.1                                                                                                                
  [cd3eb016] ↓ HTTP v0.9.0 ⇒ v0.8.19                                                                                                            
  [0f8b85d8] ↓ JSON3 v1.5.1 ⇒ v1.4.0                                                                                                            
  [9b8beb19] ↓ JSONWebTokens v0.3.3 ⇒ v0.3.2                                                                                                    
  [77ba4419] ↓ NaNMath v0.3.5 ⇒ v0.3.4                                                                                                          
  [69de0a69] ↓ Parsers v1.0.12 ⇒ v1.0.11                                                                                                        
  [08abe8d2] - PrettyTables v0.10.1                                                                                                             
  [4c63d2b9] ↓ StatsFuns v0.9.6 ⇒ v0.9.5                                                                                                        
  [bd369af6] ↓ Tables v1.2.1 ⇒ v1.1.0                                                                                                           
  [5c2747f8] - URIs v1.0.1                                                                                                                      

ERROR: Unsatisfiable requirements detected for package FinancialDSL [0097028c]:                                                                 
 FinancialDSL [0097028c] log:

What's more strange is that it goes back up to 10 days of updates in the registry.

Yujie-W commented 3 years ago

I encountered the same issue here. If I remove default General registry and add General mannually, it works

(@v1.5) pkg> registry rm General
   Removing registry `General` from ~/.julia/registries/General
(@v1.5) pkg> registry add https://github.com/JuliaRegistries/General
    Cloning registry from "https://github.com/JuliaRegistries/General"
      Added registry `General` to `~/.julia/registries/General`

However, Github action uses the default registry from the server and fails...

Yujie-W commented 3 years ago

Bt the way, I have to run this command in the workflow to avoid such a problem with the server

julia -e 'using Pkg; Pkg.Registry.rm("General"); Pkg.Registry.add(RegistrySpec(url="https://github.com/JuliaRegistries/General"))'
fredrikekre commented 3 years ago

Unfortunately, it happens quite often, about once a week for my region.

What region is that? Are you sure the downgrading is because the newer versions are not in the registry you obtain and not e.g. compat bounds (or possibly https://github.com/JuliaLang/Pkg.jl/issues/1949)? For reference:

Installing from eu-central
  [a93c6f00] DataFrames v0.22.1
Installing from us-west
  [a93c6f00] DataFrames v0.22.1
Installing from us-east
  [a93c6f00] DataFrames v0.22.1
Installing from us-east2
  [a93c6f00] DataFrames v0.22.1
Installing from us-east-ci
  [a93c6f00] DataFrames v0.22.1
Installing from kr
  [a93c6f00] DataFrames v0.22.1
Installing from in
  [a93c6f00] DataFrames v0.22.1
Installing from au
  [a93c6f00] DataFrames v0.22.1
Installing from sg
  [a93c6f00] DataFrames v0.22.1

What's more strange is that it goes back up to 10 days of updates in the registry.

What do you mean with this?

StefanKarpinski commented 3 years ago

Registries (like all projects) should rollback by making revert commits, not by resetting master to an old commit. I do think the general solution here is to improve the up-to-dateness of the pkg servers, which is something that @staticfloat and I will get to as soon as we're done with 1.6-critical work. But there is still the potential for auto-choosing a pkg server to lead to an apparent backwards jump in time, which it might be best to avoid.

felipenoris commented 3 years ago

What region is that? Are you sure the downgrading is because the newer versions are not in the registry you obtain and not e.g. compat bounds (or possibly #1949)? For reference:

My region is Brazil. And when I posted I could reproduce the issue on at least two internet providers. But looks like pkg server was messed up based on the slack infrastructure channel chat.

If I remove the current registry and revert to the git protocol, like what @Yujie-W was talking about, everything works fine, so my project compat settings are good.

What's more strange is that it goes back up to 10 days of updates in the registry.

What do you mean with this?

By the version tags that I got downgraded to, they were tagged about 10 days ago.

fredrikekre commented 3 years ago

Registries (like all projects) should rollback by making revert commits, not by resetting master to an old commit.

Yea, but jokes on us, we identify by tree-hash.

StefanKarpinski commented 3 years ago

Here's a possibility. We want to switch to client-driven server selection so that pkg clients can choose the pkg server that's the fastest for them. The test could be a HEAD request for their current hashes registries and then the client chooses the that replies fastest with a 200 OK. That would prevent selection of a pkg server that's out of date until it catches up.

StefanKarpinski commented 3 years ago

Yea, but jokes on us, we identify by tree-hash.

So? Pkg servers should still be able to serve older registry trees and the only reason to not be able to is if you're out of sync and either have to go fetch the newer tree or don't have access to it at all.

fredrikekre commented 3 years ago

So? Pkg servers should still be able to serve older registry trees and the only reason to not be able to is if you're out of sync and either have to go fetch the newer tree or don't have access to it at all.

Sure, but we there is no notion of "newest"? Maybe I misunderstood the proposed solution above.

StefanKarpinski commented 3 years ago

The set of things that each pkg server knows about only grows, so if one server knows about a registry tree but another one doesn't, then the one that doesn't know about has an older view of the world than the other one.

staticfloat commented 3 years ago

The set of things that each pkg server knows about only grows, so if one server knows about a registry tree but another one doesn't, then the one that doesn't know about has an older view of the world than the other one.

This is true for storage servers, but not for Pkg servers. Pkg servers forget things all the time, and they need to ask the storage server whether something exists if that thing has fallen out of their cache. As it stands, the unreliability of our infrastructure makes this a harder problem than it should be, so I'm just working on getting this to be reliable again.

StefanKarpinski commented 3 years ago

I agree that before doing this sort of thing we should spend some time working on general reliability, which might just make it no longer a problem that needs to be solved.

ronisbr commented 3 years ago

Are you sure the downgrading is because the newer versions are not in the registry you obtain and not e.g. compat bounds (or possibly #1949)?

Just for the record, I live in the same country as @felipenoris (close to São Paulo), and I do occasionally see the same problems. Hence, it really does not seem related to a versioning problem of the packages. There was a week with so many downgrades that I decided to just dev what I needed, and started to manually update those packages.

fredrikekre commented 3 years ago

There was a week with so many downgrades

This problem has been identified and will be fixed.

ronisbr commented 3 years ago

Nice! Thanks for the info @fredrikekre !