johnnychen94 / StorageMirrorServer.jl

As I want it be available, fast, complete and persistent
MIT License

If upstreams don't agree with each other on the latest registry, mirror all of them #8

Open johnnychen94 opened 4 years ago

johnnychen94 commented 4 years ago

It makes little sense to code the complex logic here, as the storage server doesn't care much about which registry is the latest one; simply pulling down all registry tarballs and whatever packages they contain would be good enough.

https://github.com/johnnychen94/StorageMirrorServer.jl/blob/3bfd65e6f2a872df84e31da59c646d6940e31145/src/utils/server_utils.jl#L54-L89

skyzh commented 4 years ago

Is there any way to read which one is "latest" (e.g. from timestamp)?

johnnychen94 commented 4 years ago

Unfortunately, no. At least the current Pkg & Storage protocol doesn't specify this.

https://github.com/JuliaLang/Pkg.jl/issues/1377

One subtlety is how the Pkg Server determines what the latest version of each registry is. It can get a map from registry UUIDs to version hashes from each Storage Server, but hashes are unordered: if multiple Storage Servers reply with different hashes, which one should the Pkg Server use? When Storage Servers disagree on the latest hash of a registry, the Pkg Server should ask each Storage Server about the hashes that the other servers returned: if Service A knows about Service B's hash but B doesn't know about A's hash, then A's hash is more recent and should be used. If each server doesn't know about the other's hash, then neither hash is strictly newer than the other one and either could be used. The Pkg Server can break the tie any way it wants, e.g. randomly or by using the lexicographically earlier hash.
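
To make the quoted rule concrete, here is a minimal Python sketch of the "who knows whose hash" comparison. It is only an illustration, not code from Pkg.jl or this repository: the `StorageServer` class, its `knows` method, and `resolve_latest` are hypothetical names, and each server is modeled as an in-memory set of known hashes instead of a real HTTP endpoint. Ties are broken with the lexicographically earlier hash, one of the options the quote allows.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class StorageServer:
    """Hypothetical in-memory stand-in for one storage server."""
    name: str
    latest: str                      # hash this server reports as latest
    known: Set[str] = field(default_factory=set)  # every hash it can serve

    def knows(self, tree_hash: str) -> bool:
        return tree_hash in self.known


def is_newer(a: StorageServer, b: StorageServer) -> bool:
    """A's hash is newer if A knows B's hash but B does not know A's."""
    return a.knows(b.latest) and not b.knows(a.latest)


def resolve_latest(servers: List[StorageServer]) -> str:
    """Pick one registry hash when servers disagree (sketch only)."""
    candidates = {s.latest for s in servers}
    if len(candidates) == 1:
        return next(iter(candidates))
    # Keep only hashes that no other server's hash is strictly newer than.
    undominated = {
        s.latest
        for s in servers
        if not any(is_newer(o, s) for o in servers if o.latest != s.latest)
    }
    # Tie-break: lexicographically earlier hash. Fall back to all
    # candidates in the (assumed rare) case that every hash is dominated.
    return min(undominated or candidates)


a = StorageServer("A", latest="bbb", known={"aaa", "bbb"})
b = StorageServer("B", latest="aaa", known={"aaa"})
ordered_result = resolve_latest([a, b])   # A knows B's hash, not vice versa

c = StorageServer("C", latest="ccc", known={"ccc"})
d = StorageServer("D", latest="ddd", known={"ddd"})
tie_result = resolve_latest([c, d])       # neither knows the other's hash
```

Even in this toy form the comparison needs a round of cross-queries between servers, which is the complexity the proposal below wants to avoid on the storage-server side.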

For a storage server, mirroring all registry tarballs is equivalent to mirroring the "latest" tarball, so we can free ourselves from the complex "which is the latest" logic and use a plain for loop that simply mirrors the same registry multiple times (most of these will be trivial cases).
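
The plain loop could look roughly like the Python sketch below. It is not the package's implementation: `mirror_all_registries` and the `mirror_tarball` callback are hypothetical names, the input is assumed to be an already-parsed map from each upstream to its `{registry UUID: hash}` listing, and "trivial cases" is interpreted here as (UUID, hash) pairs that were already mirrored and can be skipped.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

RegistryKey = Tuple[str, str]  # (registry UUID, tree hash)


def mirror_all_registries(
    upstream_registries: Dict[str, Dict[str, str]],
    mirror_tarball: Callable[[str, str, str], None],
    already_mirrored: Optional[Iterable[RegistryKey]] = None,
) -> List[RegistryKey]:
    """Mirror every registry version that any upstream reports.

    No attempt is made to decide which hash is the latest; each
    distinct (uuid, hash) pair is simply mirrored once.
    """
    done = set(already_mirrored or ())
    mirrored = []
    for upstream, registries in upstream_registries.items():
        for uuid, tree_hash in registries.items():
            key = (uuid, tree_hash)
            if key in done:
                continue  # trivial case: this version is already mirrored
            mirror_tarball(upstream, uuid, tree_hash)  # hypothetical callback
            done.add(key)
            mirrored.append(key)
    return mirrored


# Usage with made-up upstream names and placeholder UUID/hash values.
calls = []
result = mirror_all_registries(
    {
        "upstream-1": {"uuid-general": "hash-old"},
        "upstream-2": {"uuid-general": "hash-new"},
        "upstream-3": {"uuid-general": "hash-new"},
    },
    mirror_tarball=lambda upstream, uuid, h: calls.append((upstream, uuid, h)),
)
```

In this usage the two disagreeing versions are both mirrored and the third upstream's duplicate is skipped, so the extra work is bounded by the number of distinct hashes the upstreams report.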