opensciencegrid / StashCache

https://opensciencegrid.org/docs/data/stashcache/overview/
Apache License 2.0

Change stashcp to read caches list from wlcg-wpad servers by default #108

Closed DrDaveD closed 4 years ago

DrDaveD commented 4 years ago

See discussion in SOFTWARE-3516.

caches.json files can still be used with the stashcp -j option. If -j is not set, the list of caches is read from the wlcg-wpad servers and its signature is verified with the OSG cvmfs public key. This also adds a new option, -n, to use an alternate provided list; currently the only choice for that is '-n xroots'.
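The fallback described above can be sketched roughly as follows. This is only an illustration, not the code in the pull request: the function name `load_cache_names`, the `fetch_default` callable standing in for the signed wlcg-wpad lookup, and the assumed caches.json layout (a JSON array of objects with a `name` field) are all hypothetical.

```python
import json


def load_cache_names(json_path=None, fetch_default=None):
    """Return cache names from an admin-supplied file or a default source.

    json_path     -- path given with -j, or None (hypothetical parameter name)
    fetch_default -- callable standing in for the signed wlcg-wpad lookup
                     (hypothetical; the real lookup is not shown here)
    """
    if json_path is not None:
        # Admin override: assumed layout is [{"name": "..."}, ...]
        with open(json_path) as handle:
            entries = json.load(handle)
        return [entry["name"] for entry in entries]
    if fetch_default is None:
        raise ValueError("no caches.json given and no default fetcher supplied")
    # Default path: delegate to whatever fetches and verifies the signed list
    return list(fetch_default())
```

The point of the sketch is the precedence: an explicit file always wins, and the remote list is consulted only when no file is given.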

DrDaveD commented 4 years ago

@djw8605 @efajardo @rynge please review.

DrDaveD commented 4 years ago

> I think a lot of this logic could be made more clear if there was a new class that handled all of the interaction with the special formatted file. Would have functions to verify the authenticity, handle getting the right lines...

There aren't currently any classes in the command, and I don't think it's a good idea to change the style that much for a small subset. Some of this could probably be put into functions that are grouped closer together in the source file. Would that do?

DrDaveD commented 4 years ago

@djw8605 This has stagnated. Can we proceed?

How about @brianhlin, do you have any comments?

djw8605 commented 4 years ago

How do we update the list of caches?

DrDaveD commented 4 years ago

By changing the list in CVMFS_EXTERNAL_URL in the master branch of the config-repo repository. The OSG cvmfs config repo is the osg branch and the EGI cvmfs config repo is the egi branch. The master branch is the superset of the two of them.

matyasselmeci commented 4 years ago

If I read the code correctly, it still supports admin overrides via a caches.json, right? Can we keep the old caches.json as an example? Rename it to caches.json.example and stick it under the docs directory?

efajardo commented 4 years ago

@matyasselmeci Yes, I support that. I still need that feature so I can use it for folks that do not have their caches in CVMFS, like CMS.

DrDaveD commented 4 years ago

I put it in docs/configs. I also removed the longitude & latitude information because they were being ignored.

DrDaveD commented 4 years ago

Ok, is anybody ready to approve this? Can it proceed?

DrDaveD commented 4 years ago

This has been sitting here long enough, somebody please approve.

brianhlin commented 4 years ago

At this point, these changes are over my head. @djw8605 @efajardo please look these changes over again and approve if appropriate.

DrDaveD commented 4 years ago

I moved JSON loading and stashserver loading into functions, as requested by @djw8605.

djw8605 commented 4 years ago

Another quick question. When it requests the list of stashcache servers in that file, does the list come back ordered nearest first?

DrDaveD commented 4 years ago

The first line of the stashservers.dat API response is the dynamically calculated order of servers, and the rest is the static, signed contents, which include all the servers. stashcp then applies the order and rearranges them nearest to farthest. It is done this way because the server names are sensitive: they have to be signature verified so that a man-in-the-middle attack cannot substitute them. Once we have the names, the ordering is not sensitive information, so it does not need to be verified. By putting both in one response, we keep communications overhead to a minimum.
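A minimal sketch of that split-then-reorder step, under stated assumptions: the order line is taken to be comma-separated 1-based indices, the signed body is taken to hold one server name per line, and `verify` is a caller-supplied stand-in for the real signature check. None of these details are confirmed by this thread, and the function names are hypothetical.

```python
def split_response(text):
    """Split a response into (unsigned order line, signed body)."""
    order_line, _, signed_body = text.partition("\n")
    return order_line.strip(), signed_body


def apply_order(servers, order_line):
    """Rearrange servers nearest to farthest.

    Assumes order_line holds comma-separated 1-based indices into servers.
    Bad or duplicate indices are skipped; unmentioned servers go last.
    """
    ordered = []
    seen = set()
    for token in order_line.split(","):
        token = token.strip()
        if not token.isdigit():
            continue
        idx = int(token) - 1
        if 0 <= idx < len(servers) and idx not in seen:
            ordered.append(servers[idx])
            seen.add(idx)
    # Keep servers the order line did not mention, in their signed order
    ordered.extend(s for i, s in enumerate(servers) if i not in seen)
    return ordered


def nearest_first(text, verify):
    """Verify only the signed body, then apply the unverified ordering."""
    order_line, signed_body = split_response(text)
    if not verify(signed_body):
        raise ValueError("signature check failed; refusing to use server names")
    # Assumed body layout: one server name per non-empty line
    servers = [line.strip() for line in signed_body.splitlines() if line.strip()]
    return apply_order(servers, order_line)
```

Because the ordering can only permute names that already passed verification, a tampered first line can at worst pick a poor order; it cannot introduce a server that was not in the signed list.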