ganga-devs / ganga

Ganga is an easy-to-use frontend for job definition and management
GNU General Public License v3.0

accessURL method not doing the right thing for DiracFile #654

Closed egede closed 8 years ago

egede commented 8 years ago

The accessURL method for DiracFile seems to just return an empty string. It should instead return a URL that can be opened directly in ROOT or similar. See https://lhcbqa.web.cern.ch/lhcbqa/317/how-to-automatically-replicate-to-cern-user for a discussion of what a user would have to do now to get it to work. The example given there just returns the URL for the first place the file is replicated. One might want to do something slightly cleverer.

rob-c commented 8 years ago

Does it make sense for the method to accept an argument that steers the file type to know where to look when the file is at multiple locations? I know it doesn't for other file types, but I don't think it's too bad to have DiracFile overload the method and allow an optional argument to be passed. I think DiracFile is the unique one (for now) which maps one LFN to multiple PFNs, but how are we to know which SE we should access? I think it's sensible to use the defaultSE for the DiracFile to first order, and then allow it to be overridden so people can use this for testing file access and such if required.

egede commented 8 years ago

@rob-c I think using defaultSE from the configuration as the default, and allowing an extra argument, is fine. The important thing is that a URL should be returned even if it does not match those (i.e. the argument gives a preference but not a requirement). We have to teach users that it is perfectly fine to read files across the WAN.
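
A minimal sketch of the "preference, not requirement" behaviour described in this comment, in plain Python. The `replicas` mapping (SE name to URL), the function name `pick_access_url`, and the example SE names and URLs are all hypothetical; this is not the DiracFile implementation and it calls no DIRAC API.

```python
def pick_access_url(replicas, preferred_se=None, default_se=None):
    """Return one URL from `replicas` (hypothetical dict: SE name -> URL).

    The preferred SE is tried first, then the configured default SE.
    If neither holds a replica, another available URL is returned, so the
    argument expresses a preference rather than a requirement.
    """
    if not replicas:
        raise ValueError("no replicas available for this file")
    for se in (preferred_se, default_se):
        if se is not None and se in replicas:
            return replicas[se]
    # Fall back to the first replica in sorted SE order (deterministic).
    return replicas[sorted(replicas)[0]]


# Made-up SE names and URLs, for illustration only.
example_replicas = {
    "SITE-A-USER": "root://se.site-a.example//path/to/file.root",
    "SITE-B-USER": "root://se.site-b.example//path/to/file.root",
}
```

Calling `pick_access_url(example_replicas, preferred_se="SITE-C-USER")` still returns a URL even though no replica exists at the requested SE, which is the behaviour asked for above.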

rob-c commented 8 years ago

@egede OK, I hadn't appreciated that we are promoting/enabling accessing files remotely over the streaming protocols. This could ultimately allow Ganga to be used to construct some test jobs which can monitor/debug file transfer across DIRAC or to/from a given site.

egede commented 8 years ago

@rob-c It works really well and, if promoted, can avoid the often needless transfers from Dirac storage to EOS or to local disk. As can be seen from the original discussion (linked in the first post of this issue), it is a bit cumbersome to get right at the moment.

mesmith75 commented 8 years ago

I'm not sure this function in Ganga ever worked. The requisite DiracCommand was missing. I have it approximately working now. I thought I would just get it to return a list of the SEs and the URL for each, rather than picking one.

Before I commit anything, can you explain the reason for the matching between the SE and the site? Is it really necessary? If not, I would like to get rid of that part of the code. I am referring to about lines 517 to 531. https://github.com/ganga-devs/ganga/blob/develop/python/GangaDirac/Lib/Files/DiracFile.py#L517

As an aside, the defaultSE element is returning nothing. Is that to be expected?

rob-c commented 8 years ago

@mesmith75 Reading the code I see what it was trying to do but I think this fell by the wayside as the (LHCb)Dirac interface changed a while ago.

I think it would be better to return a list of URLs with the one corresponding to the defaultSE at the top. The problem is the definition of the defaultSE: it is used in different contexts to mean different things. Ideally I think we want this to point to the local SE, wherever that is, but we opted to change the defaultSE directly in the DiracFile config. Normally we just guess randomly which SE we want to use. Here we want to default to CERN or equivalent (if available), so I think it is just a case of sorting by SE and returning a flat list to be consistent with other interfaces (i.e. MassStorageFile).
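
The ordering suggested here could look roughly like the following. This is only a sketch: the `replicas` dict (SE name to URL) and the function name `ordered_access_urls` are assumptions made for illustration, not the interface of DiracFile or MassStorageFile.

```python
def ordered_access_urls(replicas, default_se=None):
    """Return a flat list of URLs, with the default SE's URL first if present.

    `replicas` is a hypothetical dict mapping SE name -> URL.
    """
    # Sort key: the default SE compares as False (sorts first);
    # the remaining SEs are ordered alphabetically by name.
    ordered = sorted(replicas, key=lambda se: (se != default_se, se))
    return [replicas[se] for se in ordered]
```

If the default SE holds no replica, the list is simply sorted by SE name, so a caller always gets every available URL back.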

rob-c commented 8 years ago

OK, at this point the best we can do is to pick a random SE from the list of SEs where an LFN has been replicated. Maybe we can bias the random selection a bit by weighting sites with more disks more heavily than sites with fewer disks.
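
A weighted random choice of this kind can be sketched with the standard library's `random.choices` (Python 3.6+). The `se_weights` mapping and the idea of using a per-SE disk count as the weight are assumptions for illustration; nothing here queries DIRAC for real site information.

```python
import random


def pick_weighted_se(se_weights, rng=random):
    """Pick one SE name at random, biased by its weight.

    `se_weights` is a hypothetical dict mapping SE name -> non-negative
    weight (for example, a disk count). At least one weight must be > 0.
    """
    names = list(se_weights)
    weights = [se_weights[name] for name in names]
    # random.choices returns a list of k items; take the single pick.
    return rng.choices(names, weights=weights, k=1)[0]
```

An SE with weight 0 is never selected, and an SE with twice the weight of another is selected about twice as often over many draws.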

I think we want this to allow for accessing a grid file on the batch system at CERN, for instance, but there is no nice way of requesting the replicas at CERN first unless we add a new configuration parameter. Either way, if the file doesn't exist at the site you want, you should then get the cross-site address to access the file. Can Gaudi jobs take multiple addresses for a file? If so, the selection of where it gets the file from can be shifted onto them rather than being something we need to worry about.

After speaking to the DIRAC experts at Imperial, there is no nice API that we know of to get the replica closest to the current machine. (I'm keen to stick to vanilla DIRAC solutions as much as possible in DiracFile.)

mesmith75 commented 8 years ago

The function is now implemented in pull request #684. There is a new DiracCommand for getting the accessURL, and the function in python/GangaDirac/Lib/Files/DiracFile.py has been fixed.

mesmith75 commented 8 years ago

Fixed by #684