Closed: cbourjau closed this issue 3 years ago
Hi @cbourjau, thanks for reporting the problem. HTTP requests for big files go through the EOSPUBLICHTTP gateway, and I see that byte range requests are not supported there:
$ curl -v -k -r 0-10 https://eospublichttp.cern.ch//eos/opendata/cms/luminosity/2010/2010lumi.txt
...
< HTTP/1.1 416 REQUESTED_RANGE_NOT_SATISFIABLE
This means that the whole file is streamed.
We'll investigate whether it could be due to EOSPUBLIC proxying changes, or simply look into alternative techniques.
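For readers reproducing the check above, here is a minimal sketch of how the status code of a ranged request can be interpreted. The helper name `classify_range_status` is hypothetical and not part of any portal tooling. The mapping follows standard HTTP semantics: 206 means the range was honoured, 200 means the server ignored the `Range` header and sent the full body, and 416 means the range was rejected.

```shell
# Hypothetical helper (an illustration, not part of the portal or EOS tooling):
# map the HTTP status code of a ranged request to a short verdict.
classify_range_status() {
  case "$1" in
    206) echo "partial: range honoured" ;;
    200) echo "full: Range header ignored, whole body sent" ;;
    416) echo "rejected: range not satisfiable" ;;
    *)   echo "other: $1" ;;
  esac
}
```

One possible way to use it is `classify_range_status "$(curl -s -k -o /dev/null -r 0-10 -w '%{http_code}' "$url")"`, where `$url` is whichever file is being tested.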
P.S. The opendata-dev.cern.ch server is currently being used for testing the forthcoming upgrade of the underlying digital repository software infrastructure behind the CERN Open Data portal, so you may have seen some changes recently. Whilst you could use opendata-qa.cern.ch instead, which runs the old code base equal to production, I think that both DEV and QA servers currently behave the same as far as byte range streaming is concerned, so the troubles you have observed are probably not related to the DEV vs QA vs PROD setup, but rather common to all.
Thanks for the quick feedback! Indeed, I see the same behavior at opendata.cern.ch and opendata-qa.cern.ch
We'll investigate whether it could be due to EOSPUBLIC proxying changes
Indeed it is, see the discussion on Mattermost. A new XRootD release and a deployment on EOSPUBLIC disk servers is necessary to fix this. Let's check afterwards!
@tiborsimko Is there a tracking issue for that new release?
I see that XRootD 5.1 was released just yesterday (https://xrootd.slac.stanford.edu/2021/02/23/announcement_5_1_0.html), so hopefully this will unblock things? We can ask the EOSPUBLIC team in a week or so...
Unfortunately, things still seem to be broken, @tiborsimko. Is there any news?
Yes, the deployment happened yesterday, but it looks like another patch is necessary, see the update on Mattermost...
Thanks for the update! Unfortunately, I do not have access to Mattermost since I'm not a CERN user anymore.
By the way, not knowing the details, I wonder if working on this issue might make fixing #2811 a drive-by?
@tiborsimko is there any news on this regression? Unfortunately, I don't have access to the above linked Mattermost.
@tiborsimko Sorry for being pushy! Are there any updates on this issue yet? It does tank large parts of the utility of the https://github.com/cbourjau/alice-rs project.
Hi @cbourjau, the EOS team has an EOSPUBLIC intervention scheduled for this Thursday, April 15th, at 07:00 CEST, which should also fix the remaining HTTP range request problems, so hopefully everything will be solved in a few days. Let's see!
Hi @cbourjau, just a quick heads-up that I've got an update that the intervention was postponed by a week to Wednesday April 21st at 07:00 CEST.
Thanks for the update, @tiborsimko !
@tiborsimko The intervention seems to have helped partially. Some requests are working as expected, while others still return 0 bytes without any indication of an error. This can be seen by running
curl -r 0-10 --no-progress-meter "http://opendata.web.cern.ch/eos/opendata/alice/2010/LHC10h/000139038/ESD/0001/AliESDs.root" | wc -c
a few times in a row. The printed number indicates the number of bytes downloaded: sometimes 0, sometimes 11. Is it possible that the update did not reach every node yet? Do you have some insight into what is going on?
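The manual repetition described above can be sketched as a small loop that tallies how often each byte count appears. The function name `probe_range` and its default of 10 runs are assumptions made for illustration. It uses `-s` rather than `--no-progress-meter` only so that it also works with older curl versions.

```shell
# Hypothetical helper: repeat a ranged download and tally the byte counts seen.
# Usage: probe_range URL [RUNS]   (RUNS defaults to 10, an arbitrary choice)
probe_range() {
  probe_url=$1
  probe_runs=${2:-10}
  probe_i=0
  while [ "$probe_i" -lt "$probe_runs" ]; do
    # -r 0-10 asks for bytes 0 through 10 inclusive, i.e. 11 bytes.
    # tr strips the padding that some wc implementations add.
    curl -s -r 0-10 "$probe_url" | wc -c | tr -d ' '
    probe_i=$((probe_i + 1))
  done | sort -n | uniq -c
}
```

Each output line has the form `<occurrences> <byte count>`. A healthy server should produce a single line with a byte count of 11, whereas a mix of 0 and 11 would be consistent with the intermittent behaviour reported here.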
It seems like this issue is fixed now. Thanks, @tiborsimko !
The title says it all. Doing
does not work as expected: 0 bytes are downloaded instead of the specified first 10. I am pretty sure that this used to work, since the CI pipeline of alice-rs failed due to broken partial downloads only a few days ago.
Looking at the headers, the server specifies that it would accept range requests. It would be great to have this fixed, since partial downloads are crucial to the above project.
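As a rough sketch of the header check mentioned above, the snippet below extracts the value of the `Accept-Ranges` response header from raw headers on standard input. The function name `accept_ranges_value` is hypothetical. A value of `bytes` means the server advertises byte range support, although, as this issue shows, that advertisement alone does not guarantee that ranged requests succeed.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# lowercased value of the Accept-Ranges header, if one is present.
accept_ranges_value() {
  # Header lines end in CRLF, so drop the carriage returns first.
  # Header names are case-insensitive, hence the tolower() comparison.
  tr -d '\r' | awk -F': *' 'tolower($1) == "accept-ranges" { print tolower($2) }'
}
```

It could be fed with, for example, `curl -sI "$url" | accept_ranges_value`, where `$url` stands for the file being tested.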