cernopendata / opendata.cern.ch

Source code for the CERN Open Data portal
http://opendata.cern.ch/
GNU General Public License v2.0
HTTP requests with byte range seem to be broken #3074

Closed cbourjau closed 3 years ago

cbourjau commented 3 years ago

The title says it all. Doing

curl -H "range: bytes=0-10" http://opendata-dev.cern.ch/eos/opendata/alice/2010/LHC10h/000139038/ESD/0001/AliESDs.root --output testfile.root

does not work as expected: 0 bytes are downloaded instead of the requested first 11 bytes (offsets 0 through 10). I am pretty sure that this used to work, since the CI pipeline of alice-rs started failing due to broken partial downloads only a few days ago.

Looking at the response headers, the server advertises that it accepts range requests. It would be great to have this fixed, since partial downloads are crucial to the above project.
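As a rough illustration of the header check mentioned above, the following sketch tests whether a response advertises byte-range support through an `Accept-Ranges: bytes` header. The function name `advertises_byte_ranges` is hypothetical and not part of any tool used in this thread. As this issue shows, an advertised capability does not guarantee that range requests actually work.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the HTTP header text on stdin contains
# an "Accept-Ranges: bytes" line. Header names are case-insensitive, and
# header lines end in CRLF on the wire, so carriage returns are stripped.
advertises_byte_ranges() {
    tr -d '\r' | grep -iqE '^accept-ranges:[[:space:]]*bytes[[:space:]]*$'
}

# Usage against a live server (sends a HEAD request; not executed here):
#   curl -sI "$url" | advertises_byte_ranges && echo "ranges advertised"
```

This only inspects what the server claims. Whether a range is honoured has to be checked separately, by sending a range request and looking at the status code and body size.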

tiborsimko commented 3 years ago

Hi @cbourjau, thanks for reporting the problem. HTTP requests for big files go through the EOSPUBLICHTTP gateway, and I see that byte range requests are not supported there:

$ curl -v -k -r 0-10 https://eospublichttp.cern.ch//eos/opendata/cms/luminosity/2010/2010lumi.txt
...
< HTTP/1.1 416 REQUESTED_RANGE_NOT_SATISFIABLE

This means that the whole file is streamed.
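For readers reproducing this check, here is a minimal sketch that classifies the outcome of a range request from its HTTP status code and the number of bytes received. It assumes the standard HTTP semantics: 206 for partial content, 200 when the server ignores the range and sends the full body, and 416 when the range cannot be satisfied. The function names and result labels are hypothetical, and only closed ranges such as `0-10` are handled.

```shell
#!/bin/sh
# Number of bytes in an inclusive, closed byte range "first-last".
# Open-ended ranges such as "500-" are not supported by this sketch.
range_length() {
    first=${1%-*}
    last=${1#*-}
    echo $((last - first + 1))
}

# Classify a range response.
#   $1: HTTP status code, $2: bytes received, $3: requested range (e.g. 0-10)
# Note: a 206 response may legitimately be shorter than requested when the
# range extends past the end of the file; that case is not distinguished.
check_range_response() {
    expected=$(range_length "$3")
    case "$1" in
        206)
            if [ "$2" -eq "$expected" ]; then
                echo "ok"
            else
                echo "unexpected-length"
            fi
            ;;
        200) echo "range-ignored" ;;
        416) echo "range-not-satisfiable" ;;
        *)   echo "unexpected-status" ;;
    esac
}

# Hypothetical wrapper: fetch a range with curl and classify the result.
# -w prints the status code and the downloaded size after the transfer.
probe_range() {
    out=$(curl -s -r "$2" -o /dev/null -w '%{http_code} %{size_download}' "$1")
    check_range_response "${out% *}" "${out#* }" "$2"
}
```

A call such as `probe_range "$url" 0-10` would print one of the labels above. Redirects are not followed, so a redirecting server shows up as `unexpected-status`.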

We'll investigate whether it could be due to EOSPUBLIC proxying changes, or whether we should look into alternative techniques.

P.S. The opendata-dev.cern.ch server is currently being used to test the forthcoming upgrade of the digital repository software infrastructure behind the CERN Open Data portal, so you may have seen some changes recently. You could use opendata-qa.cern.ch instead, which runs the old code base identical to production. However, I think that the DEV and QA servers currently behave the same as far as byte range streaming is concerned, so the troubles you have observed are probably not related to the DEV vs QA vs PROD setup, but rather common to all.

cbourjau commented 3 years ago

Thanks for the quick feedback! Indeed, I see the same behavior at opendata.cern.ch and opendata-qa.cern.ch

tiborsimko commented 3 years ago

> We'll investigate whether it could be due to EOSPUBLIC proxying changes

Indeed it is, see the discussion on Mattermost. A new XRootD release and a deployment on EOSPUBLIC disk servers is necessary to fix this. Let's check afterwards!

cbourjau commented 3 years ago

@tiborsimko Is there a tracking issue for that new release?

tiborsimko commented 3 years ago

I see that XRootD 5.1 was released just yesterday https://xrootd.slac.stanford.edu/2021/02/23/announcement_5_1_0.html so hopefully this will unblock things? We can ask the EOSPUBLIC team in a week or so...

cbourjau commented 3 years ago

Unfortunately, things still seem to be broken, @tiborsimko. Is there any news?

tiborsimko commented 3 years ago

Yes, the deployment happened yesterday, but it looks like another patch is necessary, see the update on Mattermost...

cbourjau commented 3 years ago

Thanks for the update! Unfortunately, I do not have access to Mattermost since I'm not a CERN user anymore.

By the way, not knowing the details, I wonder whether working on this issue might make fixing #2811 a drive-by fix?

cbourjau commented 3 years ago

@tiborsimko is there any news on this regression? Unfortunately, I don't have access to the above linked Mattermost.

cbourjau commented 3 years ago

@tiborsimko Sorry for being pushy! Are there any updates on this issue yet? It does tank large parts of the utility of the https://github.com/cbourjau/alice-rs project.

tiborsimko commented 3 years ago

Hi @cbourjau, the EOS team has an EOSPUBLIC intervention scheduled for this Thursday, April 15th, at 07:00 CEST, which should also fix the remaining HTTP range request problems, so hopefully everything will be solved in a few days. Let's see!

tiborsimko commented 3 years ago

Hi @cbourjau, just a quick heads-up that I've got an update that the intervention was postponed by a week to Wednesday April 21st at 07:00 CEST.

cbourjau commented 3 years ago

Thanks for the update, @tiborsimko !

cbourjau commented 3 years ago

@tiborsimko The intervention seems to have helped partially. Some requests are working as expected, while others still return 0 bytes without any indication of an error. This can be seen by running

curl -r 0-10 --no-progress-meter "http://opendata.web.cern.ch/eos/opendata/alice/2010/LHC10h/000139038/ESD/0001/AliESDs.root"  | wc -c

a few times in a row. The printed number indicates the number of bytes downloaded: sometimes 0, sometimes 11. Is it possible that the update has not reached every node yet? Do you have some insight into what is going on?
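The repeated check described above can be scripted so that the distribution of results is visible at a glance. The sketch below repeats the same range request and tallies how often each byte count occurs. The function names `tally_sizes` and `repeat_range_request` are hypothetical. A mix of 0 and 11 would be consistent with, but does not prove, different backend nodes answering differently.

```shell
#!/bin/sh
# Summarise byte counts read from stdin (one number per line) as
# "<size> bytes: <count> times", sorted by size.
tally_sizes() {
    sort -n | uniq -c | awk '{ printf "%s bytes: %s times\n", $2, $1 }'
}

# Hypothetical helper: request bytes 0-10 of a URL N times (default 20)
# and tally the downloaded sizes reported by curl.
#   $1: URL, $2: number of attempts
repeat_range_request() {
    url=$1
    n=${2:-20}
    i=0
    while [ "$i" -lt "$n" ]; do
        curl -s -r 0-10 -o /dev/null -w '%{size_download}\n' "$url"
        i=$((i + 1))
    done | tally_sizes
}

# Example invocation (performs network requests; not executed here):
#   repeat_range_request "http://opendata.web.cern.ch/eos/opendata/alice/2010/LHC10h/000139038/ESD/0001/AliESDs.root" 20
```

Tallying the sizes, instead of eyeballing individual runs, makes an intermittent failure rate easier to report back.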

cbourjau commented 3 years ago

It seems like this issue is fixed now. Thanks, @tiborsimko !