xrootd / xrootd

The XRootD central repository https://my.cdash.org/index.php?project=XRootD
http://xrootd.org

xrdcl.unzip broken from v5.5.0 #1876

Closed rodwalker closed 1 year ago

rodwalker commented 1 year ago

Hi, pulling files from a zip archive is not working in the current release, e.g.

$ xrdcp -f root://xrootd-atlas.cr.cnaf.infn.it:1094//atlas/atlasdatadisk/rucio/data15_13TeV/37/7f/RAW.21051586._000001.zip.1?xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1 .
[0B/0B][100%][==================================================][0B/s]
Run: [ERROR] Received corrupted data: ZIP Central Directory corrupted. (source)

but it works for <= v5.4.3.

$ xrdcp -f root://xrootd-atlas.cr.cnaf.infn.it:1094//atlas/atlasdatadisk/rucio/data15_13TeV/37/7f/RAW.21051586._000001.zip.1?xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1 .
[6.349MB/6.349MB][100%][==================================================][3.174MB/s]

This prevents us from using the archives, so we need to revert to staging the many constituent files.

Cheers, Rod.

rodwalker commented 1 year ago

Maybe it is obvious, but I forgot to mention that this has nothing to do with the server side. If I copy the zip locally, it fails in the same way.

$ xrdcp RAW.21051586._000001.zip.1?xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1 .
[0B/0B][100%][==================================================][0B/s]
Run: [ERROR] Received corrupted data: ZIP Central Directory corrupted. (source)

adriansev commented 1 year ago

Hi @rodwalker! We (ALICE) use archives extensively for log files (and for data, but with compression 0), and so far I have not encountered any problem, including up to 5.5.2-rc1. Since I am also very interested in not having problems with this, could you post a dump log here for the developers (obtained with export XRD_LOGLEVEL='Dump' XRD_LOGFILE=xrdlog.txt)? Also, just to be certain, could you download the archive and run zip -T archive.zip to check that the archive is valid? Thanks a lot!
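For readers without the zip command line tool, a roughly comparable integrity check can be sketched with Python's standard zipfile module. This is only an illustration: the helper name archive_is_valid is hypothetical, the archive is a tiny in-memory one built for the example, and the check is not claimed to be identical to what zip -T does.

```python
import io
import zipfile


def archive_is_valid(path_or_file):
    """Return True if the archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path_or_file) as archive:
            # testzip() returns the name of the first corrupt member, or None.
            return archive.testzip() is None
    except zipfile.BadZipFile:
        # Raised when the end-of-central-directory record cannot be found or parsed.
        return False


# Build a tiny stored (compression 0) archive in memory to exercise the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("example.log", "hello")

print(archive_is_valid(buffer))
```

Passing a file path instead of the in-memory buffer checks an archive on disk in the same way.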

rodwalker commented 1 year ago

Hi,

Since it works with v5.4.3 I think/thought that rules out a problem with the zip file.

$ zip -T RAW.21051586._000001.zip.1
test of RAW.21051586._000001.zip.1 OK

I don't see a useful difference in the log. The offset and size are the same, but the open fails:

[0x23a7250@file://localhost/tmp/walkerr/RAW.21051586._000001.zip.1?xrdcl.requuid=4da9d0dc-4474-4d8f-8ff7-97c4126c7159&xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1] Got state response for message kXR_read (handle: 0x09000000, offset: 11824956046, size: 1050)
[0x23a70e0] CD records parsed.
[0x23a70e0] Failed to open a ZIP archive (file://localhost/tmp/walkerr/RAW.21051586._000001.zip.1?xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1): [ERROR] Received corrupted data

cf. a good one

[0x92baf0@file://localhost/tmp/walkerr/RAW.21051586._000001.zip.1?xrdcl.requuid=e3210b03-9b75-4446-9dd7-fcd5dc3fef8e&xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1] Got state response for message kXR_read (handle: 0x0c000000, offset: 11824956046, size: 1050)
[0x92b9a0] CD records parsed.
[0x92b9a0] Opened a ZIP archive (file://localhost/tmp/walkerr/RAW.21051586._000001.zip.1?xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1): [SUCCESS]

I unzip to get the 9 files, then zip with zip -0 RAW.21051586._000001.zip.zero data15* and now xrdcp works. The total file size is a little different

$ ls -l /tmp/RAW.21051586._000001.zip.1 /tmp/RAW.21051586._000001.zip.zero
-rw-r--r--. 1 walkerr zp 11824957194 10. Jan 17:40 /tmp/RAW.21051586._000001.zip.1
-rw-r--r--. 1 walkerr zp 11824957466 11. Jan 10:40 /tmp/RAW.21051586._000001.zip.zero

but the 'unzip -v' output is identical. They should be readable on lxplus737 for a while.

To summarise, there may be something odd with the zip file, but it works with zip/unzip and with older xrdcp versions.

Cheers, Rod.
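As a side note on the numbers quoted above, the read offset and size from the log can be checked against the file size from ls -l. The sketch below assumes the fixed record sizes of the ZIP format (22 bytes for the end-of-central-directory record without a comment, 56 bytes for the ZIP64 end-of-central-directory record without extensible data, 20 bytes for the ZIP64 locator). It only shows that the layout is consistent with a ZIP64 archive; it is an observation and not a diagnosis of the failure.

```python
# Values copied from the log excerpt and the `ls -l` output in this thread.
CD_OFFSET = 11824956046  # offset of the kXR_read that fetched the central directory
CD_SIZE = 1050           # size of that read
FILE_SIZE = 11824957194  # size of RAW.21051586._000001.zip.1

# Fixed record sizes from the ZIP format (assumption: no archive comment,
# no extensible data in the ZIP64 end-of-central-directory record).
ZIP64_EOCD_SIZE = 56
ZIP64_LOCATOR_SIZE = 20
EOCD_SIZE = 22

UINT32_MAX = 0xFFFFFFFF  # largest offset a classic (non-ZIP64) record can hold

# Bytes left in the file after the central directory.
trailer = FILE_SIZE - (CD_OFFSET + CD_SIZE)
expected_zip64_trailer = ZIP64_EOCD_SIZE + ZIP64_LOCATOR_SIZE + EOCD_SIZE

print(trailer)                            # 98
print(trailer == expected_zip64_trailer)  # True
print(CD_OFFSET > UINT32_MAX)             # True: the offset does not fit in 32 bits
```

The 98 trailing bytes match the three ZIP64 end records exactly, and the central directory offset is beyond the 4 GiB limit of the classic format, so a reader has to take the offset from the ZIP64 records.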

simonmichal commented 1 year ago

@rodwalker : thanks for reporting this problem! Could you please make available the file you used for testing somewhere in EOS?

rodwalker commented 1 year ago

Hi, is this OK?

$ xrdcp -f root://eosatlas.cern.ch:1094//eos/atlas/atlasscratchdisk/rucio/data15_13TeV/37/7f/RAW.21051586._000001.zip.1?xrdcl.unzip=data15_13TeV.00266904.debugrec_hlt.merge.RAW.g17._0001.1 .
[0B/0B][100%][==================================================][0B/s]
Run: [ERROR] Received corrupted data: ZIP Central Directory corrupted. (source)

Cheers, Rod.

simonmichal commented 1 year ago

@rodwalker : yes, that's perfect, I am now able to reproduce the problem!

simonmichal commented 1 year ago

@rodwalker : it should be fixed in a0ce75a75b47c7d62cb81da00174edbb467a0faa. @rodwalker & @adriansev : could you please give it a try?

simonmichal commented 1 year ago

I think we can close this one!