CDLUC3 / dash

General repository for documents and communication for UC Dash project.
http://cdluc3.github.io/dash
MIT License

Problem with ingesting large objects #66

Closed: cpwillett closed this issue 9 years ago

cpwillett commented 9 years ago

I created a 5 GB object in dash-stg and submitted it to Merritt. The local ID is "rizuozyhun"; the object contains 10 files of 512 MB each. The status is "Sending To Merritt (Processing)", but Merritt never received it, which is puzzling.

There seems to be a problem with the zip container that is created. Can the rubyzip gem create zip containers larger than 4 GB? Testing the zip container with unzip -t reports the errors below, but I'm not sure this is the actual cause of the failure. Is there a way to tell whether the object was actually submitted to Merritt?

uc3-datashare-stg pwillett/test> unzip -t rizuozyhun.zip
Archive:  rizuozyhun.zip
warning [rizuozyhun.zip]:  4294967296 extra bytes at beginning or within zipfile
  (attempting to process anyway)
file #1:  bad zipfile offset (local header sig):  4294967296
 (attempting to re-compensate)
    testing: mrt-datacite.xml         OK
    testing: mrt-dc.xml               OK
    testing: mrt-dataone-manifest.txt   OK
    testing: 512MBTestObject01.blob   OK
    testing: 512MBTestObject02.blob   OK
    testing: 512MBTestObject03.blob   OK
    testing: 512MBTestObject04.blob   OK
    testing: 512MBTestObject05.blob   OK
    testing: 512MBTestObject06.blob   OK
    testing: 512MBTestObject07.blob   OK
    testing: 512MBTestObject08.blob   OK
file #12:  bad zipfile offset (local header sig):  1312108
  (attempting to re-compensate)
    testing: 512MBTestObject09.blob   OK
    testing: 512MBTestObject10.blob   OK
At least one error was detected in rizuozyhun.zip.
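
A note on the numbers in the output above: 4294967296 is exactly 2^32, and the classic (non-ZIP64) zip format stores sizes and offsets in 32-bit fields. The errors would be consistent with offsets wrapping around at the 4 GB boundary, although that is an inference and not something confirmed here. The following Python sketch only reproduces the arithmetic. It assumes that "512 MB" means 512 MiB, that the blobs are stored without any change in size, and that 1312108 bytes precede the point where the offset wraps; the helper names are hypothetical and nothing in it is taken from Dash, Merritt, or rubyzip.

```python
# Sketch only: the archive layout below is assumed for illustration.
FIELD_LIMIT = 2**32            # a 32-bit field holds values 0 .. 2**32 - 1
BLOB_SIZE = 512 * 1024**2      # assumes "512 MB" means 512 MiB


def stored_offset(true_offset: int) -> int:
    """Value a 32-bit offset field would hold if the writer let it wrap."""
    return true_offset % FIELD_LIMIT


def overflows_32_bit_field(value: int) -> bool:
    """True when a size or offset cannot fit in a 32-bit field."""
    return value >= FIELD_LIMIT


# Eight blobs alone reach the boundary reported by unzip.
print(8 * BLOB_SIZE)                         # 4294967296

# All ten blobs together are well past it.
print(overflows_32_bit_field(10 * BLOB_SIZE))  # True

# Hypothetical true offset of the ninth blob (file #12 in the listing):
# assumed 1312108 bytes of earlier data plus eight full blobs.
assumed_prefix = 1312108
print(stored_offset(assumed_prefix + 8 * BLOB_SIZE))  # 1312108
```

Under these assumptions the wrapped value matches the offset that unzip reports for file #12, which is why a missing ZIP64 extension looks like a plausible cause worth checking.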

cpwillett commented 9 years ago

I submitted the same 5 GB object as above in dash-stage, and it went through fine:
https://merritt-stage.cdlib.org/m/ark%3A%2F99999%2Ffk43203x1d
https://dash-stg.ucop.edu/xtf/view?docId=ucop/ark%2B%3D99999%3Dfk43203x1d/mrt-datacite.xml