UCSC-MedBook / MedBook-JobRunner

Runs and monitors Jobs (currently UNIX processes; Galaxy and other environments coming)

Need to detect & recover from blob upload failure when parsing wrangled files #3

Open e-t-k opened 8 years ago

e-t-k commented 8 years ago

Note: this involves job-runner, wrangler, and wrangler-collections.

Errors were observed when uploading a submission in wrangler and parsing it. The ultimate cause was that the submission's blob had not uploaded correctly. A failed upload needs to be detected and recovered from. (How?)

Observed errors:

Error in client (/wrangler/editSubmission/ id ): Internal error encountered while parsing file

Error in console:

 job: rejected -  [TypeError: Object.keys called on non-object]
 stack trace: TypeError: Object.keys called on non-object
     at Function.keys (native)
     at RectangularGeneExpression.endOfFile (packages/medbook:wrangler-collections/fileHandlers/RectangularGeneExpression.js:118:1)
     at packages/medbook:wrangler-collections/fileHandlers/TabSeperatedFile.js:86:1
     at runWithEnvironment (packages/meteor/dynamics_nodejs.js:108:1)

Proximate cause (for the test case, a RectangularGeneExpression): in wrangler-collections, fileHandlers/TabSeperatedFile.js, bylineStream never receives any "data" events. As a result, parseLine is never called, and geneLabelIndex is never initialized to {} or populated; it remains undefined. bylineStream then receives the "end" event and endOfFile is called. endOfFile populates sortedGenes with var sortedGenes = Object.keys(this.geneLabelIndex).sort(); which throws the observed TypeError because this.geneLabelIndex is still undefined.
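
To make the failure mode concrete, here is a minimal sketch that reproduces it outside Meteor. Only parseLine, endOfFile and geneLabelIndex are names from the code above; SketchHandler, runHandler and the use of a plain EventEmitter in place of bylineStream are assumptions for illustration, not the actual wrangler-collections code.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the real file handler. Like the handler
// described above, it only creates geneLabelIndex once a line is parsed.
class SketchHandler {
  parseLine(line) {
    if (this.geneLabelIndex === undefined) {
      this.geneLabelIndex = {};
    }
    const geneLabel = line.split('\t')[0];
    this.geneLabelIndex[geneLabel] = true;
  }

  endOfFile() {
    // Throws a TypeError when no line was ever parsed, because
    // this.geneLabelIndex is still undefined at that point.
    return Object.keys(this.geneLabelIndex).sort();
  }
}

// Hypothetical driver: emits one "data" event per line, then "end",
// the same event sequence a line stream would deliver.
function runHandler(lines) {
  const handler = new SketchHandler();
  const stream = new EventEmitter();
  let sortedGenes;

  stream.on('data', (line) => handler.parseLine(line));
  stream.on('end', () => {
    sortedGenes = handler.endOfFile();
  });

  for (const line of lines) {
    stream.emit('data', line);
  }
  stream.emit('end');
  return sortedGenes;
}
```

Calling runHandler with at least one line returns the sorted gene labels. Calling runHandler([]) models the empty blob: "end" fires with no preceding "data", and endOfFile throws a TypeError.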

Fix: confirm that the blob is non-empty at some point before this process (use blob_line_count?). If it is empty, the simplest fix is to make the user deal with it: "Empty file uploaded. If your file was non-empty, please delete this submission and start over. If your file was intended to be empty, please don't upload empty files."
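
A rough sketch of what such a check could look like, assuming the line count is available as a number before parsing starts. ensureNonEmptyBlob and EMPTY_FILE_MESSAGE are hypothetical names, and how the count would actually be read (from blob_line_count or elsewhere) is not shown here.

```javascript
// User-facing message proposed in this issue.
const EMPTY_FILE_MESSAGE =
  'Empty file uploaded. If your file was non-empty, please delete this ' +
  "submission and start over. If your file was intended to be empty, " +
  "please don't upload empty files.";

// Hypothetical pre-parse guard. lineCount is assumed to be the number of
// lines in the uploaded blob; anything other than a positive integer is
// treated as a failed or empty upload.
function ensureNonEmptyBlob(lineCount) {
  if (!Number.isInteger(lineCount) || lineCount <= 0) {
    throw new Error(EMPTY_FILE_MESSAGE);
  }
  return lineCount;
}
```

Running this before the file handler starts would turn the opaque "Internal error encountered while parsing file" into a message the user can act on. It detects the failure but does not recover from it.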

e-t-k commented 8 years ago

If we can migrate entirely to blobs2 instead, we can ignore this.