flowjs / flow.js

A JavaScript library providing multiple simultaneous, stable, fault-tolerant and resumable/restartable file uploads via the HTML5 File API.
Other
2.96k stars 346 forks source link

Can't upload empty file when using asyncReadFileFn hook [v3 regression] #375

Open bertrandg opened 2 years ago

bertrandg commented 2 years ago

Hi,

I'm using V3 branch since multiple months and it works well but i've discoved this small regression. I'm unable to send empty files when using asyncReadFileFn hook.

Investigating, it seems to come from there: https://github.com/flowjs/flow.js/blob/5591263e71426fb811f9495457380f81798a09af/src/FlowChunk.js#L341

From what i've seen convert to this fix it: if (data && (data.size > 0 || this.fileObj.size === 0)) {

But i would like to have a feedback from you @drzraf before making an MR.

AidasK commented 2 years ago

Maybe you could create a failing test to demonstrate this?

drzraf commented 2 years ago

From a quick look, the problem is that we must determinate between an empty regular file and a stream which finished reading (thus returning 0)

Why are you using the asyncRead feature/codepath with a simple regular file (for which I believe most processing hook could be synchronous)?

bertrandg commented 2 years ago

@AidasK yes I will try to add one.

@drzraf I'm not sure to understand, i'm using the asyncReadFileFn hook when I create my flowjs instance to being able to generate a sha1 hash of the chunk 'on the fly' (and soon to encrypt data) and it works well except for empty file sadly, I'm not using stream..

bertrandg commented 1 year ago

@AidasK @drzraf Hello, long time since last message, but back on it now! 😅

I'm currently playing on a fork > https://github.com/flowjs/flow.js/compare/v3...bertrandg:flow.js:v3 But when i build/commit it and then add it to my package.json like this: "@flowjs/flow.js": "git+https://github.com/bertrandg/flow.js.git#6b1cf6262160a35f344770e9c27a5970abb7f733",

It doesn't work at all, uploads never starts and no error displayed.. I just build dist folder using grunt exec:build, any idea what I'm doing wrong? Maybe some node/grunt/rollup/.. versions are bad but I don't know..

@drzraf Here is my usecase, I want a hash from each chunk:

asyncReadFileFn: async function asyncReadFileFn(flowFile: FlowFile, startByte: number, endByte: number, fileType, flowChunk: FlowChunk) {
    const bytes = await readFileChunk(flowFile.file, startByte, endByte);
    const hash = await getDataDigest(bytes, 'SHA-512');

    const f = this.filesDetails$.value[flowFile.uniqueIdentifier];
    const chunkIndex = flowFile.chunks.indexOf(flowChunk);
    f.chunkHashs[chunkIndex] = hash;

    return new Blob([bytes], {type: 'application/octet-stream'});
}
drzraf commented 1 year ago

Quoting from https://github.com/flowjs/flow.js/issues/353#issuecomment-1028498996

354 (#363) broke the testsuite but more importantly cause possible problem regarding asyncReadFileFn (see #368)

So the code was put in a instable stable at that commit.

This was a huge commit, it changed the semantics, the regression wasn't spotted soon enough but, more importantly, fixing it implies deeply architectural changes. (The problem was that going full-async would break the behavior inside xhr.request listeners and the recursion mechanism)

I'm sorry I didn't have the time to dig into this again more than when I stopped at #368 (when I understood that relying upon XHR recursive handlers was inherently incompatible with the async changes already merged in #363)

bertrandg commented 1 year ago

Hello @drzraf, Thanks for your answer

Yes I know huge v3 refacto change/break many things but that's not directly related to my issue there because I compare to a working (at least for my usecases) version using latest V3 branch commit : package.json > "@flowjs/flow.js": "git+https://github.com/flowjs/flow.js.git#5591263e71426fb811f9495457380f81798a09af"

So my current issue is more related to the build process, how did you build the lib (dist folder) in latest commits on v3 branch? Maybe I'm missing a step..

drzraf commented 1 year ago

You're right. dist/* were not regenerated (and committed) lately. Will consider that if we ever fix #368