flowjs / flow.js

A JavaScript library providing multiple simultaneous, stable, fault-tolerant and resumable/restartable file uploads via the HTML5 File API.
Other
2.96k stars 346 forks source link

Convert codebase to asynchronous code #354

Closed ilessiivi closed 2 years ago

ilessiivi commented 2 years ago

The way that the async functions are currently implemented has a couple of issues.

  1. Having both the synchronous and asyncronous versions of the same functions adds complexity to the codebase and makes flow.js more difficult to maintain and test.

  2. For the cutting edge projects, that want to use async functions without a care about older browsers, and which don't transpile their own code, the transpiling of flow.js adds quite a lot of unnecessary weight.

  3. For the projects that would like to implement the async functions and transpile the final code themselves, the way that the async-ness is detected with fn.constructor.name === "AsyncFunction", will not work (as mentioned already in #319) due to the way the async code usually ends up being transformed into regenerator functions.

In both of these imaginary projects, the alternative versions (synchronous in the first example, and async in the second) of the same functions are unnecessary to have.

What I would like to have is just the async set of functions (async addFile(s), async generateUniqueIdentifier, async initFileFn and async readFileFn without extra prefixes in their names), and async hooks and events. Async hooks would always be called with await, async events would not.

The most obvious breaking change for upgrading users is that addFile(s) would no longer return the added file or the list of files, but a Promise object containing them instead. For the higher level assignBrowse/assignDrop function users, the amount of changes needed to upgrade might actually be zero, since none of the hooked functions need await – now they just can have them.

I'm thinking that a third prepackaged version of flow.js would need to be distributed:


Still missing from this PR is the unifying of readFileFn and asyncReadFileFn – the latter would need to be renamed into the former.

Also I'd like to get some feedback on the two lines of code marked with // TODO in Flow.js regarding the changed filtering behavior of addFiles between v2 and v3. Is this what the console warning in Eventizer.js was added for? Perhaps the old behavior can still be saved with a cleaner solution?


I know there's a lot to go through here – sorry to drop such a massive PR on you @drzraf and @AidasK 😅

In case this isn't the direction either of you want to go with, I'm happy to submit individual pull requests on the more minor changes that have nothing to do with async/await.

AidasK commented 2 years ago

Welcome @ilessiivi to this project. I am totally on board with everything you have mentioned and this is the direction I would like to see this project going. You have deleted many lines and that’s awesome. Sync functions are a thing of the past and we should get rid of them.

I have reviewed your changes and it looks great, though @drzraf is more familiar with async changes, so let’s wait for his review too.

drzraf commented 2 years ago

1. About backward (in)compatibility

I agree with your points 1 and 3. About point 2, I doubt an hypothetical project which does not transpile its own code is significant. It's provided a second argument to drop backward compatibility. (If you assume bleeding-edge projects do transpile, they could import an hypothetical AsyncFlow class as well (and tree-shaking would remove all dependency upon older Flow.js code) that would allow keeping synchronous codebase as well.

Even though I went on great efforts to maintain compatibility, I add to break event names. So v2 users can't actually use v3. But quoting https://github.com/flowjs/flow.js/issues/319#issuecomment-758639954 An async-only codebase force users handling regular files (the majority) to use a Promise-based approach with no gain:

I'd agree with the voice of the majority (and understand all the possibility of further evolutions brought by such a cleanup), but I want to be sure these compatibility aspects are well understood/taken into account before synchronous behavior be totally dropped.

An alternative could have been to inherit / subclass Flow class, or similar code inheritance / mixins constructs to would more clearly separate both flavors. Don't you think there could be room for such a solution?

2. About the PR itself

It's a lot of nice individual commits, but bundled as a pretty big PR. Since it's already an deep functional change, it's best to not include anything not strictly related.

Do you mind creating the distinct PR for these individuals changes:

Once merged, you could rebase on top of latest v3 branch (include #350) and squash 9d94015d, 682090 and 7906f7c with 6820906d. The resulting list of commits (and overall diff) would be way more easier to review and test (from 20 down to 6 or 7 commits)

AidasK commented 2 years ago

In my opinion code base has to be as simple as possible so that it would be easier to develop it and to ensure that it’s bug free. So dropping sync approach would make it much more simpler and understandable for us and for future users. We don’t have much development power anyway, so reducing our code base might be beneficial for us. Also, I think it’s fine to use await/wait/then in all the cases where this approach might be needed. Upgrade might be hard, but for new users it’s no brainer. It’s easier to force it async than to redo-it to async later. So providing both approaches also might be frustrating for our users as they would have read the docs about both of them.

Flow.js v2 is stable enough and handles all required features and there is no need to upgrade it to v3 if it works. So I don’t think we should think much about backward compatibility. From my experience legacy code only holds community back, so we should not fear upgrading this.

@drzraf I know you have spent a lot of time maintaining backward compatibility, so I am leaving this decision for you as you know better if it hard or easy to maintain it. Also, great review!

ilessiivi commented 2 years ago

Thank you @drzraf, this is all very valuable feedback. Thanks @AidasK too for going through it!

About point 2, I doubt an hypothetical project which /does not transpile its own code/ is significant

I was thinking that a quick-and-dirty personal project or internal webpage might want this, but I suppose the file size wouldn't be a major consideration for them anyway and the transpiled and minified package will suffice, so fair enough.

Regarding how surprising certain behaviors are or how tricky the library sounds to use depends largely on the sample code provided, and granted, I did not document any of my changes yet in the readme or changelog. The documentation doesn't have to start with the complicated stuff though – the current simplest example with assignBrowse, assignDrop and hooks/events will work going forwards too.

I don't agree that there's no gain to awaiting in some of the hooks. The hooks become async only if the user decides to use async code in them, and I'm seeing just some new possibilities that this opens up: validating a single file in file-added or all at once in files-added, or pinging the server about a client-side filtered list that it can expect to receive soon in files-submitted.

I'm not saying that there are no downsides for upgrading users, but in my opinion those are manageable. When addFile or addFiles is called directly, it will need to have await or .then, and that code possibly needs to be transpiled, sure. Same for uploadNextChunk, later generateUniqueIdentifier, AsyncFlowFile's bootstrap and FlowChunk's uploadStreamChunk if the user needs their results after calling them manually. But that's all, isn't it? Everything else about async is opt-in.

That said, I don't want to make assumptions about the userbase of flow.js – I only heard of this project just two weeks ago and I've no idea what others are using it for. I can only speak for myself, my usage has been purely event-based in a web app where none of the async stuff could have a negative effect.

I wouldn't be opposed to having an AsyncFlow class (that in turn uses an AsyncEventizer) either, but it certainly wouldn't simplify the codebase. Switching to native async/await could reduce the amount of code and subclassing would mean rearranging what we currently have. For the flow.js user that ends up being just a different kind of an opt-in, albeit not a forced one.


On commit 64ab7fe and always emitting the catch-all event, that matches the behavior of v2, the synchronous hook of v3, and the way it's currently documented. I rewrote the function in the if-else way to avoid repeating the emitCatchAll() line, and to avoid changing the order in which the hook and event are fired: the hook's custom callback first and catch-all last. I've left it there for now as a duplicate call. (It'll be necessary to have a duplicate call after this PR.) (#359)

Let's leave 20bbf4c (async generateUniqueIdentifier) for later along with combining readFileFn/asyncReadFileFn since it's not an absolutely necessary change for the basic async functionality.

The rest of the PRs (#356, #357, #358, #360) are now submitted on the v3 branch. Thank you for laying them out so well!

drzraf commented 2 years ago

I hope you the best for the rebase. I'm eager to review this PR in its purest new shape.

drzraf commented 2 years ago

I went into the hurdle of rebasing your PR and ended up with #363 You'll find the substantial core of this PR inside 8ec1e55fdd on top of which I'm now polishing. If it sounds satisfying we could close this one which got outdated and continue working there?

(I'll wait for your confirmation)