tus / tus-node-server

Node.js tus server, standalone or integrable in any framework, with disk, S3, and GGC stores.
https://tus.io/
MIT License
795 stars 197 forks source link

@tus/server: add GCS locker #616

Open netdown opened 4 months ago

netdown commented 4 months ago

This PR is not complete yet, it misses unit tests (the code is tested), readme updates and changeset. Despite all that, I would like to ask you to review my approach first so I won't write needless tests. I have documented the process in detail, but feel free to ask questions.

changeset-bot[bot] commented 4 months ago

⚠️ No Changeset found

Latest commit: 0ce3a90574c4e1cff321f7279461a33baa1d1ce3

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

netdown commented 3 months ago

Indeed I haven't even thought about using this locker with a store other than GCS. In my case, the storage bucket and the locker bucket is the same, and I think the only case they should be separated is when the storage bucket is not in standard storage class.

Anyways, I'm not sure i.e. Firestore would greatly overperform GCS in case of different storage. Regarding region latency, the user should be aware of that and choose a suitable region. Of course a redis based implementation would be much better, but this may be a considerable alternative until thats not implemented.

Shall I move this locker to the gcs-store package to suggest the primary application?

netdown commented 3 months ago

GCS is strongly consistent, but indeed concurrency was not ensured in my previous approach. I have reworked the code based on this article.

Note that I had to upgrade @google-cloud/storage because previous version was missing a type export. Also, this feature should be moved to a separate package or into gcs-store, as I'm importing from @google-cloud/storage.

Murderlon commented 3 months ago

Really nice article, thanks for sharing. It does also say this:

A distributed lock like this is best suited for batch workloads. Such workloads typically take seconds to tens or even hundreds of seconds.

But here we are using it for individual uploads, not batches. Or even smaller with a resumed uploads (or where a client sets chunkSize on the client). It's probably fine, but hard to tell when this becomes a bottle neck. Have you ran some tests with this locally to your satisfaction?

netdown commented 3 months ago

For the last 10 days it has been running in production without problems. We have about 5000 uploads per day. In e2e tests it was indeed slightly slower for 140 files compared to xhr, but I could easily compensate this by increasing the number of parallel uploads. If I measure individual uploads, the time elapsed between lock and unlock is mostly 20-400 ms in case of memory locker, and 300-400 for gcs locker.

Murderlon commented 3 months ago

That's great to hear! I'm in favor adding this into the package then.

Acconut commented 3 months ago

GCS is strongly consistent, but indeed concurrency was not ensured in my previous approach. I have reworked the code based on this article.

Thank you for the article, I will have a look at it! I am wondering if S3 has similar capabilities and a locker can be implemented nowadays ontop of it as well.

Murderlon commented 2 months ago

@netdown still interested in getting this over the finish line?

netdown commented 2 months ago

Yes, but I've been busy the last few weeks and I expect the same at least until July. Feel free to complete the PR if you have the time.

socket-security[bot] commented 1 month ago

New and removed dependencies detected. Learn more about Socket for GitHub ↗︎

Package New capabilities Transitives Size Publisher
npm/@google-cloud/storage@7.12.0 environment, filesystem Transitive: network, shell +67 6.11 MB google-wombot

🚮 Removed packages: npm/@google-cloud/storage@6.12.0

View full report↗︎