tus / tusd

Reference server implementation in Go of tus: the open protocol for resumable file uploads
https://tus.github.io/tusd
MIT License

Experiment with distributed upload locks #917

Open Acconut opened 1 year ago

Acconut commented 1 year ago

Currently we only support memory and file locks, whose reach is limited to a single machine. If you want to properly scale horizontally, a distributed lock is needed, such as Redis, Consul, etcd etc.

@Murderlon proposed the use of https://github.com/matryer/vice, which wraps multiple messaging services into a single interface. Maybe that is helpful. More information at https://medium.com/@matryer/introducing-vice-go-channels-across-many-machines-bcac1147d7e2.

Murderlon commented 1 year ago

After a small attempt myself, I don't think vice is a good fit for this use case. Redis can be used as a message broker, but messages are only held in memory and there is no delivery confirmation. Meaning, if uploader A sends a lock message over Redis and uploader B isn't listening at the time, the message is lost and the lock is stuck forever. The other message queue technologies supported by vice likely don't have this problem.

In the case of Redis, they designed an algorithm specifically for distributed locking called Redlock, which I'm going to try soon.
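
For readers unfamiliar with Redlock, here is a minimal sketch of its core idea: take the lock on a majority of independent nodes, each with an expiry, and roll back if no majority is reached in time. It is not tusd code and does not talk to Redis. The `instance` type is a hypothetical in-memory stand-in for one Redis node, and `setNX`, `deleteIfEqual`, `acquire`, `release` and `demo` are names invented for this illustration. Retries, clock drift compensation and random tokens are left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// instance is a hypothetical in-memory stand-in for one independent Redis node.
type instance struct {
	mu   sync.Mutex
	data map[string]entry
	down bool // simulates an unreachable node
}

type entry struct {
	token   string
	expires time.Time
}

func newInstance() *instance { return &instance{data: map[string]entry{}} }

// setNX mimics "SET key token NX PX ttl": store only if absent or expired.
func (i *instance) setNX(key, token string, ttl time.Duration) bool {
	i.mu.Lock()
	defer i.mu.Unlock()
	if i.down {
		return false
	}
	if e, ok := i.data[key]; ok && time.Now().Before(e.expires) {
		return false
	}
	i.data[key] = entry{token: token, expires: time.Now().Add(ttl)}
	return true
}

// deleteIfEqual removes the key only if it still holds our token.
func (i *instance) deleteIfEqual(key, token string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	if i.down {
		return
	}
	if e, ok := i.data[key]; ok && e.token == token {
		delete(i.data, key)
	}
}

// acquire succeeds only if a majority of nodes accepted the token
// before the TTL elapsed; otherwise partial acquisitions are rolled back.
func acquire(nodes []*instance, key, token string, ttl time.Duration) bool {
	start := time.Now()
	got := 0
	for _, n := range nodes {
		if n.setNX(key, token, ttl) {
			got++
		}
	}
	if got >= len(nodes)/2+1 && time.Since(start) < ttl {
		return true
	}
	release(nodes, key, token)
	return false
}

func release(nodes []*instance, key, token string) {
	for _, n := range nodes {
		n.deleteIfEqual(key, token)
	}
}

// demo runs three acquisition attempts for the same upload ID.
func demo() (first, second, third bool) {
	nodes := []*instance{newInstance(), newInstance(), newInstance()}
	nodes[2].down = true // one node unreachable, a majority is still available
	first = acquire(nodes, "upload-1", "server-A", time.Minute)
	second = acquire(nodes, "upload-1", "server-B", time.Minute)
	release(nodes, "upload-1", "server-A")
	third = acquire(nodes, "upload-1", "server-B", time.Minute)
	return
}

func main() {
	fmt.Println(demo())
}
```

Unlike the pub/sub approach, a server that was not listening when the lock was taken still sees the key when it tries to acquire, and a crashed holder's lock disappears once the TTL expires.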

Acconut commented 1 year ago

Thanks for the feedback! RedLock does look interesting, so let me know if you have some more insights.

gedw99 commented 1 year ago

https://github.com/matryer/vice never really materialized in terms of NATS integration for global concurrency use cases.

Instead, NATS has this ability built in, and it can be embedded. It's just Go.

The lock is just a KV entry in NATS.

kine, which stands in for an etcd cluster, can also use NATS to achieve global state.
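
To illustrate "the lock is just a KV entry", here is a minimal sketch built on a create-if-absent operation. Everything in it is an assumption made for illustration: `KV` is a hypothetical interface, not the NATS client API, `memKV` is an in-memory stand-in for a shared store, and `kvLocker`, `TryLock` and `Unlock` are invented names. A real setup would also need some expiry on the key so that a crashed server does not hold the lock forever.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrKeyExists is returned by Create when the key is already present.
var ErrKeyExists = errors.New("key exists")

// KV is a hypothetical minimal key-value interface with an atomic
// create-if-absent operation. It is not the API of any real client library.
type KV interface {
	Create(key string, value []byte) error
	Delete(key string) error
}

// memKV is an in-memory stand-in for a store shared by all servers.
type memKV struct {
	mu   sync.Mutex
	data map[string][]byte
}

func newMemKV() *memKV { return &memKV{data: map[string][]byte{}} }

func (m *memKV) Create(key string, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.data[key]; ok {
		return ErrKeyExists
	}
	m.data[key] = value
	return nil
}

func (m *memKV) Delete(key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.data, key)
	return nil
}

// kvLocker represents one server that takes upload locks through the store.
type kvLocker struct {
	kv    KV
	owner string
}

// TryLock reports whether this server now holds the lock for uploadID.
func (l kvLocker) TryLock(uploadID string) (bool, error) {
	err := l.kv.Create("locks."+uploadID, []byte(l.owner))
	if errors.Is(err, ErrKeyExists) {
		return false, nil // someone else holds the lock
	}
	return err == nil, err
}

// Unlock deletes the lock entry. For brevity it does not verify ownership.
func (l kvLocker) Unlock(uploadID string) error {
	return l.kv.Delete("locks." + uploadID)
}

// demo lets two servers compete for the same upload through one store.
func demo() (a1, b1, b2 bool) {
	store := newMemKV()
	a := kvLocker{kv: store, owner: "server-A"}
	b := kvLocker{kv: store, owner: "server-B"}
	a1, _ = a.TryLock("upload-1")
	b1, _ = b.TryLock("upload-1")
	_ = a.Unlock("upload-1")
	b2, _ = b.TryLock("upload-1")
	return
}

func main() {
	fmt.Println(demo())
}
```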

Acconut commented 1 year ago

Thanks, NATS does also look interesting on its own.

gedw99 commented 1 year ago

@Acconut

https://github.com/maxpert/marmot is a good starter.

It's designed to do data replication between SQLite databases, but the concepts apply to tracking locks etc. It can also embed the NATS server if you want.

The other way is that NATS has an Object Store, that can hold the chunks per user upload. When the upload is done, then you can extract it all out. https://github.com/a-h/natsjson uses the KV, which is a lot like using the Object Store.

Acconut commented 1 year ago

Interesting links, thank you. Let us know if you are interested in trying out NATS for tusd's locks and we can assist you :)

gedw99 commented 10 months ago

I am interested because I want to use it with Pocketbase, so that uploads can be resilient

https://github.com/maxpert/marmot/issues/101 Is a brainstorm on doing it with NATS. It already does SQLite replication and file replication is easy to add.

The upload aspect is also handled there but I don’t know if the tusd protocol is conflating things tbh.

Acconut commented 10 months ago

> I am interested because I want to use it with Pocketbase, so that uploads can be resilient

Great idea!

> The upload aspect is also handled there but I don’t know if the tusd protocol is conflating things tbh.

Not sure what you mean by this. Do you want to handle file uploads directly via NATS? If so, tusd might not be necessary for you.

gedw99 commented 10 months ago

Yes, you got it. NATS on the client side will chunk the file (or any data) up to the NATS server. If the client goes down, then when it comes back up it will ensure the file or data eventually gets to the NATS server. Once it gets to the server, it can then be pushed to tusd or anything else. NATS in this case acts as a plane on top of tusd.

NATS can currently be used with Pocketbase so that all data written to the Pocketbase SQLite DB is scaled out to all other Pocketbase SQLite servers. But its file uploads are not scaled out.

Acconut commented 10 months ago

Interesting approach, but this issue talks about a different use case for distributed systems. We want to look into how a distributed system can be used to lock upload resources and ensure exclusive access in the presence of multiple requests (see https://github.com/tus/tusd/blob/main/docs/locks.md). What you are proposing is something different AFAIU
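
To make "exclusive access in the presence of multiple requests" concrete, here is a minimal single-process sketch. It assumes, based on my reading of the linked locks documentation, that a new request may ask the current lock holder to release the lock and then wait for it. The `locker` and `holder` types and the `Lock`, `Unlock` and `demo` functions are hypothetical and are not tusd's actual implementation. A distributed variant would have to carry both the lock state and the release request across servers.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// holder records who currently owns a lock and how to ask them to let go.
type holder struct {
	requestUnlock func()        // called when another request wants the lock
	released      chan struct{} // closed once the holder has unlocked
}

// locker is a hypothetical in-memory lock table keyed by upload ID.
type locker struct {
	mu   sync.Mutex
	held map[string]*holder
}

func newLocker() *locker { return &locker{held: map[string]*holder{}} }

// Lock acquires the lock for id. If it is taken, the current holder is asked
// to release it, and the caller waits until that happens or ctx is done.
func (l *locker) Lock(ctx context.Context, id string, requestUnlock func()) error {
	for {
		l.mu.Lock()
		h, busy := l.held[id]
		if !busy {
			l.held[id] = &holder{requestUnlock: requestUnlock, released: make(chan struct{})}
			l.mu.Unlock()
			return nil
		}
		l.mu.Unlock()

		h.requestUnlock() // ask the current holder to stop its work
		select {
		case <-h.released: // holder let go, try again
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// Unlock releases the lock for id and wakes up waiting requests.
func (l *locker) Unlock(id string) {
	l.mu.Lock()
	h, ok := l.held[id]
	delete(l.held, id)
	l.mu.Unlock()
	if ok {
		close(h.released)
	}
}

// demo simulates a stalled request A being displaced by a new request B.
func demo() (aErr, bErr error) {
	l := newLocker()
	stop := make(chan struct{})
	var once sync.Once

	// Request A takes the lock and agrees to stop when asked.
	aErr = l.Lock(context.Background(), "upload-1", func() {
		once.Do(func() { close(stop) })
	})
	go func() {
		<-stop // A notices the release request
		l.Unlock("upload-1")
	}()

	// Request B arrives for the same upload and waits at most one second.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	bErr = l.Lock(ctx, "upload-1", func() {})
	return
}

func main() {
	fmt.Println(demo())
}
```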

gedw99 commented 10 months ago

Thanks for the link, @Acconut. I read it pretty quickly and it makes sense.

"the locks do not extend to every server". You can use NATS to do this so that all servers have full knowledge of any lock.

See k3s and Kine, which use NATS JetStream to do this as well.

NATS will decouple things.

You know the problem domain much better than me however.....