saljam / webwormhole

Peer authenticated WebRTC.
BSD 3-Clause "New" or "Revised" License

resume transfers #14

Open saljam opened 4 years ago

saljam commented 4 years ago

It would be nice to figure out a way to resume interrupted transfers.

Maybe after sending the file metadata, the sender could wait for the other side to reply with an "offset". In browsers, a service worker on the receiving side could translate the Range header into that offset.
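A minimal sketch of the Range-to-offset translation, assuming the Range header is actually visible to the code handling the download and that only open-ended requests such as `bytes=1024-` need to be supported. The function name `range_to_offset` is made up for illustration.

```python
import re
from typing import Optional

# Matches an open-ended single-range request such as "bytes=1024-".
_RANGE_RE = re.compile(r"^bytes=(\d+)-$")


def range_to_offset(range_header: Optional[str]) -> int:
    """Return the byte offset to resume from, or 0 for a fresh download."""
    if not range_header:
        # No Range header: the browser wants the file from the start.
        return 0
    match = _RANGE_RE.match(range_header.strip())
    if match is None:
        # Closed ranges, suffix ranges and multi-range requests are
        # deliberately out of scope for this sketch.
        raise ValueError(f"unsupported Range header: {range_header!r}")
    return int(match.group(1))
```

The returned integer is what the receiver would send back as its "offset" reply.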

oridb commented 4 years ago

Do we need to wait for a reply? If we're resuming a file, we already know what we're getting. (But, again, this is sounding a lot like a reinvention of HTTP: is it possible to just run HTTP on this?)

saljam commented 4 years ago

You're right. If the receiver initiates the transfer by asking for a file at an offset, then we only need one message. Thinking about the web version, the sender can be prompted to reopen the file on their end to resume.

This does leave open the question of which wormhole to request the file over when several wormholes are open in different tabs, since there's only one Service Worker handling the browser's download request.
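Roughly, the single-message version could look like the sketch below. The message fields (`name`, `offset`), the function names and the chunk size are all assumptions for illustration, not the existing wire format.

```python
import json
from typing import BinaryIO, Iterator

CHUNK_SIZE = 16 * 1024  # hypothetical chunk size, chosen arbitrarily


def make_request(name: str, offset: int) -> str:
    """Receiver side: the one message that starts or resumes a transfer."""
    return json.dumps({"name": name, "offset": offset})


def serve_request(message: str, source: BinaryIO) -> Iterator[bytes]:
    """Sender side: seek to the requested offset and yield the remaining bytes."""
    request = json.loads(message)
    offset = int(request["offset"])
    if offset < 0:
        raise ValueError("offset must be non-negative")
    source.seek(offset)
    while True:
        chunk = source.read(CHUNK_SIZE)
        if not chunk:
            break  # end of file
        yield chunk
```

A fresh transfer is just the same request with `offset` set to 0, so the sender needs no separate code path for resuming.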

HTTP on this would be a neat trick, I'd love to see that. I see two challenges:

WebRTC DataChannels are already message-oriented, and browsers have the facilities to make it easy to start with a JSON message and follow with data.

saljam commented 4 years ago

More thoughts on resume and browsers. My thinking was to make it work even if both ends start a new session, but I'm coming around to abandoning that approach now.

That's because:

If we maintain some session state (receiver: state of rolling hash, offset, random identifier for the file. sender: state of rolling hash, open File object) across reconnects, we won't have these issues.
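A sketch of what that in-memory state might look like. It uses an incremental SHA-256 as a stand-in for the "rolling hash", and for brevity the sender recomputes the prefix hash from its open file instead of keeping its own running hash. The class, field and function names are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass, field
from typing import Any, BinaryIO, Dict


@dataclass
class ReceiverState:
    """Receiver-side state kept in memory across reconnects."""
    file_id: str = field(default_factory=lambda: secrets.token_hex(8))
    offset: int = 0
    digest: Any = field(default_factory=hashlib.sha256)  # running hash object

    def write(self, chunk: bytes) -> None:
        """Account for a chunk that has been received and written out."""
        self.digest.update(chunk)
        self.offset += len(chunk)

    def resume_point(self) -> Dict[str, Any]:
        """What the receiver would send to the sender after reconnecting."""
        return {"id": self.file_id, "offset": self.offset,
                "hash": self.digest.hexdigest()}


def prefix_matches(source: BinaryIO, point: Dict[str, Any]) -> bool:
    """Sender side: check that the file's first `offset` bytes hash the same."""
    digest = hashlib.sha256()
    source.seek(0)
    remaining = point["offset"]
    while remaining > 0:
        chunk = source.read(min(64 * 1024, remaining))
        if not chunk:
            return False  # file is shorter than the receiver's offset
        digest.update(chunk)
        remaining -= len(chunk)
    return digest.hexdigest() == point["hash"]
```

If `prefix_matches` returns false, the safe fallback is to restart the transfer from offset 0.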

I'm nervous about storing this stuff in LocalStorage or cookies. Maybe add a "reconnect" button when a session breaks, and it keeps state from the previous connection. If a user closes the tab then it's gone and they can't resume any more. I can live with that.

The identifier can be embedded in the download URL from the service worker, so it knows which session to resume from.
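For illustration, one possible URL layout, with a made-up `/_download/` prefix and made-up helper names; the real download path may look nothing like this.

```python
from typing import Tuple
from urllib.parse import quote, unquote, urlsplit

PREFIX = "/_download/"  # hypothetical path intercepted by the service worker


def download_url(session_id: str, filename: str) -> str:
    """Build a download path that carries the session identifier."""
    return f"{PREFIX}{quote(session_id, safe='')}/{quote(filename, safe='')}"


def parse_download_url(url: str) -> Tuple[str, str]:
    """Recover (session_id, filename) from a path or full download URL."""
    path = urlsplit(url).path
    if not path.startswith(PREFIX):
        raise ValueError("not a download URL")
    # Both parts are percent-encoded, so the first "/" is the separator.
    session_id, sep, filename = path[len(PREFIX):].partition("/")
    if not sep or not session_id or not filename:
        raise ValueError("malformed download URL")
    return unquote(session_id), unquote(filename)
```

Because the identifier travels in the URL itself, a single service worker can route each download request to the right session without any extra lookup state.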

None of this applies to the command line tool which can just read the file.

saljam commented 4 years ago

Some thoughts on resuming sessions without having to input a new code.

Assume the signal server, whenever a new slot is used, also assigns it a unique "session" identifier. Assume we can exchange messages over this session id the same way we exchange them over a slot.

If a session breaks, both hosts can try to reconnect by exchanging new WebRTC session descriptions (sboxed by the same initial key) at the session id, instead of the slot id, which should by now be long relinquished.

We actually do generate "session ids" (as the ETag to ensure old requests for a slot don't interfere with newer ones) but we can't exchange messages using them alone. That can be fixed.
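A toy in-memory model of the idea, only to make the slot versus session distinction concrete. It is not the actual signalling server; the class and method names are invented, and the payloads are treated as opaque, already-sealed bytes.

```python
import secrets
from collections import deque
from typing import Deque, Dict, Optional


class SignalServer:
    """Toy model: slots are short-lived, session ids outlive them."""

    def __init__(self) -> None:
        self._slots: Dict[str, str] = {}              # slot -> session id
        self._mailboxes: Dict[str, Deque[bytes]] = {}  # session id -> messages

    def claim(self, slot: str) -> str:
        """Claim a slot and get back a fresh, unguessable session id."""
        if slot in self._slots:
            raise KeyError(f"slot {slot!r} is already in use")
        session_id = secrets.token_urlsafe(16)
        self._slots[slot] = session_id
        self._mailboxes[session_id] = deque()
        return session_id

    def release(self, slot: str) -> None:
        """Free the slot for reuse; the session mailbox stays reachable."""
        self._slots.pop(slot, None)

    def post(self, session_id: str, sealed: bytes) -> None:
        """Leave a sealed message for the peer; unknown ids raise KeyError."""
        self._mailboxes[session_id].append(sealed)

    def poll(self, session_id: str) -> Optional[bytes]:
        """Fetch the oldest pending message, or None if there is none."""
        box = self._mailboxes[session_id]
        return box.popleft() if box else None
```

After `release`, the short slot number can be handed to someone else, while the two original peers can still exchange new session descriptions through the session id.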

eitau commented 4 years ago

On Firefox, if you try to resume a download using the browser UI after the connection has been broken, it sends a request that bypasses the ServiceWorker (the request doesn't even include a Range header, because Firefox does not consider requests intercepted by a SW to be 'resumable'). Also, in general the Range header is not exposed to the SW: https://github.com/w3c/ServiceWorker/issues/1201#issuecomment-332916761