juliangruber opened this issue 9 years ago
The problem is tracking which files are going over which channel.
that's not a problem. we could just split the array of files into chunks and assign a chunk to each connection. very similar to how it works now, just with multiple connections.
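for illustration, splitting the file list across connections could look something like this — just a sketch, not the actual cp-mux code, and `chunk` is a hypothetical helper:

```javascript
// Split an array of files into `n` roughly equal chunks,
// one chunk per connection. Round-robin keeps the chunks balanced
// even when file counts don't divide evenly.
function chunk (files, n) {
  var chunks = []
  for (var i = 0; i < n; i++) chunks.push([])
  files.forEach(function (file, i) {
    chunks[i % n].push(file)
  })
  return chunks
}

// e.g. 5 files over 2 connections:
// chunk(['a', 'b', 'c', 'd', 'e'], 2)
// -> [['a', 'c', 'e'], ['b', 'd']]
```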
how does the server know how many connections to expect?
over every connection, the simplified protocol could be | file meta | file content |
no need to actually split files into chunks or to multiplex individual connections
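a minimal way to encode that `| file meta | file content |` frame would be a length-prefixed meta header followed by the raw bytes — hypothetical encoding for illustration, not a real cp-mux wire format:

```javascript
// Sketch of one "| file meta | file content |" frame:
// 4-byte big-endian meta length, JSON-encoded meta, then the file bytes.
function encodeFrame (meta, content) {
  var metaBuf = Buffer.from(JSON.stringify(meta))
  var header = Buffer.alloc(4)
  header.writeUInt32BE(metaBuf.length, 0)
  return Buffer.concat([header, metaBuf, content])
}

function decodeFrame (buf) {
  var metaLen = buf.readUInt32BE(0)
  return {
    meta: JSON.parse(buf.slice(4, 4 + metaLen).toString()),
    content: buf.slice(4 + metaLen)
  }
}
```

with one file per frame, the receiver never has to demultiplex — it just reads meta, then streams `meta.size` bytes to disk.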
and i think we'd need to benchmark the number of connections. this could actually be a really sweet QoS module
how do you not send the same file?
btw we are using 17k files around 8mb totaling around 120gb as a test.
I'm cool with whatever as long as it's fast ;)
On Monday, March 9, 2015, Erik Kristensen notifications@github.com wrote:

> how do you not send the same file?
> btw we are using 17k files around 8mb totaling around 120gb as a test.
Are .sst files responding well to compressing?
That was like the worst piece of english ever. But I think you know what I mean :)
hahaha :) <3
Why can't the client just do a simple HTTP GET /backup? And then after getting a response continue to do HTTP GET /backup/000117.sst with like 20 connections at a time?
Or use 20 keep alive connections rather.
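the "20 at a time" part could be a tiny concurrency limiter like the sketch below — here the tasks are stand-ins for per-file HTTP GETs, and `runLimited` is a made-up helper, not anything from cp-mux:

```javascript
// Run async tasks with at most `limit` in flight at once -- the same
// idea as 20 keep-alive connections each fetching one file at a time.
// In real code each task would be an HTTP GET for a single .sst file.
function runLimited (tasks, limit, done) {
  var results = new Array(tasks.length)
  var next = 0    // index of the next task to start
  var active = 0  // tasks currently in flight

  if (!tasks.length) return done(null, results)

  function launch () {
    while (active < limit && next < tasks.length) {
      (function (i) {
        active++
        next++
        tasks[i](function (err, res) {
          if (err) return done(err)
          results[i] = res
          active--
          if (next === tasks.length && active === 0) return done(null, results)
          launch()
        })
      })(next)
    }
  }
  launch()
}
```

results come back in the original order regardless of which connection finishes first, so the client can verify it got every file exactly once.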
I think that is what @juliangruber is working on right now.
Nice!
ok so @ekristen said it turned out to be about the same (1 hr). What about using this module but udp as a transport instead of tcp?
hmm, isn't tcp just udp with flow control and some more mechanisms to make it more reliable?
in the end the limit we hit probably is the max outgoing network bandwidth from the server and the max incoming network bandwidth from the client
@juliangruber yes, tcp trades speed for guaranteed delivery of packets in the order you send them. In this lib, I measured logging json over udp vs tcp (localhost isn't a good measurement so you need to actually set up some servers). But, I agree, I'd say what we have is "good enough" to move forward for our needs.
when data grows we'd probably be better off looking at automatic sharding, so there's less data to distribute
I think it would be faster to add parallelism by transferring files over multiple connections, not sending multiple files through one.