Open laijs opened 8 years ago
It seems the serial port is not reliable. (Is a virtual serial port reliable?) So must we use SLIP as a lower-layer protocol (below the "stream message format" in this issue), or use PPP directly? Is there a simpler, better protocol? We only need a protocol that adds reliable framing and multiple streams over a serial port.
We are also going to add vsocks for the streams, but we need to support both (serial and vsocks) in the future.
One general comment would be to make sure the protocol version is changed when it changes in an incompatible way.
We've also thought about VSOCKs for vm <-> host communication, but given its availability so far, I'm afraid we need to stick with serial for some time.
@laijs Could you elaborate on the "serial port is not reliable" statement? That might be the case with physical serial lines, but virtio-serial is reliable as far as I know.
By using PPP (I would not use SLIP) you assume the guest has networking enabled and that the PPP link will not interfere with potential firewall rules within the guest. It would be nice (TCP-based communication), but it would bring additional guest requirements imho.
As @dlespiau said, bumping the protocol version when introducing backward incompatible changes would be nice.
@laijs One more question: In which situations do you see the hyperstart or host buffer being full?
@sameo, I have very strong confidence that "virtio-serial is reliable", but I can't find a reference for it. OK, we can trust it.
But virtio-serial is still a single stream, so we need to apply a multi-stream protocol on top of it. Multi-stream messages are encoded with a stream ID (we use the name stream-seq, because it only increases) and written to the serial port. When the receiver is hyperstart and its buffer is full, the data is discarded. When the receiver is runv, we write the data directly to the downstream io.Writer; if that io.Writer's buffers (including the kernel side) are full, the write blocks, and all streams are blocked. We can mitigate this by enlarging the buffer, but we can't enlarge it endlessly.
> In which situations do you see the hyperstart or host buffer being full
cat bigdata | hyperctl run ubuntu sh -c "cat>bigdata; md5 bigdata"
hyperctl run ubuntu cat bigdata > bigdata
the stream buffer in hyperstart is only 512 bytes.
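To make the 512-byte limit concrete, here is a minimal Go sketch of a fixed-size per-stream buffer. It is not hyperstart's code (hyperstart is not written in Go); `BoundedBuf`, `Offer` and `Drain` are hypothetical names used only to illustrate what happens when a sender pushes more than the buffer can hold.

```go
package main

// BoundedBuf models a fixed-size per-stream receive buffer.
// Hypothetical type, for illustration only.
type BoundedBuf struct {
	Limit int    // capacity in bytes, e.g. 512
	buf   []byte // bytes received but not yet consumed
}

// Offer stores as much of p as fits. It returns how many bytes were
// accepted and how many did not fit (and would be discarded by a
// receiver that cannot push back on the sender).
func (b *BoundedBuf) Offer(p []byte) (accepted, dropped int) {
	free := b.Limit - len(b.buf)
	if free > len(p) {
		free = len(p)
	}
	b.buf = append(b.buf, p[:free]...)
	return free, len(p) - free
}

// Drain removes and returns up to n bytes, as a consumer reading the
// stream would.
func (b *BoundedBuf) Drain(n int) []byte {
	if n > len(b.buf) {
		n = len(b.buf)
	}
	out := append([]byte(nil), b.buf[:n]...)
	b.buf = b.buf[n:]
	return out
}
```

With `Limit` set to 512, offering a 4096-byte chunk accepts 512 bytes and leaves 3584 with nowhere to go, which is why the sender needs either acks or explicit requests.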
The multi-stream protocol above uses the "discard when the buffer is full" approach, which needs acks. We can switch to "send data only when requested", which is a much simpler protocol. I will write comments about this approach.
Current streams are multiplexed and sent/received via the serial port (named "sh.hyper.channel.1"). The format for multiplexing is:
Both the stream sequence and the length are encoded in big-endian. length=0 is the command to close the stream (in one direction only). (There is also an additional, ugly piece of data sent from hyperstart to runv for the exit code; this part of the protocol will be removed soon, scheduled after the big refactor (hyperhq/runv#295).)
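As a rough illustration of this kind of framing, here is a Go sketch that encodes and decodes one frame on a byte slice. Several details are assumptions, not taken from this issue: the 8-byte width of the stream sequence, the length counting payload bytes only, and the `maxFramePayload` sanity cap. The 4-byte length is inferred from the 32-bit constants in the proposal; the big-endian encoding and "length 0 means close" come from the description above. All type and function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// Assumed layout: 8-byte stream sequence, 4-byte length, then payload.
const headerSize = 12

// maxFramePayload is a hypothetical sanity cap, not part of the protocol.
const maxFramePayload = 1 << 16

var (
	ErrIncomplete = errors.New("buffer does not hold a complete frame yet")
	ErrTooLarge   = errors.New("payload length exceeds sanity cap")
)

// Frame is one multiplexed message on the serial channel.
type Frame struct {
	StreamSeq uint64
	Payload   []byte
}

// IsClose reports whether the frame is the close command (length 0).
func (f Frame) IsClose() bool { return len(f.Payload) == 0 }

// EncodeFrame returns the wire bytes with big-endian header fields.
func EncodeFrame(f Frame) []byte {
	buf := make([]byte, headerSize+len(f.Payload))
	binary.BigEndian.PutUint64(buf[0:8], f.StreamSeq)
	binary.BigEndian.PutUint32(buf[8:12], uint32(len(f.Payload)))
	copy(buf[headerSize:], f.Payload)
	return buf
}

// DecodeFrame parses one frame from the front of buf and returns it
// together with the unconsumed remainder. The payload aliases buf.
func DecodeFrame(buf []byte) (Frame, []byte, error) {
	if len(buf) < headerSize {
		return Frame{}, buf, ErrIncomplete
	}
	n := binary.BigEndian.Uint32(buf[8:12])
	if n > maxFramePayload {
		return Frame{}, buf, ErrTooLarge
	}
	end := headerSize + int(n)
	if len(buf) < end {
		return Frame{}, buf, ErrIncomplete
	}
	f := Frame{
		StreamSeq: binary.BigEndian.Uint64(buf[0:8]),
		Payload:   buf[headerSize:end],
	}
	return f, buf[end:], nil
}
```

A reader loop would append bytes from the serial port to a buffer and call `DecodeFrame` repeatedly until it returns `ErrIncomplete`.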
The major problem is that the payload is discarded in hyperstart if the buffer is full, and the stream service is blocked in runv if the buffer is full.
We need to change the protocol after the big refactor (hyperhq/runv#295) as follows:
0) We assume the serial port doesn't discard any data, and data is received in the same order as it was sent.
1) length (decoded) doesn't include the length of the header.
2) The length of the payload must be less than (1 << 30) (in practice, length < 4096).
3) When A receives length (decoded) == 0x80000000 | num: ACK-COMMAND. It means the opposite side (B) has just received and consumed num bytes of data (an ack for the earlier message). A should record how much data B has received. A shouldn't send any more data to B until A gets ALL the acks.
4) When A receives length (decoded) == 0xC0000000: CLOSE-COMMAND. B has closed the stream; it means the upstream fd in B is closed and B will not send any more messages with payload. A should close the corresponding downstream fd if needed.
5) When A receives length (decoded) == 0xC0000001: REQUEST-COMMAND. B requests data; A should send data to B starting from the first unacked byte. (B has discarded the unacked payload.)
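The rules above can be sketched in Go roughly as follows. This is one possible reading of the proposal, not an implementation from runv or hyperstart: `Classify`, `Kind` and `Sender` are hypothetical names, and the sketch assumes `num` in an ACK is always below (1 << 30), so that an ACK cannot collide with the CLOSE and REQUEST values.

```go
package main

import "errors"

const (
	ackFlag    uint32 = 0x80000000 // rule 3: 0x80000000 | num
	closeCmd   uint32 = 0xC0000000 // rule 4
	requestCmd uint32 = 0xC0000001 // rule 5
	maxPayload uint32 = 1 << 30    // rule 2: payload length must be below this
)

// Kind is the meaning of a decoded length field.
type Kind int

const (
	Data Kind = iota
	Ack
	Close
	Request
)

var ErrUnknownLength = errors.New("length field matches no known command")

// Classify interprets a decoded length field. For Data, n is the payload
// length; for Ack, n is the number of bytes the peer consumed.
func Classify(length uint32) (kind Kind, n uint32, err error) {
	switch {
	case length < maxPayload:
		return Data, length, nil
	case length == closeCmd:
		return Close, 0, nil
	case length == requestCmd:
		return Request, 0, nil
	case length&0xC0000000 == ackFlag:
		return Ack, length &^ ackFlag, nil
	default:
		return 0, 0, ErrUnknownLength
	}
}

// Sender tracks, for one stream, the bytes sent but not yet acked.
type Sender struct {
	pending []byte
}

// CanSend reports whether all earlier data has been acked (rule 3).
func (s *Sender) CanSend() bool { return len(s.pending) == 0 }

// Sent records p as sent and awaiting an ack.
func (s *Sender) Sent(p []byte) { s.pending = append(s.pending, p...) }

// OnAck drops the first n pending bytes after an ACK-COMMAND.
func (s *Sender) OnAck(n uint32) error {
	if int(n) > len(s.pending) {
		return errors.New("ack covers more bytes than were sent")
	}
	s.pending = s.pending[n:]
	return nil
}

// OnRequest returns the data to resend after a REQUEST-COMMAND,
// starting from the first unacked byte (rule 5).
func (s *Sender) OnRequest() []byte { return s.pending }
```

Keeping the unacked bytes on the sending side is what lets B discard data when its buffer is full and ask for it again later.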