haskell-distributed / network-transport-tcp

TCP Realisation of Network.Transport
http://haskell-distributed.github.com
BSD 3-Clause "New" or "Revised" License
30 stars 27 forks source link

Explicitly use Word32, avoid use of Enum #51

Closed avieth closed 7 years ago

avieth commented 7 years ago

This patch eliminates the use of signed integers and makes the text more explicit about what is being decoded from the wire. recvInt32 is not so clear about this: it would in fact receive a Word32 almost always, except in the case of control headers and connection request responses, which would read a signed Int and pass it to tryToEnum (recvInt32 actually receives any Num type).

Similar story for encoding: encodeInt32 would acutally encode any Enum by using fromEnum and encoding the signed Int, but now we encode a Word32 explicitly.

Haddock says that Int is only guaranteed to have range [-2^29, 2^29 - 1]. But LightweightConnectionId = Word32 so we could have a problem in handleIncomingMessages: the peer could give a valid connection id which is interpreted as a control header. With this patch it's fine: the Word32 won't be cast to an Int.

ControlHeader and ConnectionRequestResponse have been moved to the internal module, and are encoded/decoded explicitly rather than through their Enum and Bounded instances, so that we don't have to use any signed integers ever.

facundominguez commented 7 years ago

To make sense of the problem this is trying to fix, I think I would need a failing test, or at least a concrete example.

avieth commented 7 years ago

To make sense of the problem this is trying to fix, I think I would need a failing test, or at least a concrete example.

After a second look, handleIncomingMessages is actually ok: recvInt32 is receiving a Word32, which is only cast to Int and fed to tryToEnum in case it's less (as a Word32) than firstNonReservedLightweightConnectionId = 1024 and so the cast is fine.

The other use of tryToEnum, in handleConnectionRequest, maybe isn't ok. I can't give a failing test because my machine (and probably every machine we'll ever run this on) has 64-bit Ints, but according to the 2010 report and the GHC base docs an Int is only guaranteed to range [-2^29, 2^29 - 1]. So it's possible for a peer to commit a protocol error without the receiver noticing: send a Word32 where the most-significant two bits are anything, but the final 30 bits are 0, 1, or 2 when read as a signed 30-bit integer. Similar story if Int is 31 bits wide. It's a corner case, sure, and AFAICT it's not even possible to compile GHC with an Int narrower than 32 bits, but it's so simple to avoid coercing to an Int at all.

Anyway, this patch mainly aims to make the protocol more explicit in the program text. If we're sending or receiving a Word32, we say so explicitly. If we're encoding or decoding something to or from a Word32, we give the coding explicitly rather than relying on the Enum instance and therefore unnecessarily factoring it through Int.

facundominguez commented 7 years ago

I find the encoding and decoding of Ints that N.T.Internal does rather confusing.

Regarding this patch, encodeWord32 is calling encodeInt32 which is still doing conversions to Int and back to Word32.

Wouldn't it be more easy to follow an implementation of encodeWord32 and decodeWord32 as functions without class constraints. As this is what goes on the wire, I'd argue any generalizations should step on them, and not the other way around.

avieth commented 7 years ago

Wouldn't it be more easy to follow an implementation of encodeWord32 and decodeWord32 as functions without class constraints. As this is what goes on the wire, I'd argue any generalizations should step on them, and not the other way around.

Yes I'm in favour of this. Currently we have in Network.Transport.Internal

decodeInt32 :: Num a => ByteString -> a
encodeInt32 :: Enum a => a -> ByteString

both of which use some Data.ByteString.Internal and Foreign.Storable definitions to read or write a CInt. It would be trivial to pull out the use of fromIntegral and fromEnum to get

decodeWord32 :: ByteString -> Word32
encodeWord32 :: Word32 -> ByteString

Well, we would still need a fromIntegral to move between Word32 and CInt (which is an Int32).

Here's a network-transport patch.

facundominguez commented 7 years ago

The n-t patch looks good in general. The fromIntegral calls can be avoided by using the Storable Word32 instance to peek and poke.

avieth commented 7 years ago

Here's the pull request https://github.com/haskell-distributed/network-transport/pull/34

avieth commented 7 years ago

@dcoutts Changes made. Latest network-transport exports are used, and I've aligned those case alternatives.

avieth commented 7 years ago

Checks fail of course, as they're using the released network-transport.

qnikst commented 7 years ago

Previously we used HEAD versions for testing, I suppose we need to define some sane strategy here. As while libraries are under heavy development we will have to either bump them every second day or have such kind of problems..

facundominguez commented 7 years ago

I uploaded a new release of n-t.

facundominguez commented 7 years ago

@avieth you'll have to change the bounds of n-t in the .cabal file.

avieth commented 7 years ago

Uses of <$> in network-transport 0.5 cause the old GHC builds to fail.

facundominguez commented 7 years ago

Thx @avieth!