Closed ahmedh409 closed 2 years ago
The communication library needs to have the following basic features (will be updated and crossed off as they are completed):
I completed the first step toward a communication library. All nodes now have a list of all other nodes. To accomplish this, each node publishes a file to a folder containing their node ID, their port number (a dummy value for now), and a string to indicate that they are done writing to the file (an informal lock). This will work well for local testing.
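For anyone following along, here is a minimal sketch of the file-based discovery described above. The file name pattern (`<id>.node`), the `DONE` marker text, and the function names are all assumptions made for illustration; the actual code may differ.

```cpp
#include <filesystem>
#include <fstream>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Hypothetical marker: the real "done writing" string is not specified here.
const std::string kDoneMarker = "DONE";

// Write this node's ID and port to <dir>/<id>.node, ending with the marker
// so readers can tell the file is complete (the informal lock).
void publish_node(const fs::path& dir, int id, int port) {
    fs::create_directories(dir);
    std::ofstream out(dir / (std::to_string(id) + ".node"));
    out << id << '\n' << port << '\n' << kDoneMarker << '\n';
}

// Read every file in dir and return a map of node ID -> port.
// Files that do not end with the marker are treated as still being written
// and are skipped.
std::map<int, int> read_nodes(const fs::path& dir) {
    std::map<int, int> ports;
    for (const auto& entry : fs::directory_iterator(dir)) {
        std::ifstream in(entry.path());
        int id = 0;
        int port = 0;
        std::string marker;
        if (in >> id >> port >> marker && marker == kDoneMarker) {
            ports[id] = port;
        }
    }
    return ports;
}
```

A node would call `publish_node` once at startup and then poll `read_nodes` until the expected number of entries shows up. This needs C++17 for `std::filesystem`.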
Achieving this across multiple computers is a significant challenge in its own right, but we won't need to worry about that until at least version v0.3. For now, this is a good starting point, and we can use the global node list without worrying about how it is created.
Here's an idea for the overall flow of how nodes will communicate.
First, when nodes are created, they do some basic setup. Then, they create a TCP socket to listen for incoming messages. The node will have a main loop which runs continuously, and each node will create a separate thread to listen for messages and pass them into the main loop (via some type of message buffer, e.g. a vector that gets added to). Nodes will then all become aware of each other and establish connections (for now each node will connect to every other node, though this might need to change in the future). Lastly, the main event loop will truly start and nodes will construct a blockchain.
As I've been working on this, I've had to make a few design decisions.
First, as a socket functionality library, we will just use the builtin C API for socket programming. This is more low-level than many libraries, but it has no overhead, gives us direct control over what we want to do, and avoids the decision of picking which high-level library from many options.
Second, the comm library will be completely stateless. All relevant state will be handled completely by the Node class and passed to comm functions when appropriate, so all functions will depend purely on their input and no static or member variables.
Here's a full review of everything I've done so far.
The ultimate goal is to allow each node to communicate directly with every other node. I think the best way to do this is to establish a TCP channel between each pair of nodes, because TCP guarantees messages arrive and that they arrive in order. Maximizing speed does not matter much to us, but maximizing reliability does. In order to establish communication channels, we need to set up a socket for each node to send and receive from.
A socket is just an abstraction for one end of a communication channel. For TCP channels, which require an initial connection to be established before messages can be sent, there must be one server socket, which accepts incoming connections, and one client socket, which initiates a connection. Once a channel is established, these roles do not matter, and both the server and client have the ability to send messages. To set up these connections and ensure every pair has a channel, every node creates a server socket to listen for connections, and each node has the ability to initiate a connection with any other node. (This is the best I can think of - there might be a better way.) So far, I have implemented the server socket creation, but I have not yet set up nodes to make connections. This is all still part of the first checkbox in the list above, but it's the most difficult one in that list by far.
A socket requires two values: an IP address and a port number. For now, since all testing is on a single computer, we are using 127.0.0.1 (localhost) as the IP address, although it shouldn't be hard to extend this to multiple computers. For port numbers, we are using the arbitrarily chosen 12829 (prime), but since we need many ports on the same computer, each node uses the port corresponding to 12829 + their ID number.
Now, these are the steps to creating a socket:
We need to create a socket, which we do with the function socket(). We supply flags to indicate that addresses are using IPv4 and that the socket is a TCP (stream) socket. This function returns a file descriptor that is used to refer to the socket going forward.
Next, we need to set up information about the address and port (discussed above) and bind() the socket to the port. This is how other processes can find our process, regardless of where they are (local or over network).
Lastly (for now), we need to call listen() to mark the socket as ready to accept incoming connections (the connections themselves are picked up later with accept()). We can then close() the file descriptor when we are done listening.
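The three steps above can be sketched roughly as follows, using the POSIX socket calls from C++. The function names `port_for_node` and `create_server_socket` are made up for this example, and error handling is reduced to returning -1, so treat it as an outline rather than the actual implementation.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>

// Base port from the discussion above; each node adds its ID to it.
constexpr std::uint16_t kBasePort = 12829;

std::uint16_t port_for_node(int node_id) {
    return static_cast<std::uint16_t>(kBasePort + node_id);
}

// Create a listening TCP socket on 127.0.0.1 for the given node.
// Returns the file descriptor, or -1 if any step fails.
int create_server_socket(int node_id) {
    // Step 1: IPv4 (AF_INET), TCP stream socket (SOCK_STREAM).
    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    // Step 2: fill in address and port, then bind the socket to them.
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port_for_node(node_id));
    if (::inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr) != 1 ||
        ::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        // Step 3: mark the socket as passive so it can accept connections.
        ::listen(fd, SOMAXCONN) < 0) {
        ::close(fd);
        return -1;
    }
    return fd;
}
```

The caller owns the returned descriptor and should `close()` it when the node shuts down. The calls are written with a leading `::` so that `bind` cannot be confused with `std::bind` if `<functional>` is included elsewhere.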
There is more to do here.
The first objective (and the most difficult) is complete. Each node now has a list of the IDs of all other nodes and a map from the ID to the contact information of the node. The contact information is a struct in comm.h containing ID, address (including IP and port number), and a boolean indicating whether or not a TCP connection has been created between the two nodes. The map uses std::map, which is an ordered, tree-based container with logarithmic-time lookup rather than a hash map (that would be std::unordered_map). Either is more than fast enough for the number of nodes we have.
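As a rough picture of the data layout, here is a sketch of what the contact struct and the ID-to-contact map could look like. The names `NodeContact`, `NodeMap`, and `build_node_map`, and the individual field names, are assumptions for illustration and are not copied from comm.h.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical layout of the per-node contact information.
struct NodeContact {
    int id;
    std::string ip;
    std::uint16_t port;
    bool connected;  // true once a TCP connection to this node exists
};

using NodeMap = std::map<int, NodeContact>;

// Build the map a node would hold about every *other* node, assuming IDs
// 0..node_count-1, localhost addresses, and ports of 12829 + ID.
NodeMap build_node_map(int self_id, int node_count) {
    NodeMap nodes;
    for (int id = 0; id < node_count; ++id) {
        if (id == self_id) {
            continue;  // a node does not list itself
        }
        nodes.emplace(id, NodeContact{id, "127.0.0.1",
                                      static_cast<std::uint16_t>(12829 + id),
                                      false});
    }
    return nodes;
}
```

Iterating over a `std::map` visits the IDs in ascending order, which can be handy for deterministic logging during local tests.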
The next step is to establish a TCP channel between each pair of nodes. This should be fairly straightforward, as all of the setup work has already been done. Now, all we have to do is listen for connections, make a connection, and make sure both parties are aware of the successful connection.
Here I will lay out my logic for establishing the TCP channels. We have one major assumption: each node knows of the existence of every other node and knows their IP address and port number. This assumption will need to be relaxed in future versions.
First, every node will create a new thread to handle all incoming communication. That is, we will call listen() and recv() in a separate thread, since they need to operate asynchronously, and this seems like an easy way to do this. The node will have a message buffer and corresponding lock, and the listening thread will add incoming messages or connections to this message buffer. There will be a way to identify whether a message is an incoming connection or a message from an established connection, and the node will process these messages in its main loop with a method process_messages().
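To make the buffer-plus-lock idea concrete, here is a small sketch. Only `process_messages` is a name from the plan above; `MessageKind`, `Message`, `MessageBuffer`, and `Counts` are invented for the example, and the processing step just counts messages instead of doing real work.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Tag that lets the main loop tell connections apart from ordinary messages.
enum class MessageKind { NewConnection, Data };

struct Message {
    MessageKind kind;
    int sender_id;
    std::string payload;
};

// Buffer shared between the listening thread (push) and the main loop (drain).
class MessageBuffer {
public:
    void push(Message message) {
        std::lock_guard<std::mutex> lock(mutex_);
        messages_.push_back(std::move(message));
    }

    // Take everything currently queued, leaving the buffer empty.
    std::vector<Message> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<Message> out;
        out.swap(messages_);
        return out;
    }

private:
    std::mutex mutex_;
    std::vector<Message> messages_;
};

struct Counts {
    int connections = 0;
    int data = 0;
};

// Stand-in for the node's process_messages(): drains the buffer once and
// dispatches on the message kind.
Counts process_messages(MessageBuffer& buffer) {
    Counts counts;
    for (const Message& message : buffer.drain()) {
        if (message.kind == MessageKind::NewConnection) {
            ++counts.connections;
        } else {
            ++counts.data;
        }
    }
    return counts;
}
```

The listening thread would only ever call `push`, and the main loop would call `process_messages` once per iteration. Draining into a local vector keeps the lock held for as short a time as possible.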
When it comes to establishing connections, some node will have to initiate each connection, as not all can just sit by and listen. It is important to make sure that 2 nodes don't establish 2 connections between them. I don't believe this will be difficult. I think if 2 nodes each initiate a connection, it should be possible to make sure only one channel is created. Ultimately, all that matters is that each node is capable of sending and receiving messages from each other node.
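One possible way to guarantee a single channel per pair (an assumption on my part, not something decided above) is a fixed tie-break: the node with the lower ID always initiates, and the node with the higher ID only listens. The helper names below are hypothetical.

```cpp
#include <vector>

// Tie-break rule (an assumption): the lower ID dials, the higher ID listens.
bool should_initiate(int my_id, int peer_id) {
    return my_id < peer_id;
}

// Given every node ID in the network, return the peers this node must dial.
std::vector<int> peers_to_dial(int my_id, const std::vector<int>& all_ids) {
    std::vector<int> targets;
    for (int peer_id : all_ids) {
        if (peer_id != my_id && should_initiate(my_id, peer_id)) {
            targets.push_back(peer_id);
        }
    }
    return targets;
}
```

With this rule, exactly one side of every pair initiates, so two nodes can never open two connections to each other. The trade-off is that the highest-ID node does nothing but wait for incoming connections.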
A lot of what's written above is wrong lol. First of all, the global node list was CERTAINLY NOT the most difficult step - that's the TCP channels. I had so many issues setting them up that I completely switched ideas, and now we're using Boost::Asio. It has also taken so long that I made a separate branch to work on it so it doesn't clutter main in the meantime. I'm finally back to working on this, so I'm trying to have this done soon.
Comms is finally done!!!!!! That was rough.
The only minor thing left to do is allow broadcasting messages to all other nodes, but this is trivial to implement and will be done for the simulation, so I'm closing this issue now.
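For reference, broadcasting could be as simple as looping over the known node IDs and calling whatever single-recipient send function already exists. In this sketch the send function is passed in as a callback, so `SendFn` and `broadcast` are hypothetical names and nothing here depends on the real send API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical single-recipient send: returns true if the message went out.
using SendFn = std::function<bool(int peer_id, const std::string& message)>;

// Send the same message to every node except ourselves.
// Returns how many sends reported success. The function keeps no state of
// its own; everything it needs is passed in.
int broadcast(int self_id, const std::vector<int>& node_ids,
              const std::string& message, const SendFn& send) {
    int delivered = 0;
    for (int peer_id : node_ids) {
        if (peer_id == self_id) {
            continue;
        }
        if (send(peer_id, message)) {
            ++delivered;
        }
    }
    return delivered;
}
```

Returning the success count lets the caller decide what to do about peers that could not be reached, instead of hiding failures inside the loop.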
Develop a library called Comm which will handle the details of communication between nodes within the simulation. The simulation will use these library functions to handle uploads and searches of the blockchain.