kingsrd opened this issue 6 years ago
@kingsrd I've been thinking about this for a while myself, and if I were to tackle this problem, I would keep them conceptually separate.

Instead of implementing NNG/nanomsg/zeromq or similar on top of libdill, I would first implement the abstract protocol-handling logic for them in an IO-agnostic way, so that it just worked on pointers to memory representing data either read from the network or about to be written to the network. Then any IO implementation, including libdill's protocols or native BSD sockets or whatever, could be used to send and receive the data, and any multitasking/concurrency implementation, including libdill's coroutines or some event loop or plain worker threads or separate processes sharing memory, could be used to schedule those reads and writes and the protocol-handling logic.

The more I think about it, the more I think that the very idea of always implementing protocol logic and IO logic as one abstraction onion is worse than implementing them as two separately reusable abstractions that are agnostic of each other.

I realize that isn't really an answer to the question, and I've never contributed anything to libdill or worked on its internals, so I should disclaim that I'm not any sort of authority on the matter. But my point is that asking whether or not libdill makes a good base for implementing a protocol is maybe thinking about it the wrong way: a single polished protocol implementation that is IO- and scheduling-agnostic can be created without having to answer that question, and then, if it turns out the answer is "yes", it should be easy enough to have a small glue project that just exposes the protocol through libdill's interfaces.
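To illustrate what I mean, here is a rough sketch I'm making up for this comment (none of these proto_* names exist in any library): the protocol core only ever touches bytes in memory, and the caller owns all reading, writing, and scheduling:

#include <stddef.h>
#include <unistd.h>

/* Hypothetical sans-IO protocol core -- every proto_* name here is made up.
   It never does IO itself; it only parses and produces bytes in memory. */
typedef struct proto proto;
proto *proto_new(void);
void proto_free(proto *p);
/* Feed it bytes that the caller has read from any transport. */
void proto_feed(proto *p, const void *buf, size_t len);
/* Returns 1 and fills *msg/*len when a complete message has been parsed. */
int proto_next_msg(proto *p, const void **msg, size_t *len);

/* The IO and scheduling live entirely in the caller. This driver happens to
   use a plain blocking read(), but a libdill coroutine, an event loop, or a
   worker thread could drive the same protocol core unchanged. */
static void drive(proto *p, int fd) {
    char buf[4096];
    const void *msg;
    size_t len;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            break;
        proto_feed(p, buf, (size_t)n);
        while (proto_next_msg(p, &msg, &len)) {
            /* handle one complete protocol message */
        }
    }
}

The glue project I mentioned would then just be a small file like that driver, written against libdill's handles instead of raw fds.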
"can libdill be a good candidate for re-implementing nanomsg?"

I think so, @kingsrd. Also libmill.
That's why Fatih Kaya @fatihky and I wrote the nnmill experiment. We found these libraries incredibly useful and wanted to see whether you could send and recv the nanomsg wire protocol while coordinating kernel I/O operations from simple and performant coroutines: nn_getsockopt lets you pluck out a nanomsg fd, and where libdill has fdin and fdout, libmill had fdwait:
#include <nanomsg/nn.h>

/* Fetch the pollable fd (NN_RCVFD or NN_SNDFD) behind a nanomsg socket. */
int nn_mill_getfd (int s, int fdtype) {
    int fd;
    size_t fdsz = sizeof fd;
    if (nn_getsockopt(s, NN_SOL_SOCKET, fdtype, &fd, &fdsz) != 0)
        return -1;
    return fd;
}
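The other half of the trick is just parking the coroutine on that fd. Roughly, with libmill, it looks like this (untested sketch with a made-up name; the nm_recv wrapper further down is the real version of this):

/* Sketch: cooperatively wait until the nanomsg socket s is readable,
   then do the otherwise-blocking receive. Returns bytes received or -1. */
int recv_when_readable(int s, void *buf, size_t len, int64_t deadline) {
    int fd = nn_mill_getfd(s, NN_RCVFD);       /* helper defined above */
    if (fd < 0)
        return -1;
    int events = fdwait(fd, FDW_IN, deadline); /* yields this coroutine */
    if (!(events & FDW_IN))
        return -1;                             /* deadline expired */
    return nn_recv(s, buf, len, 0);
}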
@mentalisttraceur, regarding "then if it turns out the answer is 'yes'": turns out you don't have to rewrite one line of nanomsg. The nnmill experiment's example.c connects the sender and receiver coroutines with libmill's go():
int main(int argc, char *argv[]) {
    int s = nn_socket(AF_SP, NN_PAIR);
    int s2 = nn_socket(AF_SP, NN_PAIR);
    chan endch = chmake(int, 1);  /* never written to: keeps main alive below */
    nn_bind(s, "tcp://127.0.0.1:7458");
    nn_connect(s2, "tcp://127.0.0.1:7458");
    go(sender(s));
    go(receiver(s2));
    chr(endch, int);              /* blocks forever on the empty channel */
    return 0;
}
The sender coroutine uses msleep to pause before sending additional messages. Both coroutines from example.c:
coroutine void sender(int s) {
    int rc;
    for (;;) {
        rc = nm_send(s, "test", 4, 0, -1);
        if (rc != 4)
            break; // TODO: print error
        msleep(now() + 1000);
    }
}

coroutine void receiver(int s) {
    char buf[5];
    int rc;
    buf[4] = '\0';
    for (;;) {
        rc = nm_recv(s, buf, 4, 0, -1);
        if (rc != 4)
            break; // TODO: print error
        printf("received: %s\n", buf);
    }
}
What's nm_recv? Well, that's a nanomsg_mill_recv operation. Whether you're doing send or recv, you'll want to switch between the nanomsg flags NN_RCVFD and NN_SNDFD to pull the matching fd out of getsockopt. Looks like we could skip using flags like FDW_IN and FDW_OUT with libdill's fdin and fdout:
int nm_recv (int s, void *buf, size_t len, int flags, int64_t deadline) {
    int fd = nn_mill_getfd(s, NN_RCVFD);
    int events;
    int rc;
    if (flags == NN_DONTWAIT)
        return nn_recv(s, buf, len, flags);
    /* Park the coroutine until the nanomsg socket is readable or the
       deadline expires. */
    events = fdwait(fd, FDW_IN, deadline);
    if (!(events & FDW_IN))
        return EAGAIN;
    rc = nn_recv(s, buf, len, 0);
    return rc;
}
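A libdill flavour of the same wrapper isn't in nnmill, but I'd expect it to look something like this (untested sketch with a made-up name; fdin just returns 0 once the fd is readable, or -1 with errno set to ETIMEDOUT when the deadline expires, so no event mask is needed):

#include <errno.h>
#include <libdill.h>
#include <nanomsg/nn.h>

/* Sketch of nm_recv ported from libmill to libdill. Untested; nm_dill_recv
   is a made-up name, and nn_mill_getfd is the helper defined above. */
int nm_dill_recv(int s, void *buf, size_t len, int flags, int64_t deadline) {
    if (flags == NN_DONTWAIT)
        return nn_recv(s, buf, len, flags);
    int fd = nn_mill_getfd(s, NN_RCVFD);
    if (fd < 0)
        return -1;
    if (fdin(fd, deadline) != 0)
        return -1;          /* errno is ETIMEDOUT if the deadline expired */
    return nn_recv(s, buf, len, 0);
}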
I want to use nng with libdill, but I found that nng does not work correctly in a libdill environment.
#include <stdio.h>
#include <stdint.h>
#include <libdill.h>
#include <nng/nng.h>
#include <nng/protocol/pubsub0/pub.h>
#include <nng/protocol/pubsub0/sub.h>

static dill_coroutine void
test_pub(void)
{
    nng_socket sock;
    int ret;

    nng_pub0_open(&sock);
    nng_listen(sock, "ipc:///tmp/pub_sub.ipc", NULL, 0);
    while(1)
    {
        ret = nng_send(sock, "1", 1, 0);
        printf("1: %d\n", ret);
        ret = nng_send(sock, "12", 2, 0);
        printf("2: %d\n", ret);
        ret = nng_send(sock, "123", 3, 0);
        printf("3: %d\n", ret);
        ret = nng_send(sock, "1234", 4, 0);
        printf("4: %d\n", ret);
        dill_msleep(dill_now() + 1000);
    }
}

static dill_coroutine void
test_sub(void)
{
    nng_socket sock;
    int fd;
    uint8_t *buf;
    size_t sz;

    nng_sub0_open(&sock);
    nng_setopt(sock, NNG_OPT_SUB_SUBSCRIBE, "", 0);
    nng_dial(sock, "ipc:///tmp/pub_sub.ipc", NULL, 0);
    nng_getopt_int(sock, NNG_OPT_RECVFD, &fd);
    while(dill_fdin(fd, -1) >= 0)
    {
        nng_recv(sock, &buf, &sz, NNG_FLAG_NONBLOCK | NNG_FLAG_ALLOC);
        printf("recv: %zu\n", sz);
        nng_free(buf, sz);
    }
}

int
main(void)
{
    dill_go(test_pub());
    dill_go(test_sub());
    dill_msleep(-1);
    return 0;
}
The run output is as follows:
1: 0 2: 0 3: 0 4: 0 1: 0 2: 0 3: 0 4: 0 recv: 1 1: 0 2: 0 3: 0 4: 0 recv: 1 recv: 3 1: 0 2: 0 3: 0 4: 0 recv: 1 recv: 3 1: 0 2: 0 3: 0 4: 0 recv: 1 1: 0 2: 0 3: 0 4: 0 recv: 1 1: 0 2: 0 3: 0 4: 0 recv: 1 ...
The correct output should be:
1: 0 2: 0 3: 0 4: 0 recv: 1 recv: 2 recv: 3 recv: 4 1: 0 2: 0 3: 0 4: 0 recv: 1 recv: 2 recv: 3 recv: 4 ...
I cannot find the reason; please give me some suggestions.
I read about the motivation for NNG re-implementing nanomsg: addressing problems such as scalability/reliability caused by the heavy use of state machines. Just out of curiosity (and to hear your opinion), can libdill be a good candidate for re-implementing nanomsg?