sid-project / sid

Storage Instantiation Daemon
https://sid-project.github.io
12 stars 5 forks source link

Buffer is using 'write', but we need to use 'sendmsg' (wrapped by comms_unix_send) for ancillary data #111

Open trgill opened 3 years ago

trgill commented 3 years ago

Address FIXME in code:

https://github.com/sid-project/sid/blob/cf867f83c90a46aab3f9434896ff818325d9fbac/src/resource/worker-control.c#L942

             * FIXME: Buffer is using 'write', but we need to use 'sendmsg' (wrapped by comms_unix_send) for ancillary
             * data. This is why we need to send the ancillary data separately from usual data here. Maybe extend the
             * buffer so it can use 'sendmsg' somehow - a custom callback for writing the data? Then we could send data
             * and anc. data at once in one buffer_write call.
             */
trgill commented 3 years ago

@prajnoha At first glance I was finding this a bit confusing. I get it now I think. When I was looking at:

https://github.com/sid-project/sid/blob/cf867f83c90a46aab3f9434896ff818325d9fbac/src/resource/ubridge.c#L4175

    data_spec.data      = NULL;
    data_spec.data_size = 0;
    data_spec.ext.used  = true;

It had me looking around to figure out what was being sent.

Do you think adding _chan_buf_send_socket() would be an improvement? (I'm not sure)

Would you please elaborate on the comment at:

https://github.com/sid-project/sid/blob/cf867f83c90a46aab3f9434896ff818325d9fbac/src/resource/worker-control.c#L910

prajnoha commented 3 years ago

Thing here is that "main process" <--> "worker" channels have buffers to send/receive data as a whole (and we're using our buffer code for this). Since buffer code is generic, it uses pure read and write calls, because it doesn't know anything (yet) about what type of fd it is working with when buffer_read and buffer_write is called.

Now, unix domain sockets are a bit specific in way that besides the usual data (which can be normally handled by those read and write calls), they can also transport the ancillary data (like the file descriptors). And this is something that the generic read/write simply can't handle, we have to use the more specific recvmsg/sendmsg here.

What we do now in this case where we need to send data+ancillary data is that we read/write the buffer for data as usual. For the ancillary data (the file descriptors passed through unix domain socket), we need to do call the extra recvmsg/sendmsg.

To avoid reading the data and ancillary data part separately, we'd need to enhance the buffer interface so it could accept the ancillary data too (...right now, we make use of FD passing capability only). Practically, this means the functions like buffer_add and buffer_write/buffer_read would need to accept extra arg for passing the ancillary data and if this is detected inside the buffer code, it would switch from using read/writecalls inside to more specificreacvmsg/sendmsg` calls (or any other specific functions for other types than unix domain socket which could possibly accept such extra data...).

The change would need to happen inside buffer code, not the _chan_buf_send. The _chan_buf_send is a wrapper/abstraction on top already (using buffers underneath and other related handling) and it already accepts struct worker_data_spec *data_spec which does have the fields to pass the extra/ancillary data (e.g. the data_spec->ext.socked.fd_pass). So this already works fine up to this point. Then, however, since the buffer doesn't have the interface/capability to accept ancillary data, we need to do 2 separate calls - one for data and one for the extra bit.

However, such change (adding support for ancillary data in buffer code) is questionable, because right now it's just that FD passing through unix domain sockets where we need this 'extra data passing` functionality. So it's questionable whether it would be worth the change:

prajnoha commented 3 years ago

Would you please elaborate on the comment at:

https://github.com/sid-project/sid/blob/cf867f83c90a46aab3f9434896ff818325d9fbac/src/resource/worker-control.c#L910

...yeah. Currently, if you look at how we handle the data reception, we actually do (example with comms between main process and worker):

The advantage here is that the event loop can still process another parts in between this "data reception callback" call, calling another callbacks for different other registered events.

Now, we don't do this for sending out the data - we have our own inner loops that try to send the data and they're looping up until the data is completely send (...or erroring out if there's an error) - the actual loop is inside the buffer_write_all call:

https://github.com/sid-project/sid/blob/cf867f83c90a46aab3f9434896ff818325d9fbac/src/resource/worker-control.c#L932-L935

https://github.com/sid-project/sid/blob/cf867f83c90a46aab3f9434896ff818325d9fbac/src/base/buffer.c#L184-L197

This means that if this "inner" loop would take longer for whatever reason, we'd be stalling all the other events registered with the "official" event loop. Besides that, this is also about having a clean solution where if we go through the event loop during data reception, we should do the same for data sending too so the solution for handling data transmission is symmetrical and not creating "exceptions" where one direction is handled one way and the other one is handled using a different way. This can be a source of wrong assumptions and related bugs in the future...