nutanix / libvfio-user

framework for emulating devices in userspace
BSD 3-Clause "New" or "Revised" License

spec issue: potential deadlocks in concurrent messages and replies #466

Open jlevon opened 3 years ago

jlevon commented 3 years ago

From Stefan:

> Are there rules for avoiding deadlock between client->server and
> server->client messages? For example, the client sends
> VFIO_USER_REGION_WRITE and the server sends VFIO_USER_VM_INTERRUPT
> before replying to the write message.
> 
> Multi-threaded clients and servers could end up deadlocking if messages
> are processed while polling threads handle other device activity (e.g.
> I/O requests that cause DMA messages).
> 
> Pipelining has the nice effect that the oldest message must complete
> before the next pipelined message starts. It imposes a maximum issue
> depth of 1. Still, it seems like it would be relatively easy to hit
> re-entrancy or deadlock issues since both the client and the server can
> initiate messages and may need to wait for a response.

tmakatos commented 3 years ago

> Are there rules for avoiding deadlock between client->server and server->client messages? For example, the client sends VFIO_USER_REGION_WRITE and the server sends VFIO_USER_VM_INTERRUPT before replying to the write message.

In this case, neither the client nor the server can assume that the next message it receives after sending a request is the response to that request. Each side should examine the header of every incoming message and act accordingly.
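
To make the header check concrete, here is a minimal sketch in Python rather than anything taken from libvfio-user. The `Msg` fields, the `wait_for_reply` helper and the in-memory `deque` standing in for the socket are all hypothetical and do not reflect the vfio-user wire format. The point is only that the side waiting for a reply services any request the peer sends in the meantime instead of blocking on it.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Msg:
    # Hypothetical, simplified header fields; not the vfio-user wire format.
    msg_id: int
    command: str
    is_reply: bool
    payload: bytes = b""


def wait_for_reply(incoming, msg_id, handle_request):
    """Return the reply to msg_id, servicing peer requests that arrive first."""
    while incoming:
        msg = incoming.popleft()
        if msg.is_reply and msg.msg_id == msg_id:
            return msg
        if not msg.is_reply:
            # The peer initiated its own message: handle it now rather than
            # treating it as an error or waiting for our reply first.
            handle_request(msg)
        # A fuller implementation would route replies for other outstanding
        # IDs to their waiters; this sketch simply drops them.
    raise RuntimeError("connection closed before the reply arrived")


# The server sends an interrupt before replying to the client's write.
handled = []
incoming = deque([
    Msg(7, "VFIO_USER_VM_INTERRUPT", is_reply=False),
    Msg(1, "VFIO_USER_REGION_WRITE", is_reply=True),
])
reply = wait_for_reply(incoming, 1, handled.append)
```

After the call, `handled` holds the interrupt message and `reply` is the response to the write, so neither side is stuck waiting on the other.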

> Multi-threaded clients and servers could end up deadlocking if messages are processed while polling threads handle other device activity (e.g. I/O requests that cause DMA messages).
>
> Pipelining has the nice effect that the oldest message must complete before the next pipelined message starts. It imposes a maximum issue depth of 1. Still, it seems like it would be relatively easy to hit re-entrancy or deadlock issues since both the client and the server can initiate messages and may need to wait for a response.

My understanding is that this isn't any different from physical hardware. In vfio-user, messages have no dependencies, so a deadlock can only occur when the device is used improperly. @jlevon do you agree?

tmakatos commented 3 years ago

@jlevon ping

jlevon commented 3 years ago

I'm struggling to understand how this could be a spec issue at all. It seems like entirely an implementation-level thing. I'm thinking this is maybe WONTFIX?

tmakatos commented 3 years ago

I agree with you.