NordicPlayground / nRF51-ble-bcast-mesh

Other
323 stars 121 forks source link

m_state.queue_saturation stuck at true #78

Open victorpasse opened 8 years ago

victorpasse commented 8 years ago

Hi, I have changed CACHE_TASK_FIFO_SIZE to 64 and have ASYNC_EVENT_FIFO_QUEUE_SIZE at 8 When I change many handles at once m_state.queue_saturation get stuck to true and never goes to false. I tries increasing ASYNC_EVENT_FIFO_QUEUE_SIZE to 32 and this helped. What is the required proportion between all chaches? There sould be a compile time check if this is the problem

trond-snekvik commented 8 years ago

Hi Victor, another good question.

Increasing the size of the async event queue only really mitigates the problem, as it allows you to avoid the queue saturation. The queue saturation functionality is really an optimalization, and assumes that at least one of the queued events is a packet. This assumption used to hold with good margins until the cache tasks were introduced, but the independently sized cache task fifo really highlights this design flaw.

This issue needs to be fixed properly, as it is almost impossible to estimate the right balance. The only way to be truly safe is to keep the async-event fifo at the same length as the cache task fifo, or disable the queue saturation in the transport_control module (never set it to true). I hope I can push a proper fix for this and other issues reported since the last release before the end of the month.

victorpasse commented 8 years ago

Thanks for the information, Will disabling the queue saturation result in other drawbacks?

trond-snekvik commented 8 years ago

You might see the async queue filling up faster, as the radio won't back off to allow other processing tasks when it detects that the queue is full. This may cause some performance issues in the version handler.

I think the "correct" fix for this problem is to stop queuing each cache task in the async-queue. This is only really done to get a unified context, and any duplicate cache events in the async queue are a waste of valuable queue space. To fix this issue and your issue #74, I've pushed a commit changing the cache_task_push() function to something a little safer and more conservative. I believe this should relieve your async queue significantly. While the queue saturation functionality still needs to be adressed separately in the next relase, I believe this should bring the previously mentioned assumption back into reasonable-land. Please report back with results :)