Open bobveringa opened 1 month ago
I'm not sure I understand the scenario. Does this occur when connectivity is lost? If so, I'd suggest the following:
`OSError` may be thrown at many points in the chain, including in the TLS protocol. That is why, in the published code, the response to an `OSError` is to down the connection and reconnect.

The error occurs when explicitly disconnecting from the broker using `client.disconnect()`, which is why I started to investigate it more in depth. I wouldn't expect these types of errors in the case of a controlled disconnect from the broker.
But it may very well be normal. If these errors are expected, perhaps a comment could be added noting that this is one of them?
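The distinction being discussed can be sketched in a few lines. This is purely illustrative (the class and method names are hypothetical, not the library's actual code): the published recovery path treats any `OSError` the same way, by downing the link and reconnecting, while on a deliberate `disconnect()` one might expect teardown errors from the transport to be benign and suppressible.

```python
# Hypothetical sketch contrasting the two cases: recovery on OSError
# during normal operation vs. a controlled disconnect. Not the library's
# real implementation.
class Transport:
    def __init__(self):
        self.open = True

    def read(self):
        if not self.open:
            raise OSError("transport gone")  # e.g. socket/TLS failure
        return b"payload"

    def close(self):
        self.open = False


class Client:
    def __init__(self):
        self.transport = Transport()
        self.reconnects = 0

    def read(self):
        # Published recovery strategy: ANY OSError, wherever it surfaces
        # in the chain, downs the connection and reconnects.
        try:
            return self.transport.read()
        except OSError:
            self.transport = Transport()  # reconnect with a fresh link
            self.reconnects += 1
            return None

    def disconnect(self):
        # Controlled disconnect: errors raised while tearing the link
        # down are arguably expected and could be suppressed.
        try:
            self.transport.close()
        except OSError:
            pass


c = Client()
assert c.read() == b"payload"
c.transport.open = False          # simulate lost connectivity
assert c.read() is None and c.reconnects == 1
c.disconnect()                    # completes without raising
```

Whether suppressing transport errors during `disconnect()` is actually safe for TLS is exactly the open question in this thread.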
Now I understand. I don't know the answer, however. I have limited experience of TLS.
@bobveringa I have pushed a minor change to the `read_allocate` branch - this was a PR providing a fix to issue 132. It seems that some brokers briefly emit a password fail error prior to a successful connection.

On the big picture, how is your testing progressing? Do you think the time is right to merge `read_allocate` - or should I wait?
Testing has been going well. I am pretty confident in the performance and stability of this solution, as I've managed to solve most of the weird problems we have been experiencing (they were related to some odd memory problem; I was unable to find its root cause, which still bugs me).
Unfortunately, I've not been able to roll out the solution to our fleet of internal test sensors for testing. This testing has been delayed a couple of times due to other projects. Hope I have time to do a large-scale test within the next couple of weeks. But these things take more time than I can allocate at the moment.
To summarize: I feel confident that `read_allocate` is stable, but I don't have the data to back this up yet. For me, it is 50/50 whether to merge this, but I don't know how much testing of your own you have done.
Thanks for the feedback. I'll wait until your large scale test is complete.
Just wanted to provide a quick update.
I just started with a large-scale test and the intermediate results are looking promising. We have some tests that are much more performant with pre-allocated buffers, and tests also show that we can deal with larger messages than before.
The test is executed with the following settings:
`IBUFSIZE = 5600`
`MSG_BYTES = False`
Excellent, thanks for the feedback.
After some more extensive testing, we have not run into any issues with the pre-allocated buffers. We think that in our case, performance has significantly improved in certain situations (such as attempting to stream files over MQTT).
I feel fairly confident that this code is stable and can be merged. It should be noted that we set the buffer to a large size ahead of time, so not sure how this would affect people who sometimes receive large buffers.
Thank you for the confirmation. I'll document `IBUFSIZE` and `MSG_BYTES` and merge it.
> not sure how this would affect people who sometimes receive large buffers.
I will document how the mechanism works, making the point that setting a large buffer size is an insurance policy against a large allocation at run time (when RAM might be fragmented).
I suspect most applications use short messages, and my testing has concentrated on this usage. I've never attempted to stream files over MQTT. You might want to add a short doc explaining what you've learned about doing this.
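The "insurance policy" idea above can be illustrated with a short sketch. The names here (`IBUFSIZE`, `read_message`) are illustrative and do not reflect the branch's real API: a single buffer is allocated once at startup, and incoming payloads are read into views of it with `readinto()`, so no large allocation is attempted at run time when the heap may already be fragmented.

```python
# Illustrative sketch of pre-allocation: one fixed buffer, allocated up
# front, reused for every incoming message via memoryview slices.
import io

IBUFSIZE = 5600                      # allocated once, at startup
ibuf = bytearray(IBUFSIZE)

def read_message(stream, length):
    """Read `length` bytes into the pre-allocated buffer (hypothetical helper)."""
    if length > IBUFSIZE:
        # A message larger than the buffer cannot be received.
        raise ValueError("message exceeds pre-allocated buffer")
    mv = memoryview(ibuf)[:length]   # a view, not a new allocation
    n = stream.readinto(mv)
    return mv[:n]

msg = read_message(io.BytesIO(b"hello world"), 11)
assert bytes(msg) == b"hello world"
```

The trade-off is the one noted in the thread: the buffer must be sized for the largest message the application will ever receive, and that RAM is committed whether or not large messages arrive.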
Now merged, with docs.
While investigating an issue, I added some extra debug prints to most error handlers.
This results in the following output on disconnect.
As with pretty much all MBEDTLS errors, it is not clear what is happening, but errors like this cannot be good. Do you have any idea what is going on here? I'm happy to investigate this error further and see if there is a solution.
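One aid to investigating this: in my experience, mbed TLS errors surface in MicroPython as `OSError` with a negative errno equal to the mbed TLS error code, so translating the code makes debug prints much more readable (this is an assumption about the port being used; verify against your firmware). The two codes below are from the mbed TLS headers; notably, `-0x7880` (`MBEDTLS_ERR_SSL_PEER_CLOSE_NOTIFY`) is a *normal* event, the peer's TLS `close_notify` alert, which is expected on a clean disconnect.

```python
# Sketch of a lookup table for decoding mbed TLS error codes surfaced as
# negative OSError values. Only two well-known codes are shown; a real
# table would be generated from the mbed TLS headers.
MBEDTLS_ERRORS = {
    -0x7880: "MBEDTLS_ERR_SSL_PEER_CLOSE_NOTIFY (clean TLS shutdown)",
    -0x7780: "MBEDTLS_ERR_SSL_FATAL_ALERT_MESSAGE",
}

def describe(exc: OSError) -> str:
    code = exc.args[0] if exc.args else None
    return MBEDTLS_ERRORS.get(code, f"unknown TLS error {code}")

assert "CLOSE_NOTIFY" in describe(OSError(-0x7880, "tls error"))
```

If the error seen on `disconnect()` decodes to `PEER_CLOSE_NOTIFY`, it would support the idea that this is a benign, expected condition rather than a fault.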