azteca1998 / masterpower-api

program hanging on buf_read #2

Closed wolffshots closed 1 year ago

wolffshots commented 1 year ago

Hi!

I know this codebase isn't in active development right now, so I'm sorry if I'm causing problems.

I've been working on implementing this for my Phocos inverters and have had some success. It was logging data (with my QPGS command) for a few hours and I could see it in my HA, but then it crashed with Error: StdError(RecvError(())). Now it seems to get stuck at this line on the first command it sends every time I run it.

Here's an example of when I run DeviceModeInquiry first:

 DEBUG masterpower_api::codec > Encoding command DeviceModeInquiry.
 TRACE masterpower_api::codec > Command payload (DeviceModeInquiry): ().
 TRACE masterpower_api::codec > Encoded command (DeviceModeInquiry): [81, 77, 79, 68, 73, 193, 13]
 TRACE masterpower_api::inverter > Writing command to stream
 TRACE masterpower_api::inverter > Command written successfully
 TRACE masterpower_api::inverter > Buffer contents before first read: b""
 TRACE masterpower_api::inverter > Buffer reserved
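For anyone reading the trace above: the encoded bytes can be checked by hand. The sketch below assumes the usual frame layout for this protocol family (command, two checksum bytes, carriage return) and assumes the checksum is CRC-16/XMODEM; both assumptions are consistent with the seven bytes in the log, where 73 and 193 are the checksum rather than part of the command text. The function names are hypothetical and are not part of masterpower-api.

```rust
/// CRC-16/XMODEM (polynomial 0x1021, initial value 0), computed bit by bit.
/// Assumption: this is the checksum the inverter protocol uses.
fn crc16_xmodem(data: &[u8]) -> u16 {
    let mut crc: u16 = 0;
    for &byte in data {
        crc ^= (byte as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 {
                (crc << 1) ^ 0x1021
            } else {
                crc << 1
            };
        }
    }
    crc
}

/// Splits a frame into (command bytes, checksum), assuming the layout
/// `<command> <crc high> <crc low> <CR>`. Returns None if the frame is
/// too short or does not end in a carriage return.
fn split_frame(frame: &[u8]) -> Option<(&[u8], u16)> {
    let (&last, rest) = frame.split_last()?;
    if last != b'\r' || rest.len() < 2 {
        return None;
    }
    let (command, crc) = rest.split_at(rest.len() - 2);
    Some((command, u16::from_be_bytes([crc[0], crc[1]])))
}
```

Applied to [81, 77, 79, 68, 73, 193, 13], split_frame yields the command QMOD and the checksum 0x49C1, which matches crc16_xmodem over the four command bytes. So the write side of the trace looks well-formed, and the hang is on the response.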

Have you encountered anything like this or do you have any idea what would cause it?
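One generic way to keep a silent device from blocking the program forever is to put a deadline on the read. The following is a minimal std-only sketch of that idea, not how masterpower-api reads its stream: a helper thread forwards bytes into a channel and the caller waits with a timeout. The names spawn_reader and read_frame are hypothetical, and the carriage-return terminator is an assumption based on the encoded command in the trace.

```rust
use std::io::Read;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

/// Spawns a thread that forwards every byte read from `port` into a channel.
/// The thread ends when the port reports EOF or an error, or when the
/// receiver has been dropped.
fn spawn_reader<R: Read + Send + 'static>(mut port: R) -> Receiver<u8> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut byte = [0u8; 1];
        while let Ok(1) = port.read(&mut byte) {
            if tx.send(byte[0]).is_err() {
                break;
            }
        }
    });
    rx
}

/// Collects bytes until a carriage return arrives or `timeout` elapses.
/// Bytes received before a timeout are discarded with the error.
fn read_frame(rx: &Receiver<u8>, timeout: Duration) -> Result<Vec<u8>, RecvTimeoutError> {
    let deadline = Instant::now() + timeout;
    let mut frame = Vec::new();
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        let byte = rx.recv_timeout(remaining)?;
        frame.push(byte);
        if byte == b'\r' {
            return Ok(frame);
        }
    }
}
```

With this shape the caller gets an error after the timeout and can decide to retry or reopen the port. The helper thread itself stays blocked inside read until the device sends something, so this only lets the caller move on; it does not unstick the port.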

wolffshots commented 1 year ago

I left it for a while and it seemed to get through a few commands and then get stuck again.

lluiscab commented 1 year ago

Hi, I did see something similar happen with my inverter and I'm starting to think that the serial implementation in whatever firmware these inverters run isn't very good.

Last summer I experienced lockups about every hour, and I ended up using an automation on HA to restart the Pi Zero where MPQTT was running if no data had been sent in the last 5 minutes. It seems like the serial port on the inverter will sometimes stop responding after a certain number of commands, and closing and reopening the port sometimes fixed it. This year it's been running without a problem for two straight weeks with the same code, so I'm not really sure what changed. I think my inverter must have restarted at some point in the winter and that "resolved" the issue?
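The "restart if nothing arrived in the last 5 minutes" rule can also live inside the program itself. A minimal sketch of that check, assuming a hypothetical Watchdog type (the automation described above runs in HA, not in Rust):

```rust
use std::time::{Duration, Instant};

/// Tracks when data last arrived and reports when the link looks stalled.
struct Watchdog {
    last_data: Instant,
    max_silence: Duration,
}

impl Watchdog {
    fn new(now: Instant, max_silence: Duration) -> Self {
        Self { last_data: now, max_silence }
    }

    /// Call whenever a response is received from the inverter.
    fn feed(&mut self, now: Instant) {
        self.last_data = now;
    }

    /// True once no data has arrived for longer than `max_silence`.
    fn is_stalled(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_data) > self.max_silence
    }
}
```

A polling loop would call feed after each successful response and, when is_stalled returns true, close and reopen the port or exit so that a supervisor restarts the process. Passing the current time in as a parameter keeps the check easy to test.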

Honestly, I'm a bit lost here but given that my setup currently works I'm actually a bit scared of restarting the Pi Zero or doing anything to it in case the inverter goes back to whatever was happening last year.

azteca1998 commented 1 year ago

I suspect it's either the inverter's fault (maybe the firmware is not prepared for a continuous connection? or maybe a proprietary "protection" mechanism?), wire interference (doesn't make much sense to me, if the crash occurs after a similar amount of time every time), or that we're missing something from the protocol (which is even less likely since we're using a standard serial connection with the protocol as explained in the inverter's documentation).

Maybe you could try updating the inverter's firmware? (no idea if there are updates) If you manage to find the cause or any more info, please do share it with us.

wolffshots commented 1 year ago

I managed to get it working again, somewhat. This is what I did:

  1. I opened the port in another serial reader that I wrote in Rust as practice for working with ESPs and other MCUs
  2. I started MPQTT, which errored with Device or resource busy, as you would expect
  3. I stopped my other serial reader
  4. I started MPQTT again and it worked flawlessly

I've added a stats sensor in the meantime so it can report how long each update takes, which should show whether there's a pattern or any outliers. I'll add more vars to the stats thing to get an idea of MPQTT's health and performance on my setup.

An automation to restart it if no data is sent in a certain window is a great idea. Unfortunately restarting the Pi didn't actually solve the problem, but I can probably set it up as an extra precaution.

I think I'm gonna rework the loop structure a bit to have a high priority loop (every 10 seconds or something) and a low priority loop (every 10 minutes or something) to reduce the number of commands being sent (especially things I don't need to track that badly, like QPIRI and QPIWS). I'll check about firmware versions and see if there's a changelog or something so I can see if it's a known issue.
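A rough sketch of what that two-tier loop could look like, assuming a hypothetical Poll struct and due_commands helper (nothing here is MPQTT's actual structure). Each command carries its own interval and the loop only sends what is due:

```rust
use std::time::{Duration, Instant};

/// A command polled at a fixed interval.
struct Poll {
    command: &'static str,
    interval: Duration,
    next_due: Instant,
}

/// Returns the commands due at `now` and reschedules each of them.
fn due_commands(polls: &mut [Poll], now: Instant) -> Vec<&'static str> {
    let mut due = Vec::new();
    for poll in polls.iter_mut() {
        if now >= poll.next_due {
            due.push(poll.command);
            poll.next_due = now + poll.interval;
        }
    }
    due
}
```

With QPGS on a 10 second interval and QPIRI and QPIWS on a 10 minute interval, the slow commands go out once for every 60 rounds of the fast one, which cuts the traffic on the serial port considerably.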

I'm leaning towards it just being a dodgy serial implementation, so I may just have to work around it rather than solve it, but if I discover more you two will be the first to know.

I'll close this issue for now and reopen it if it happens again or I have some more info.