jgyates / genmon

Generac (and other models) Generator Monitoring using a Raspberry Pi and WiFi
GNU General Public License v2.0

Everything working perfectly ... until the Generator Starts - then lose communication to generator #984

Closed eBoon123 closed 10 months ago

eBoon123 commented 10 months ago

Expected Behavior

Genmon and generator continue to communicate after generator starts

Actual Behavior

As soon as generator starts, Genmon web app says it has lost communication with generator - unable to see status updates. Monitor page shows time-outs

Steps to Reproduce (including precondition)

start generator

Your Environment

This is a brand new install - but I repeated my (successful) steps from my install on a Generac G0072280 18kw. I have followed every debug step I could find in the logs.

When installed, everything working perfectly:

As soon as the generator is started (we used the manual start from the Maintenance page), genmon loses communication with the generator (mostly timeouts on the Monitor page). We then have to stop the generator manually.

I have completed the installation twice, using a different Pi (Pi 3B+), SD card, and RS232 board each time. Only the wiring harness has remained constant.

I have just submitted logs and registers via the about page.

eBoon123 commented 10 months ago

Some Pics:

IMG_7149 IMG_7148 IMG_7147 IMG_7146

eBoon123 commented 10 months ago

Here is a picture of the monitor right after generator start - everything is frozen since packets have stopped:

IMG_7138

jgyates commented 10 months ago

Since you are having timeout errors that occur when the generator is starting or running, my first guess would be that you have a marginal connection in your cabling that is affected by the vibrations of the generator. I don't know where the potential weak point in your connection would be, but based on the picture you may want to look at the 9 pin connector. I notice that it is not screwed into the mating connector on the green screw terminal. Can you try holding this tightly together while starting and see if that corrects your issue?

I have seen issues like this before and it is usually a bad connection in the cable.

jgyates commented 10 months ago

I have not received your logs. Your outbound email needs to be set up and working to submit your logs via the About page.

eBoon123 commented 10 months ago

Just sent again after setting up outbound email

jgyates commented 10 months ago

I got your logs, it looks as if once the serial comms stopped working they never worked again, is that correct? You don't have any errors in your serial device log (/var/log/myserial.log) which is good. All of your errors appear to be timeout errors. Both of these indicate that genmon is transmitting but getting nothing back. This points to either an issue with your cable or your RS-232 converter board / cable or the connection to the controller.

You can try the loopback test described here:

https://github.com/jgyates/genmon/wiki/3.6---Serial-Troubleshooting

Also, since you are running bookworm you will need to activate and deactivate the environment before running the loopback test. See this page for details on that: https://github.com/jgyates/genmon/wiki/Appendix-S---Working-in-a-Managed-Python-Environment
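Whether the managed-environment step applies depends on which OS release the Pi is running. A minimal sketch for checking it, assuming the standard `KEY="value"` layout of `/etc/os-release`; the function names `parse_os_release` and `is_bookworm` are illustrative and not part of genmon:

```python
def parse_os_release(text: str) -> dict:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def is_bookworm(text: str) -> bool:
    """True when VERSION_CODENAME names the bookworm release."""
    return parse_os_release(text).get("VERSION_CODENAME") == "bookworm"


if __name__ == "__main__":
    with open("/etc/os-release") as handle:
        print("bookworm" if is_bookworm(handle.read()) else "not bookworm")
```

This only inspects the codename; later releases may need the same managed-environment handling, which the sketch does not attempt to decide.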

You can perform the loopback at the Pi header, at the RS-232 board, or at the end of the cable, which lets you test each section of the communication path separately.
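The idea behind a loopback test can be sketched as follows. This is not the script from the genmon wiki, only an illustration: with TX jumpered to RX at some point in the path, whatever is written should be read back unchanged. The names `loopback_ok` and `FakeLoopback` are hypothetical, and the fake port stands in for real hardware so the sketch runs anywhere.

```python
class FakeLoopback:
    """In-memory stand-in for a serial port whose TX is jumpered to RX."""

    def __init__(self, connected: bool = True):
        self.connected = connected
        self.buffer = bytearray()

    def write(self, data: bytes) -> int:
        # A broken path silently drops the bytes, as a real open circuit would.
        if self.connected:
            self.buffer.extend(data)
        return len(data)

    def read(self, size: int) -> bytes:
        chunk = bytes(self.buffer[:size])
        del self.buffer[:size]
        return chunk


def loopback_ok(port, payload: bytes = b"loopback test") -> bool:
    """Write payload and check that the same bytes come back."""
    port.write(payload)
    return port.read(len(payload)) == payload


if __name__ == "__main__":
    print(loopback_ok(FakeLoopback(connected=True)))
    print(loopback_ok(FakeLoopback(connected=False)))
```

With real hardware, an object exposing the same `write` and `read` methods could be passed in instead, for example a pyserial `serial.Serial` opened with a read timeout. The device path and baud rate to use there depend on the installation and are not specified in this thread. Moving the jumper from the Pi header to the RS-232 board and then to the end of the cable tests one more section of the path each time.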

eBoon123 commented 10 months ago

Actually running bullseye:

   $ cat /etc/os-release
   PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
   NAME="Debian GNU/Linux"
   VERSION_ID="11"
   VERSION="11 (bullseye)"
   VERSION_CODENAME=bullseye
   ID=debian
   HOME_URL="https://www.debian.org/"
   SUPPORT_URL="https://www.debian.org/support"
   BUG_REPORT_URL="https://bugs.debian.org/"

I have run the loopback from the RS232 board and from the end of the cable - all good. I know that's not the issue - genmon will run for hours without a single timeout or missed packet ... until generator start.

Yes - once the comms stop they do not come back.

Thank you

jgyates commented 10 months ago

Then I have not received your log files.

jgyates commented 10 months ago

Apparently the logs I received are from another new install (install date was yesterday) that does not have its cable connected, because the logs I received show timeout errors.

jgyates commented 10 months ago

OK. I have them now.

jgyates commented 10 months ago

Do they ever come back? If so, when do they come back? Having your comms stop in a very repeatable way indicates one type of failure; having them stop once and never start back is another type of behavior.

eBoon123 commented 10 months ago

To be honest, I'm not sure. I pulled everything after we manually shut it down. I can re-attach and see what happens.

thanks

eBoon123 commented 10 months ago

OK - reattached to generator and here's what I have to report:

  1. Genmon comes up and everything works perfectly - no timeouts or missed packets. I took this opportunity to pull and tug on the signal wires from the controller, the connection to the original plug and the RS232 break-out as I watched the Monitor screen. I could not induce any timeouts or missed packets. I think this verifies that the problem is likely NOT loose signal wires - the force of my pulling, tugging and twisting would be way beyond anything a running engine would do.
  2. Started the generator from the Maintenance page. The red exclamation mark (!) came up at the bottom of the page telling me communication lost. Unable to stop the generator from the Maintenance page and had to manually shut the generator off.
  3. Timeouts continue to go up. The M: in packet count continues to increase, but the S: in packet count does not change
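The pattern in step 3 can be read mechanically. A minimal sketch, assuming that M counts packets sent by genmon (the Modbus master) and S counts replies from the controller (the slave); the function name `classify_link` and the result strings are illustrative only:

```python
def classify_link(before: dict, after: dict) -> str:
    """Compare two samples of the M and S packet counters taken some time apart."""
    sent = after["M"] - before["M"]
    replies = after["S"] - before["S"]
    if sent <= 0:
        return "genmon is not transmitting"
    if replies <= 0:
        return "transmitting, but no replies are arriving"
    if replies < sent:
        return "some replies are missing"
    return "link looks healthy"


if __name__ == "__main__":
    # Counter values here are made up for illustration.
    print(classify_link({"M": 100, "S": 100}, {"M": 130, "S": 100}))
```

An M count that rises while S stays flat matches the reading given earlier in the thread: genmon is transmitting but getting nothing back.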

Hopefully that helps. I can send logs and/or registers if you need them.

Thank you again for your help!!!

jgyates commented 10 months ago

yes, please send the logs again.

jgyates commented 10 months ago

Another thing to try is this:

Get genmon to work again. Then log in via ssh, start the generator, and restart genmon on the console with this command:

   cd genmon
   ./startgenmon.sh restart

Then wait about 30 seconds and refresh your browser. Does genmon come back and start communicating?

I am trying to determine whether a power cycle is required to get it working again, or whether restarting the software is enough.

Are you running any plugins? If yes, try reproducing it without the plugins active. This will reduce the number of variables.

eBoon123 commented 10 months ago

Logs sent.

No Plugins.

I had left genmon in a non-working state since my last test. After the restart of genmon (./startgenmon.sh restart), it is still NOT working - 100% timeouts. I'm sure that if I were to reboot the Pi, it would come back up and run fine (I am waiting to do that until I get more feedback from you). I know that's not quite the test you asked for, but I am remotely supporting this generator and need to coordinate with the homeowner if I am going to remote start the generator, since they may have to go back outside and manually turn it off.

jgyates commented 10 months ago

Yes, just pull the power from the pi, wait 5 seconds, then plug it back in. Let me know if that brings it back to life.

Question, how are you powering the pi?

eBoon123 commented 10 months ago

Battery. Using the adapter from the parts list in the Wiki (https://www.amazon.com/gp/product/B01DYE54LI/ref=oh_aui_detailpage_o00_s00?ie=UTF8&th=1).

eBoon123 commented 10 months ago

sudo shutdown now - did not fix

pulled the plug - did fix. Genmon happily chugging along.
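The three recovery attempts so far (restarting genmon, a software shutdown, and pulling the plug) form an escalation ladder, and the first step that restores comms hints at where the stuck state lives. A minimal sketch of that reasoning; the step names and conclusions are an interpretation of this thread, not genmon documentation:

```python
from typing import Dict, Optional

# Recovery actions in escalating order, each paired with what a recovery
# at that step would suggest. All wording here is illustrative.
RECOVERY_LADDER = [
    ("restart genmon", "software state inside genmon"),
    ("software shutdown", "operating system or serial driver state"),
    ("power cycle", "hardware state that only clears when power is removed"),
]


def suspected_layer(results: Dict[str, bool]) -> Optional[str]:
    """Return the conclusion for the first action that restored comms."""
    for action, conclusion in RECOVERY_LADDER:
        if results.get(action, False):
            return conclusion
    return None  # nothing tried so far has restored comms


if __name__ == "__main__":
    observed = {
        "restart genmon": False,
        "software shutdown": False,
        "power cycle": True,
    }
    print(suspected_layer(observed))
```

One possible explanation for the difference between the last two steps, offered as an assumption: the Pi's 5V header pin typically stays live after a software shutdown, so a board powered from it would not be reset until the plug is pulled.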

jgyates commented 10 months ago

Does the RS-232 board have LEDs on it that flash when transmitting or receiving? If it does, do they still flash when it is not working?

also, we could try this:

Get it to fail again, then unplug the connections to the RS-232 board (while the pi is powered), wait 15 seconds or so, then plug them back in to see if this gets it working again. If this corrects the issue, it would isolate the problem to the RS-232 board. If you do this, make sure you plug everything in as it was before (e.g. don't mix up power and ground when plugging back in). When you plug it back in, you can either plug all four wires in at the same time or plug them in separately. If you do it separately, plug in ground first, then TX and RX, then power last.

eBoon123 commented 10 months ago

No flashing LEDs, just a single LED that stays lit.

I will try the other suggestion tomorrow and report back.

Thank you again for your help!

jgyates commented 10 months ago

My current best guess is that the battery voltage dips when the generator starts and the 5V output of the DC-to-DC converter in the power supply dips with it. The pi has capacitors that guard against this, but the capacitors on the RS232 board may be bad. Do the test above first. After that, you can try powering the RS232 board from 3.3V instead of 5V. Most of the boards support this, and it may be more resilient to a dip in the supply voltage.

eBoon123 commented 10 months ago

Will give this a try in the morning. For testing purposes, could I power the Pi/RS232 from a battery bank to separate it from the battery in the generator - or would that be a problem with separate grounds?

One other question, is it more/less reliable to power the Pi from the controller instead of the battery?

Thank you

jgyates commented 10 months ago

That would be a problem with the grounds. You could power from the molex connector as an alternative. More info on that is on this page: https://github.com/jgyates/genmon/wiki/3.1--Making-a-Cable

jgyates commented 10 months ago

If you have an Evolution 2 you have more power options on the connector. If you are just using a pi and don't have any USB peripherals, it may be fine to use the molex. Battery is generally better in my view, but molex can work in many cases. Do you have any battery warning messages in your generator log?

eBoon123 commented 10 months ago

Some good news to report. I moved to a 3.3V pin to power the RS232 board and was able to start the generator and not lose the connection! Was able to start and STOP the generator from the Web Page! I want to do some additional testing tomorrow to make sure that I can do this a few times to make sure it's now stable, but looks very promising.

This validates your theory that the RS232 board was getting messed up! YAY!

Thank you for all of your help and ideas to troubleshoot this issue. Genmon is awesome and so are you!

jgyates commented 10 months ago

That is great, I am glad it is working for you. Let me know how your testing goes. I am going to close this thread but please report any follow up status or questions to this thread.

eBoon123 commented 9 months ago

Unfortunately, even after moving the RS232 board's power to the 3.3V pin on the pi, we are back to a timeout after the generator starts. I have replaced the RS232 board with a new one (exact same model) and am seeing the same symptoms - works 100% right up until the generator starts. Is powering the pi (I have a Pi 3B+) directly from the controller board a better power source than the battery? This is the RS232 board I am using: https://www.amazon.com/gp/product/B00LPK0Z9A/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1

One other important note is that I am occasionally seeing the Pi undervoltage set to yes - not always and not in this latest failure that I am looking at today.

Any thoughts on what I should try next? I have a Raspberry Pi Zero lying around somewhere - is that a better (lower power draw) option?

Thank you again for your help!!

jgyates commented 9 months ago

You could try powering from the battery, or you could try the pi zero, whichever is easier for you to try. I have seen a couple of people who had power problems when powering from the controller. The fact that you periodically see undervoltage errors is not a good sign. You should never see that, and if you do, it means that your power supply is inadequate. Your power supply is performing in the marginal to inadequate range, so this could be causing your errors, as the RS232 board relies on a stable power source to maintain the ~12V for RS-232 comms via charge pump capacitors.

If the pi zero does not solve the issue then you should change your power supply to the battery.
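The undervoltage condition discussed above can also be checked directly on the Pi. A minimal sketch that decodes the output of the Raspberry Pi firmware command `vcgencmd get_throttled`, which reports a bit field covering both current and past undervoltage events. The function names are illustrative, and genmon's own undervoltage check may be implemented differently:

```python
import subprocess
from typing import List

# Bit positions in the value reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output: str) -> List[str]:
    """Turn a line such as 'throttled=0x50005' into a list of active flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [label for bit, label in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


def read_throttled() -> str:
    """Run vcgencmd on the Pi itself and return its raw output line."""
    result = subprocess.run(["vcgencmd", "get_throttled"],
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    for flag in decode_throttled(read_throttled()):
        print(flag)
```

Running this right after a generator start would show whether an undervoltage event was recorded during cranking, even if the live flag has already cleared.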