openenergymonitor / emonhub

Python service linking and decoding input to MQTT & Emoncms
90 stars 83 forks source link

VEDirect Error over new emoncms installation #160

Open maumauns opened 3 years ago

maumauns commented 3 years ago

Hi all, I need help with these error on vedirect. All was working good until I updated to emonSD-08May21 I’m getting checksum errors, but it doenst happen with older version of emonSD Any help??

``Traceback (most recent call last): File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 135, in _read_serial packet = self.input(self._ser.read()) File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 61, in input if self.key.decode() == 'Checksum': UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 1477: invalid start byte 2021-09-05 05:04:33,570 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB0', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-09-05 05:04:33,573 ERROR VEDirect Read error Traceback (most recent call last): File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 135, in _read_serial packet = self.input(self._ser.read()) File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 61, in input if self.key.decode() == 'Checksum': UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 1477: invalid start byte 2021-09-05 05:04:38,597 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB0', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-09-05 05:04:38,599 ERROR VEDirect Read error Traceback (most recent call last): File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 135, in _read_serial packet = self.input(self._ser.read()) File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 61, in input if self.key.decode() == 'Checksum': UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 1477: invalid start byte 2021-09-05 05:04:43,687 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB0', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-09-05 05:04:43,689 ERROR VEDirect Read error Traceback (most recent call last): File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 135, in _read_serial packet = self.input(self._ser.read()) File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 61, in input if self.key.decode() == 'Checksum': UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 1477: invalid start byte

LOG LEVEL: DEBUG #(DEFAULT:WARNING) ``

eroyee commented 3 years ago

Also reporting a similar error here.

In my case the VEDirect data is collected and processed OK for a minute or two, then it fails per this log extract:

2021-11-13 11:05:47,764 INFO MainThread EmonHub emonHub (emon-pi variant) v2.2.6 2021-11-13 11:05:47,764 INFO MainThread Opening hub... 2021-11-13 11:05:47,764 WARNING MainThread Setting emoncmsorg apikey: obscured 2021-11-13 11:05:57,827 ERROR VEDirect Invalid checksum, discarding data 2021-11-13 11:09:08,467 ERROR VEDirect Invalid checksum, discarding data 2021-11-13 11:09:48,586 ERROR VEDirect Read error Traceback (most recent call last): File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 135, in _read_serial packet = self.input(self._ser.read()) File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 61, in input if self.key.decode() == 'Checksum': UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 3: invalid start byte 2021-11-13 11:09:58,627 ERROR VEDirect Read error

etc etc...

Presently this is running on an i7/Ubuntu 18.04 (fully up-to-date), and the emonhub installation is as of yesterday.

I've tried three different Victron devices, and two different cables direct to the PC, it may be a little worse if I insert a USB hub but I've not really looked into that in depth. I have zero read issues using direct serial to emonhub from an Arduino.

HTH, happy to supply further details or trial anything as needed. If I happen to run across a solution I'll follow up.

eroyee commented 3 years ago

Updated information with debug log. Not sure it helps much but this shows the last correct frame, followed by the failure (which remains at issue thereafter):

2021-11-13 12:09:38,592 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB2', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-11-13 12:09:38,594 DEBUG VEDirect 61 NEW FRAME : {'PID': '0xA057', 'FW': '154', 'SER#': 'HQ2035I1QX1', 'V': '13540', 'I': '10400', 'VPV': '35490', 'PPV': '144', 'CS': '3', 'MPPT': '2', 'OR': '0x00000000', 'ERR': '0', 'LOAD': 'OFF', 'H19': '15150', 'H20': '47', 'H21': '450', 'H\x0022': '169', 'H23': '593', 'HSDS': '195'} 2021-11-13 12:09:38,594 DEBUG VEDirect 61 Timestamp : 1636758578.594421 2021-11-13 12:09:38,594 DEBUG VEDirect 61 From Node : 9 2021-11-13 12:09:38,594 DEBUG VEDirect 61 Values : [13.540000000000001, 35.49, 144, 10.4] 2021-11-13 12:09:38,594 DEBUG VEDirect 61 Sent to channel(start)' : ToEmonCMS 2021-11-13 12:09:38,594 DEBUG VEDirect 61 Sent to channel(end)' : ToEmonCMS 2021-11-13 12:09:38,727 DEBUG MQTT Publishing: emon/SmartSolarMPPT/V 13.540000000000001 2021-11-13 12:09:38,728 DEBUG MQTT Publishing: emon/SmartSolarMPPT/VPV 35.49 2021-11-13 12:09:38,728 DEBUG MQTT Publishing: emon/SmartSolarMPPT/PPV 144 2021-11-13 12:09:38,729 DEBUG MQTT Publishing: emon/SmartSolarMPPT/I 10.4 2021-11-13 12:09:38,729 INFO MQTT Publishing 'node' formatted msg 2021-11-13 12:09:38,729 DEBUG MQTT Publishing: emonhub/rx/9/values 13.540000000000001,35.49,144,10.4 2021-11-13 12:09:48,395 DEBUG emoncmsorg Buffer size: 3 2021-11-13 12:09:48,614 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB2', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-11-13 12:09:48,618 ERROR VEDirect Read error Traceback (most recent call last): File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 135, in _read_serial packet = self.input(self._ser.read()) File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 1: invalid start byte 2021-11-13 12:09:58,638 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB2', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-11-13 12:09:58,639 ERROR VEDirect Read error Traceback (most recent call last): File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 135, in _read_serial packet = self.input(self._ser.read()) File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 1: invalid start byte

eroyee commented 3 years ago

Alerted by @maumauns comment re previous version I reverted the 23rd Sept changes to EmonHubVEDirectInterfacer.py from @bwduncan (https://github.com/openenergymonitor/emonhub/commit/93a451960165b6e7bc28c6482b57d1184f1bc477#diff-4bcb5856739a061946d2eaaa9e336658ef55b4b0ce8e83b8e4363b95d32c3e6b).

This version still logs the decode error after a period of time, but it appears to (eventually) sort itself and carry on. Here's a log excerpt, note it was working fine for approx 14mins prior to this, and subsequently for a further 20mins (~as of time of writing - it continues ok right now):

2021-11-13 13:34:31,163 ERROR VEDirect 'utf-8' codec can't decode byte 0x86 in position 0: invalid start byte 2021-11-13 13:34:41,206 ERROR VEDirect 'utf-8' codec can't decode byte 0xb3 in position 0: invalid start byte 2021-11-13 13:34:51,254 ERROR VEDirect 'utf-8' codec can't decode byte 0x97 in position 0: invalid start byte 2021-11-13 13:35:01,280 ERROR VEDirect 'utf-8' codec can't decode byte 0x86 in position 0: invalid start byte 2021-11-13 13:35:11,305 ERROR VEDirect 'utf-8' codec can't decode byte 0xb1 in position 0: invalid start byte 2021-11-13 13:35:21,331 ERROR VEDirect 'utf-8' codec can't decode byte 0xbe in position 0: invalid start byte 2021-11-13 13:35:31,357 ERROR VEDirect 'utf-8' codec can't decode byte 0xc8 in position 0: unexpected end of data 2021-11-13 13:35:41,378 ERROR VEDirect 'utf-8' codec can't decode byte 0xca in position 0: unexpected end of data 2021-11-13 13:35:51,413 ERROR VEDirect 'utf-8' codec can't decode byte 0x8f in position 0: invalid start byte 2021-11-13 13:36:01,463 ERROR VEDirect 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte 2021-11-13 13:36:11,502 ERROR VEDirect 'utf-8' codec can't decode byte 0xc3 in position 0: unexpected end of data 2021-11-13 13:36:21,529 ERROR VEDirect 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte 2021-11-13 13:36:31,558 ERROR VEDirect 'utf-8' codec can't decode byte 0x8b in position 0: invalid start byte 2021-11-13 13:36:41,583 ERROR VEDirect 'utf-8' codec can't decode byte 0x81 in position 0: invalid start byte 2021-11-13 13:36:51,616 ERROR VEDirect 'utf-8' codec can't decode byte 0x84 in position 0: invalid start byte

UPDATE: Worked ok for about 40mins, then failed again with the same error and has continued to fail for > 10mins. Restarting emonhub often fixes it, but not this time. Here is an example of present serial output from the MPPT:

[14:48:54:340] Checksum⇥ <0xa9>␍␊ [14:48:55:150] PID⇥ 0xA057␍␊ [14:48:55:150] FW⇥ 154␍␊ [14:48:55:150] SER#⇥ HQ2035I1QX1␍␊ [14:48:55:166] V⇥ 13490␍␊ [14:48:55:166] I⇥ 6600␍␊ [14:48:55:166] VPV⇥ 34990␍␊ [14:48:55:182] PPV⇥ 92␍␊ [14:48:55:182] CS⇥ 4␍␊ [14:48:55:182] MPPT⇥ 2␍␊ [14:48:55:198] OR⇥ 0x00000000␍␊ [14:48:55:198] ERR⇥ 0␍␊ [14:48:55:198] LOAD⇥ OFF␍␊ [14:48:55:214] H19⇥ 15195␍␊ [14:48:55:214] H20⇥ 92␍␊ [14:48:55:214] H21⇥ 569␍␊ [14:48:55:214] H22⇥ 169␍␊ [14:48:55:230] H23⇥ 593␍␊ [14:48:55:230] HSDS⇥ 195␍

eroyee commented 3 years ago

Today, as part of my fault-finding process, I tried running another python-based vedirect decoder from @karioja (https://github.com/karioja/vedirect).

This code will just print to std out, or send to MQTT, presently it's printing to std out and has done for quite some time - longer at least than the emonhub VEDirect interfacer has on this same machine.

This prompted me to see if I could modify the VEDirect interfacer code to simply print to the console, and from there perhaps help identify the source of the problem. However I'm an old guy so the object style coding and association with EmonHubInterfacer has defeated me for the moment. I'm sure it'd not be difficult to isolate & run it standalone, and if someone was able to assist with that I'd be keen to hear from you.

At this time, bearing in mind my limited understanding of the code (and limited ability!) my feeling is that there is possibly some corruption of incoming data, or failure to properly clear a buffer that eventually presents the issue. Whatever it is doesn't necessarily reset immediately upon restarting emonhub or re-plugging the USB port, although it often will.

Finally, it remains possible there's some other process running on this machine that's causing the problem, and I suppose it may not even have been this particular section of emonhub at issue (although it seems the logical place to start), however:

-karioja's code runs fine with the Victron MPPT -emonhub direct serial operates flawlessly over a decent number of hours (via same USB port as the MPPT)

So I think it worth concentrating here for the moment...

bwduncan commented 3 years ago

Hi,

Unfortunately I don't have one of these devices so I can't test the code. I'm sorry that it appears my changes to support the transition away from python 2 have broken this interfacer. I will do my best to help clean this up.

However, it is possible that the data was always wrong and we are now seeing errors instead of data just being silently dropped. For example, in the first traceback posted by @maumauns the decoding of the key fails at byte 1477. Do we believe that a >1.44kb message is valid data? A normal message is 78 bytes...

I think one obvious explanation is that the data is not UTF-8. It looks like most of it is: It's simple human-readable text, separated by tabs and newlines. However, the checksum byte is not. It's a computed value so if it's happens to be >127 then that's not valid UTF-8, and the code will fail. That's why the code doesn't trying to decode the "Checksum" value, but if the data is corrupted, it might try to decode a "CHecksum" value, for example... This seems unlikely, though.

We need to get more information about what's happening. If you are comfortable in python, please can you catch the UnicodeDecodeError exception and get the values of the local variables? You could do this by replacing line 135, in _read_serial, so that this line:

                packet = self.input(self._ser.read())

becomes this

                try:
                    packet = self.input(self._ser.read())
                except UnicodeDecodeError:
                    import traceback, sys
                    frame = sys.exc_info()[2]
                    formattedTb = traceback.format_tb(frame)
                    frame = frame.tb_next
                    while frame:
                        self._log.debug("%s\t%s", formattedTb.pop(0), frame.tb_frame.f_locals)
                        frame = frame.tb_next

You'll need to have debug logging turned on, but it looks like you already do. This will generate a lot of output.

FWIW, I have successfully used the code in emonhub as it is now to decode the example dump file in https://github.com/karioja/vedirect/blob/master/test/bluesolar_1.23.dump 8 messages have checksum errors, the rest display data.

Bruce

bwduncan commented 3 years ago

Also, can you confirm whether this is running under python 2 or 3?

eroyee commented 3 years ago

Hello Bruce, thanks for responding.

I've altered the code as you request and have turned back on debug logging. It's just come up with an Invalid checksum error, but the Unicode failure has yet to occur.

Prior to this I'd done some further messing around with various parameters, and a little in the code itself. Because of its intermittent nature I can't be completely sure but I have an idea the 'poll interval' may have an effect. When I originally came across the issue I had that set to '10', subsequently I increased it to 15 and while it still gave errors these didn't seem to appear as often, and it didn't crash completely with the Unicode issue.

Interesting you get the checksum errors from that dump file. I ran karioja's code, publishing to MQTT, for several hours yesterday and it didn't have any issue, which led me to trial a couple of things in your code (which I'd describe as force clearing the packet and buffer), but this didn't appear to have any effect.

Re your other question - on my system python defaults to 2.7.17, do you think it worth forcing python 3 (3.6.9 for me)?

eroyee commented 3 years ago

@bwduncan it's just crashed, here's a copy of the log with a preceding good frame:

2021-11-15 08:16:21,853 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB1', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-11-15 08:16:21,858 DEBUG VEDirect 66 NEW FRAME : {'PID': '0xA057', 'FW': '154', 'SER#': 'HQ2035\x00I1QX1', 'V': '13380', 'I': '7400', 'VPV': '36560', 'PPV': '101', 'CS': '3', 'MPPT': '2', 'OR': '0\x00x00000000', 'ERR': '0', 'LOAD': 'OFF', 'H19': '15326', 'H20': '8', 'H21': '100', 'H22': '97', 'H23': '500', 'HSDS': '197'} 2021-11-15 08:16:21,859 DEBUG VEDirect 66 Timestamp : 1636917381.858685 2021-11-15 08:16:21,859 DEBUG VEDirect 66 From Node : 9 2021-11-15 08:16:21,859 DEBUG VEDirect 66 Values : [13.38, 36.56, 101, 7.4] 2021-11-15 08:16:21,859 DEBUG VEDirect 66 Sent to channel(start)' : ToEmonCMS 2021-11-15 08:16:21,860 DEBUG VEDirect 66 Sent to channel(end)' : ToEmonCMS 2021-11-15 08:16:21,954 DEBUG MQTT Publishing: emon/SmartSolarMPPT/V 13.38 2021-11-15 08:16:21,955 DEBUG MQTT Publishing: emon/SmartSolarMPPT/VPV 36.56 2021-11-15 08:16:21,956 DEBUG MQTT Publishing: emon/SmartSolarMPPT/PPV 101 2021-11-15 08:16:21,956 DEBUG MQTT Publishing: emon/SmartSolarMPPT/I 7.4 2021-11-15 08:16:21,956 INFO MQTT Publishing 'node' formatted msg 2021-11-15 08:16:21,957 DEBUG MQTT Publishing: emonhub/rx/9/values 13.38,36.56,101,7.4 2021-11-15 08:16:21,968 DEBUG emoncmsorg Buffer size: 3 2021-11-15 08:16:31,881 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB1', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False) 2021-11-15 08:16:31,884 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,884 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,885 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,885 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,885 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,885 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,886 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,886 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,886 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,886 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,887 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'byte': b'\t', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139803383289600)>} 2021-11-15 08:16:31,887 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial

It carried on with that for several minutes, I had to kill the process to stop it, the usual 'restart' wouldn't cut it. I'll start it again now - btw I set the poll interval back to 10 for this test.

UPDATE: It's just done it again. Pretty much the same data so I won't post it up (although am happy to if you want). As it crashes completely with this I'll stop it for now but am more than happy to carry on with further alterations/tests as you need.

bwduncan commented 3 years ago

I think the code I showed has a bug. Please can you remove the first line which says frame = frame.tb_next and try again?

Yes I would suggest trying it with python3. It's even harder for me to test with python2, which was not supported past January the 1st, 2020. It should just be a simple case of replacing "python" with "python3" in the ExecStart= line in the emonhub.service file.

It looks like the exception handler is getting stuck in one frame, which I don't understand. I tested it in python3, so maybe that will help. If not, you could try putting the same debugging code around the line which was failing before. At least then we'll get the useful stack frame again and again!

The poll interval shouldn't really make any difference. All it does is prevent trying to read from the serial device for that length of time. Once that time is up, it will try to consume all the data. If it polls more frequently than the device delivers data, it will just report a "Read error".

eroyee commented 3 years ago

'k, thanks Bruce, I've done that now and will report back.

Got a couple of checksum errors within a minute of starting, it's yet to do anything further.

I see emonhub.py has the python3 shebang at the file start so it should be using python3? I tried calling python3 directly from the service file but it complained (yes I did reload the daemon). I'd set the system default to 3 except I have some legacy code that would be a pain to change...

Ok on the poll interval. I more or less saw what it was doing in the code but I thought I'd mention the possibility of it having an effect. It was something not done in karioja's code (which has been running again without error for the past ~three hours) and so I supposed there could have been some weird thing going on that this might have helped point to. Anyway it's not a clear division between ok/not ok, so it may well be a red herring.

On 15/11/21 12:24 pm, bwduncan wrote:

I think the code I showed has a bug. Please can you remove the first line which says |frame = frame.tb_next| and try again?

Yes I would suggest trying it with python3. It's even harder for me to test with python2, which was not supported past January the 1st, 2020. It should just be a simple case of replacing "python" with "python3" in the ExecStart= line in the emonhub.service file.

It looks like the exception handler is getting stuck in one frame, which I don't understand. I tested it in python3, so maybe that will help. If not, you could try putting the same debugging code around the line which was failing before. At least then we'll get the useful stack frame again and again!

The poll interval shouldn't really make any difference. All it does is prevent trying to read from the serial device for that length of time. Once that time is up, it will try to consume all the data. If it polls more frequently than the device delivers data, it will just report a "Read error".

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/openenergymonitor/emonhub/issues/160#issuecomment-968384860, or unsubscribe https://github.com/notifications/unsubscribe-auth/AG6JE47OM3OPLNZLNIRGFS3UMBALRANCNFSM5DXB6GWA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

eroyee commented 3 years ago

@bwduncan I'd omitted to turn on debug logging again, sigh, here it goes :-)

2021-11-15 13:13:48,567 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB1', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, ds rdtr=False) 2021-11-15 13:13:48,572 DEBUG VEDirect 43 NEW FRAME : {'PID': '0xA057', 'FW': '154', 'SER#': 'HQ2035I1QX1', 'V': '13500', 'I': '12400', 'VPV': '38200', 'PPV': '171', 'CS': '5', 'MPPT': '1', 'OR': '0x00000000', 'ERR': ' 0', 'LOAD': 'OFF', 'H19': '15409', 'H20': '91', 'H21': '486', 'H22': '97', 'H23': '500', 'HSDS': '197'} 2021-11-15 13:13:48,572 DEBUG VEDirect 43 Timestamp : 1636935228.572169 2021-11-15 13:13:48,572 DEBUG VEDirect 43 From Node : 9 2021-11-15 13:13:48,573 DEBUG VEDirect 43 Values : [13.5, 38.2, 171, 12.4] 2021-11-15 13:13:48,573 DEBUG VEDirect 43 Sent to channel(start)' : ToEmonCMS 2021-11-15 13:13:48,573 DEBUG VEDirect 43 Sent to channel(end)' : ToEmonCMS 2021-11-15 13:13:48,734 DEBUG MQTT Publishing: emon/SmartSolarMPPT/V 13.5 2021-11-15 13:13:48,735 DEBUG MQTT Publishing: emon/SmartSolarMPPT/VPV 38.2 2021-11-15 13:13:48,735 DEBUG MQTT Publishing: emon/SmartSolarMPPT/PPV 171 2021-11-15 13:13:48,736 DEBUG MQTT Publishing: emon/SmartSolarMPPT/I 12.4 2021-11-15 13:13:48,736 INFO MQTT Publishing 'node' formatted msg 2021-11-15 13:13:48,736 DEBUG MQTT Publishing: emonhub/rx/9/values 13.5,38.2,171,12.4 2021-11-15 13:13:58,593 DEBUG VEDirect Starting Serial read from Serial(port='/dev/ttyUSB1', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, ds rdtr=False) 2021-11-15 13:13:58,598 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x 7f487be71b08>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139949275346688)>} 2021-11-15 13:13:58,599 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139949275346688)>} 2021-11-15 13:13:58,599 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x 7f487beb5c08>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139949275346688)>} 2021-11-15 13:13:58,600 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139949275346688)>} 2021-11-15 13:13:58,600 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x 7f487be71bc8>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 139949275346688)>}

bwduncan commented 3 years ago

Depending on how it's started it might not use the shebang. Can you check by looking at the output of ps aux? It should show whether it's using python or python3. You could also start it manually so that we can be sure it's using python3.

Is the legacy code part of emonhub? If it's open source I'd be happy to help port it, although I admit you might be wary of my offer given this bug :rofl:

I'm not dismissing it as a possible effect. We could try setting it to extreme values, like 0, or 500, and see if it really is affecting anything. Yes Karioja is just using the pyserial module to wait until there is data to be read. We have other tasks to perform so we can't wait forever. Actually we should be setting a timeout on that read call, but that's for another day...

I have been looking at the two implementations and trying to see where they have diverged, and why it might be important. Will report back!

bwduncan commented 3 years ago

Argh remote debugging is hard!

Can you put a print(self.__dict__) line in there? I need to see what the state of that object is.

eroyee commented 3 years ago

Hah! Great minds 'n all that - I started it from the cli directly with python3 a minute or so ago....

.... and it's just done it again:

2021-11-15 13:26:42,989 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf0a93e88>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:42,989 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,005 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf0a50888>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,005 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,005 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf09f5dc8>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,005 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,005 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf09f5f88>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,005 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,006 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf0a93e88>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,006 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,021 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf0a50888>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,021 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,021 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf09f5dc8>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,021 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,021 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fcaf09f5f88>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>} 2021-11-15 13:26:43,021 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140509572482816)>}

Thanks for the offer re the other code, but it's not emonhub, and btw I'm certainly not dissing you for this bug! Oh my, you should see my pitiful efforts and what havoc they wreak!

If it helps I'm happy to call you wherever you are (that is as long as my VOIP service costs are reasonable to your country :-) and possibly just give you ssh access to the machine, or follow your suggestions in realtime. It might be easier than messaging?

On 15/11/21 1:25 pm, bwduncan wrote:

Depending on how it's started it might not use the shebang. Can you check by looking at the output of |ps aux|? It should show whether it's using |python| or |python3|. You could also start it manually so that we can be sure it's using python3.

Is the legacy code part of emonhub? If it's open source I'd be happy to help port it, although I admit you might be wary of my offer given this bug 🤣

I'm not dismissing it as a possible effect. We could try setting it to extreme values, like 0, or 500, and see if it really is affecting anything. Yes Karioja is just using the pyserial module to wait until there is data to be read. We have other tasks to perform so we can't wait forever. Actually we should be setting a timeout on that |read| call, but that's for another day...

I have been looking at the two implementations and trying to see where they have diverged, and why it might be important. Will report back!

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/openenergymonitor/emonhub/issues/160#issuecomment-968401143, or unsubscribe https://github.com/notifications/unsubscribe-auth/AG6JE47PGVMVFEZXTQFPEGTUMBHQFANCNFSM5DXB6GWA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

eroyee commented 3 years ago

Sure, sorry if it's obvious but whereabouts exactly?

Also see my prev comment re call...

On 15/11/21 1:26 pm, bwduncan wrote:

Argh remote debugging is hard!

Can you put a |print(self.dict)| line in there? I need to see what the state of that object is.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/openenergymonitor/emonhub/issues/160#issuecomment-968401368, or unsubscribe https://github.com/notifications/unsubscribe-auth/AG6JE47T3PHIY7SCZEB6ZHLUMBHTTANCNFSM5DXB6GWA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

bwduncan commented 3 years ago

Oh sorry actually it's more involved than that. Inside that while block, can you add before the frame = frame.tb_next

self._log.debug("%s", self.__dict__)

And yeah I'd be happy to call, if we can exchange numbers in a less public forum, but it's nearly 1am now and it's a school night :smile: I should be free tomorrow evening after work (1900-0000 UTC as it stands hopefully).

eroyee commented 3 years ago

Ok, I'll do that shortly.

I can make whatever time work tomorrow. I think I've sent you an email to your work address - if it's correct we may (amusingly) have trouble understanding each other, but let's give it a go anyway ;-)

On 15/11/21 1:43 pm, bwduncan wrote:

Oh sorry actually it's more involved than that. Inside that |while| block, can you add before the |frame = frame.tb_next|

self._log.debug("%s",self.dict)

And yeah I'd be happy to call, if we can exchange numbers in a less public forum, but it's nearly 1am now and it's a school night 😄 I should be free tomorrow evening after work (1900-0000 UTC as it stands hopefully).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/openenergymonitor/emonhub/issues/160#issuecomment-968406728, or unsubscribe https://github.com/notifications/unsubscribe-auth/AG6JE465WZ7AVGWJHKGCBILUMBJS7ANCNFSM5DXB6GWA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

eroyee commented 3 years ago

@bwduncan When it goes it fair fills up the log file, and then rotates real quick, so I keep losing the start - however here's an idea of what we get:

2021-11-15 16:21:49,742 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read()) {'formattedTb': [' File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input\n self.dict[self.key.decode()] = self.value.decode()\n'], 'frame': <traceback object at 0x7fa7f2704188>, 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib/python3.6/traceback.py'>, 'packet': None, 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140359312283392)>} 2021-11-15 16:21:49,744 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 70, in input self.dict[self.key.decode()] = self.value.decode() {'byte': b'\r', 'self': <EmonHubVEDirectInterfacer(VEDirect, started 140359312283392)>} 2021-11-15 16:21:50,647 DEBUG VEDirect {'_log': <Logger EmonHub (DEBUG)>, '_target': None, '_name': 'VEDirect', '_args': (), '_kwargs': {}, '_daemonic': False, '_ident': 140359312283392, '_tstate_lock': <locked _thread.lock object a t 0x7fa7f3fc2b20>, '_started': <threading.Event object at 0x7fa7f3f45048>, '_is_stopped': False, '_initialized': True, '_stderr': <_io.TextIOWrapper name='' mode='w' encoding='UTF-8'>, '_defaults': {'pause': 'off', 'interval': 0, 'datacode': '0', 'scale': '1', 'timestamped': False, 'targeted': False, 'nodeoffset': '0', 'pubchannels': [], 'subchannels': [], 'batchsize': '1'}, 'init_settings': {'com_port': '/dev/ttyUSB1', 'com_baud': '19200', 'toextract': ['V', 'V PV', 'PPV', 'I'], 'poll_interval': '10'}, '_settings': {'pause': 'off', 'interval': 0, 'datacode': '0', 'scale': '1', 'timestamped': False, 'targeted': False, 'nodeoffset': '9', 'pubchannels': ['ToEmonCMS'], 'subchannels': ['ToSmartSolar MPPT'], 'batchsize': '1'}, '_sub_channels': {}, '_pub_channels': {'ToEmonCMS': []}, '_interval_timestamp': 0, 'buffer': <emonhub_buffer.InMemoryBuffer object at 0x7fa7f3f450b8>, '_item_limit': 1000, 'stop': True, '_ser': Serial<id=0x7fa7 f3f450f0, open=True>(port='/dev/ttyUSB1', baudrate=19200, bytesize=8, parity='N', stopbits=1, timeout=10, xonxoff=False, rtscts=False, dsrdtr=False), 'header1': b'\r', 'header2': b'\n', 'delimiter': b'\t', 'key': bytearray(b'\x00VPVPIDFW SER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWS ER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSE R#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER

VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER

VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#V IVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VI VPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIV PVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksumPIDFWSER#VIVPVPPVCSMPPTORERRLOADH19H20H21H22H23HSDSChecksum'), 'value': bytearray(b'3258\x940xA057154HQ2035I1QX113360920031210125320x000000000OFF1545313548697500197\x9a0xA057154HQ2035I1QX113360890032980125320x000000000OFF1545313548697500197\x850xA057154HQ2035I1QX113370910032070125320x000000000OFF154531354869 7500197\x950xA057154HQ2035I1QX113370910030320125320x000000000OFF1545313548697500197\x990xA057154HQ2035I1QX113360920031970126320x000000000OFF1545313548697500197\x8c0xA057154HQ2035I1QX113360930031090126320x000000000OFF1545313548697500197\x 920xA057154HQ2035I1QX113370920031060126320x000000000OFF1545313548697500197\x950xA057154HQ2035I1QX113370930031860127320x000000000OFF1545313548697500197\x8b0xA057154HQ2035I1QX113360930031840127320x000000000OFF1545313548697500197\x8e0xA0571 54HQ2035I1QX113370940031030127320x000000000OFF1545313548697500197\x950xA057154HQ2035I1QX113370920032680127320x000000000OFF1545313548697500197\x8b0xA057154HQ2035I1QX113380930031040127320x000000000OFF1545313548697500197\x940xA057154HQ2035I 1QX113370930031770128320x000000000OFF1545313548697500197\x8a0xA057154HQ2035I1QX113370930031810128320x000000000OFF1545313548697500197\x8f0xA057154HQ2035I1QX113360940031010128320x000000000OFF1545313548697500197\x970xA057154HQ2035I1QX113380 920032590128320x000000000OFF1545313548697500197\x890xA057154HQ2035I1QX113380940031030128320x000000000OFF1545313548697500197\x930xA057154HQ2035I1QX113380930030150128320x000000000OFF1545313548697500197\x920xA057154HQ2035I1QX113360940031670 128320x000000000OFF1545313548697500197\x8b0xA057154HQ2035I1QX113370930032530128320x000000000OFF1545313548697500197\x8f0xA057154HQ2035I1QX113370940030870129320x000000000OFF1545313548697500197\x880xA057154HQ2035I1QX113380940031670129320x00 0000000OFF1545313548697500197\x880xA057154HQ2035I1QX113370930030000129320x000000000OFF1545313548697500197\x980xA057154HQ2035I1QX113370940031530129320x000000000OFF1545313548697500197\x8e0xA057154HQ2035I1QX113370940032370129320x000000000OF F1545313548697500197\x8b0xA057154HQ2035I1QX113380950030760130320x000000000OFF1545313548697500197\x900xA057154HQ2035I1QX113360930030000129320x000000000OFF1545313548697500197\x990xA057154HQ2035I1QX113360930033010129320x000000000OFF15453135 48697500197\x950xA057154HQ2035I1QX113370950032210129320x000000000OFF1545313548697500197\x910xA057154HQ2035I1QX113370940030670130320x000000000OFF1545313548697500197\x920xA057154HQ2035I1QX113370950032160130320x000000000OFF15453135486975001 97\x95:A4F10000100000000005D3C00005D3C0000A31196051EE304FFFFFFFFFFFFFFFFFFFFFFFFFF72\n:A5010000087000000FFFFFFFF8C0520050000000000C20078002301E60100004F012710C50021\n0xA057154HQ2035I1QX113360930033000130320x000000000OFF154531354869750019 7\x9e'), 'bytes_sum': 316214, 'state': 0, 'dict': {'PID': '0xA057', 'FW': '154', 'SER#': 'HQ2035I1QX1', 'V': '13480', 'I': '13000', 'VPV': '\x0032940', 'PPV': '180', 'CS': '5', 'MPPT': '1', 'OR': '0x00000000', 'ERR': '0', 'LOAD': 'OFF', '\x00H19': '15452', 'H20': '134', 'H21': '486', 'H22': '97', 'H23': '500', 'HSDS': '197', 'Checks\x00um': 'c'}, '_extract': ['V', 'VPV', 'PPV', 'I'], 'poll_interval': 10.0, 'last_read': 1636946478.085835} 2021-11-15 16:21:50,648 DEBUG VEDirect File "/opt/openenergymonitor/emonhub/src/interfacers/EmonHubVEDirectInterfacer.py", line 137, in _read_serial packet = self.input(self._ser.read())

I've also noted that the checksum errors tend to occur every 3mins or so, reasonably regularly...

eroyee commented 3 years ago

@bwduncan Just sent an email to what I think is your personal address. I mentioned 'this evening', by which I mean your evening (the 1900-0000 UTC you mentioned), not my evening ;-)

If you didn't get that then I've managed to confuse someone else, and we'll just have to figure a suitably private way of exchanging contact details...

bwduncan commented 3 years ago

It looks like the data has extra NUL bytes in it. Near the end of the object dump you will see that dict contains a 'Checks\x00um' key, with value 'c'. It looks like the value of 'VPV' and the key '\x00H19' also start with unexpected zero bytes.

Presumably the reason that Karioja's code doesn't fail is that it doesn't try to make sense of the data. It's just taking the data as-is and putting it into a string. If you print strings with nul bytes in them in Python you won't see them, but obviously if you try to get the data, or compare to a known string, it will fail: For example 'Check\x00sum' != 'Checksum' but print('Check\x00sum') prints (what looks like) "Checksum".

This https://stackoverflow.com/a/27477565/567655 gives some reasons why these bytes might be received. Maybe the serial line is noisy, too long, damp, ...

We should also be parsing the : ("HEX") lines. That would help with some checksum errors. I'll prepare a patch.

eroyee commented 3 years ago

I'm fairly sure the serial line should be ok - it's direct to the machine and ~3M, so well within limits (and I'd tried several cables just to be sure). Still, not impossible.

Good idea re the hex data though, I had wondered if that might be messing with things a bit.

acoleman-apc commented 2 years ago

I recently upgraded and ran into this issue. While not a great solution, I ended up just stripping out the non UTF-8 chars and letting the parsing fail. It does succeed periodically (90-120 seconds) so I get a data feed.

Example code: self.dict[self.key.decode()] = self.value.decode('utf-8').strip()

This was this previously: self.dict[self.key.decode()] = self.value.decode()

bwduncan commented 2 years ago

Hey,

Thanks for your feedback. Unfortunately that's not what the strip method does. strip removes leading and trailing whitespace from a string, it will not remove invalid UTF8 characters. Also, 'utf-8' is the default for decode, so specifying it explicitly doesn't help.

Bruce

acoleman-apc commented 2 years ago

I spoke too soon! My errors have appeared again so please ignore me.

eroyee commented 2 years ago

Anthony,

@bwduncan and I spent quite a session on this one night. We, or at least Bruce, was able to come up with a work-around that meant one was able to operate Openenergymonitor with a VEDirect device, but it was probably not completely suitable as a final solution.

In order to make it work you need to use the requisite library from https://github.com/karioja/vedirect with modified interfacer code. If @bwduncan is ok with it I'm happy to pass on the details to you, although I think you'd need to consider it as 'work-in-progress'. An alternative option is just to use karioja's code and publish to MQTT, which I successfully used for a while.

All of this happened a few months ago and I've not really done anything since, however FYI I came to the view that the probable issue was due to the nature of the VEDirect data stream and the start of the parsing algorithm in use here. My recollection is a bit wobbly now but I think that every second or so Victron device sends out HEX data, usually followed by a ln, then the ASCII data that's parsed for sending on via Emonhub.

In terms of the issue here I think that the issue is around using the ln as the start marker - it may be more reliable to look for the end of the HEX data as the effective start of the ASCII data stream. However please bear in mind this is simply my surmise, albeit after quite some time messing with it, and I mention it simply in case you decide to take a look at what's going on...

acoleman-apc commented 2 years ago

Thanks for the reply. So far, I'm very disappointed with the upgrade. My memory card wouldn't upgrade in place so I had to manually back up and upgrade the emon pi to the latest version and then I had to re-write the Renogy interfacer to get that working and now my VE Direct interface no longer works - meaning that the emon pi no longer serves its purpose. So I'm going to look for a work around or try to code one but I'm inclined to replace the Renogy charge controller with a victron unit and just move away from emon pi in favor of the victron platform as I didn't really see any improvements in emon pi after the upgrade.

acoleman-apc commented 2 years ago

I've changed my code to this (and have started testing - fingers crossed): self.dict[self.key.decode()] = self.value.decode(encoding="utf-8", errors="replace")

bwduncan commented 2 years ago

self.dict[self.key.decode()] = self.value.decode(encoding="utf-8", errors="replace")

Yes that will avoid the topmost exception, although you probably need errors="replace" for the key, too, and again utf-8 is the default, but I don't think it will give you any more data. The data is still garbage, so it will fail the checksum. You could patch that out as well, but if the data isn't valid UTF-8 (it's 7-bit ASCII, except the checksum byte, so if it's not UTF-8 then it is really garbage) then what is it? From memory (months ago) of what @eroyee and I found was that zero bytes were being randomly dropped into the stream, which didn't make any sense. I think this has always been the problem, but the previous code just silently ignored the error. Was that better? I would argue not. Clearly we're all missing something.

I will be interested to see your results though.

You may have had to re-write your interfacer to get it working, but nobody else had made any effort to port all this to Python 3, and as I think I made fairly clear at the time, I don't have the hardware to test, so I'm relying on the community to provide feedback. I'm not being paid to do this, it is "fun". If you want to move to a different platform, that's up to you.

acoleman-apc commented 2 years ago

Moving to latin encoding with error ignoring: if self.key.decode(encoding="latin-1", errors="replace") == 'Checksum': self.dict[self.key.decode(encoding="latin-1", errors="replace")] = self.value.decode(encoding="latin-1", errors="replace")

Seems to do much better. Although it isn't 100% as the serial connection (based on my debugging) doesn't always pop in at the proper part of the frame - thus causing some data to get ignored - it does seem to give me rather consistent data feed so I'll continue to monitor.

eroyee commented 2 years ago

Anthony,

Thanks for the update, Bruce thanks for the reminder.

Just to comment on your earlier post. I too could have used another platform, however I liked that OpenEnergyMonitor is... open, whereas my recollection of the Victron ecosystem is that it's proprietary? At least with OEM one has the option to use other equipment, and improve/adapt as needed - and if we can help others along the way that's a plus, in my view, thus thanks for persevering here 👍

So, Bruce is right in that there may be something else going on we've not yet put our finger on completely. It would be good to determine that and correct it if possible, but the meantime the pragmatist says "just it get it working as much as possible", particularly since it appears to effectively not work as it is at present. To that end if your mods work to assist the latter it's good to hear as it's probably simpler than where Bruce & I ended up.

However to begin addressing the real issue a couple of things are of interest; (1) as I understand it you were previously using exactly the same hardware with no apparent issue - the only change being in software? (2) If 1, do you have any log files from the previous iteration of the VEDirectInterfacer? I ask 2, just in case there were a few unidentified hiccups that could have been occurring, but not enough to notice in use. I suppose that could assist in determining if the issue has transferred over from previous code, but is now more apparent, or it previous code didn't exhibit the issue at all (in which case, why not).

FYI in my case I didn't have a previous version running, I simply created a new instance in order to provide some plots from my inverter & MPPT etc, ran across this issue and did as much as I was able to assist in the fix (Bruce is really the guru here, not me!) - hence your prior experience is of interest. At one point I did seriously begin to wonder if it was just my setup but I later installed it to a Pi where it did the same, and with your info now it does seem it's a real 'thing'.

Anyway, in case I can help further I've got the Pi out again and will connect it up to confirm where it's at. If nothing else I can then try your mods to see what they do here. I can test a Victron MPPT, Inverter, and Monitor, although my recollection is that it didn't seem to make any difference to the issue which was connected.

eroyee commented 2 years ago

It's been a while since I dealt with this so had to find that Pi, re-learn how it all worked, restore the complete mess of modified conf files with something more like it should be, stop Influx/Grafana (which at one time I'd been playing with as an alternative) but I've now got it back up and running...

As a result I remember some things it would do. With the original code in place, often it'd work ok for several minutes with the occasional 'Invalid checksum' in the log. I don't recall there being a particular pattern to this, but it would eventually crash completely after a series of these errors.

Anthony, I've modified EmonHubVEDirectInterfacer.py with your suggestions. ATM I'm getting the errors at seemingly random intervals, but I doubt it's been going long enough to determine whether it'll fully crash, or not - IIRC it could take 15-20mins for that. Presently it's been running for ~12m with approx 14 errors recorded...

acoleman-apc commented 2 years ago

The communication protocol should be the same but FYI I'm using the BMV 712 shunt, not the MPPT or other devices from Victron.

With the older version, I was able to have an intermittent but stable feed of data - no crashing. I do recall having crashing issues and having to modify it - I even had an issue that is now closed.

With my recent changes (latin encoding w/ error suppression), I'm going on about 20 hours with a decent feed of data. There are periodic errors but no crashing and the data seems to match the victron data (app & display).

I'm not opposed to the open nature of the project - I wrote the Renogy interfacer for my charge controller and shared it. I'm just frustrated with the python 3.x upgrade. This is the first version of python that isn't backwards compatible and I haven't seen many benefits to the upgrade in the Emon system, at least not for me personally. Anyway, thanks for the code contributions and helping to build the community.

eroyee commented 2 years ago

Agree re the protocol, albeit it's useful to be able to test a few different inputs just in case that makes a difference.

Good to hear your changes have success of sorts. I've taken mine back to how it was (using the kairoja module) because I felt I was getting too many errors following changes per your code, after which it crashed. However because this is a somewhat modified installation, and it's now some months since I last looked at it, I want to revisit this in case there was some other reason for the failure.

Good on you for sharing the Renogy interfacer. Understand re Python 3, at the risk of bringing the world down upon my head I'd have to say my preference remains 2.7 if there's something I want to do...

acoleman-apc commented 2 years ago

So I've replaced the renogy rover 100 amp unit with a victron smart solar 150v 100 amp unit and having errors using the VEDirect interfacer to read data from the MPPT. Any chance we can get victron to look at this with us?

acoleman-apc commented 2 years ago

from the log: Invalid checksum for data: {':ADBED001A0D5C\n:A0202000200000045\n\r\nERR': '0', 'LOAD': 'ON', 'Relay': 'OFF', 'H19': '1652', 'H20': '766', 'H21': '2852', 'H22': '886', 'H23': '2864', 'HSDS': '1'}

acoleman-apc commented 2 years ago

From page 2 of the VE.Direct-Protocol-3.32.pdf: "On power up, a VE.Direct interface will always be in Text-mode, and continuously transmits all run-time fields. As soon as it receives a valid HEX-message, it will switch to HEX-mode. It will stay in HEX-mode as long as HEX-messages are frequently received. After a product has not received any valid HEX-messages for several seconds, it will switch back to Text-mode and start to auto transmit the run-time fields periodically again. Some products will send Asynchronous HEX-messages, starting with “:A” and ending with a newline ‘\n’, on their own. These messages can interrupt a regular Text-mode frame."

eroyee commented 2 years ago

I doubt Victron would be interested, but you never know, might be worth sending them a message with a reference to this issue?

Unfortunately I've set the Pi aside again and would need to go relive the process in order to remind myself of what I used to get (other than what I showed above) however it may be that my unit doesn't send the Async hex message anyway - is there anything more specific on this than 'some products'?

Otherwise I'm not clear on whether you're just logging the issue yet the system continues, or whether it all grinds to a halt?

acoleman-apc commented 2 years ago

The data feed continues on the wire but the code only randomly publishes to the emon CMS feed as it has the check sum errors and discards the packets - most likely due to the hex messages appearing in the string. Since the victron cerbo gx can run on a pi with the same hardware, I think it is a bug / problem with the interfacer. When I get some free time, I'm going to look into updating the code to detect and remove the hex message from the string before calc of check sum.

However, there is another problem with the code. With the MPPT not all of the data comes across in one frame - it is divided up into multiple frames. And the code in the current interfacer doesn't seem to handle this properly as it publishes to the wrong feeds when this happens. So for instance H20 (solar power generated today) comes across in the 2nd frame but won't ever get pushed to H20 feed as far as I can tell.

TLDR: the victron interfacer has major problems on the blue solar mppt units and minor problems on the BMV712 (I have both). The renogy interfacer was rock solid though so I miss the data feed even though the victron units are better quality and produce more power for me.

eroyee commented 2 years ago

Ok, I'm sure people would appreciate it if you could fix this 👍

Unfortunately Bruce doesn't actually have any Victron gear, and I'm fairly sure mine doesn't send the async data (and I've re-utilised the Pi I was using for something else), which makes it difficult to assist. However I can provide serial data from a Smart Solar MPPT, a Phoenix Inverter, and a BMV-702 if/when you need it - although it looks as if I've already presented some data from the MPPT above.

Otherwise I don't recall having any issue with the feeds. I think at one point I was feeding two Victron devices and my own battery monitor (which, immodestly, I think is somewhat better than the Victron unit :) and, aside from the issue described here, the system ran quite reliably for weeks.

I agree the Victron units are quality gear, and I'm very happy with their performance (much less so with the country importer - which, sadly, is why I can't recommend them here). However why manufacturers find it necessary to utilise weird protocols for something simple like this I can never figure out - I've been dealing with such hardware for many years now and still there seems to be no common (and open) format across manufacturers. At least in terms of user control and data interchange this isn't anything secret so really, why!? Grrr.