letscontrolit / ESPEasy

Easy MultiSensor device based on ESP8266/ESP32
http://www.espeasy.com

P078 EASTRON SDM120C receiving NAN with SoftSerial #2283

Closed D-t-M closed 5 years ago

D-t-M commented 5 years ago

Hello,

I have problems getting readings from my SDM120C Modbus meter. I don't know whether this is a bug, whether I am missing information, or whether my setup is wrong.

I tried to fill out the checklist below to provide all required information.

Thank you for your support in advance! Best regards, D-t-M

Checklist

I have...

Steps already tried...

Summary of the problem/feature request

I am sporadically getting nan readings for different values when using the P078 EASTRON SDM120C plugin, which is connected via SoftSerial to my D1 Mini board (pins D6, D7). Tested with 3 different NodeMCU boards, 2 different RS485 interfaces, and 4 different power supplies, even in combination with additional 100 nF capacitors on both the 3.3 V and 5 V lines.

Expected behavior

I expect to get stable readings (no nan), or perhaps it is possible to use the hardware serial instead.

Actual behavior

I am sporadically getting nan readings for different values when using the P078 EASTRON SDM120C plugin, which is connected via SoftSerial to my D1 Mini board (pins D6, D7). Tested with 3 different NodeMCU boards, 2 different RS485 interfaces, and 4 different power supplies, even in combination with additional 100 nF capacitors on both the 3.3 V and 5 V lines. Modbus termination (120 Ohm) does not affect the behaviour; the cable length is 4 m.

Steps to reproduce

  1. Configure the P078 plugin with SoftSerial (D6, D7), automatic flow control, and an interval of 10 or 20 seconds
  2. Check the status in the Devices tab or in the log

System configuration

Hardware:

ESP Easy version: ESP_Easy_mega-20190202_test_core_250_beta_ESP8266_4096_VCC.bin (tested with core 2.4.2 build as well, no effect)

ESP Easy settings/screenshots:

Rules or log data

89600887: EASTRON: (1,0) Voltage (V): 224.90
89601048: EASTRON: (1,0) Frequency (Hz): 50.00
89601215: EASTRON: (1,0) Power Factor (cos-phi): -0.05
89601378: EASTRON: (1,0) Current (A): 0.06
89601382: EVENT: PV#V=224.90
89601417: EVENT: PV#Hz=50.00
89601423: EVENT: PV#W=-0.05
89601429: EVENT: PV#Wh=0.06
89610900: EASTRON: (1,0) Voltage (V): 224.60
89611059: EASTRON: (1,0) Frequency (Hz): 49.95
89611225: EASTRON: (1,0) Power Factor (cos-phi): -0.05
89611763: EASTRON: (1,0) Current (A): nan
89611768: EVENT: PV#V=224.60
89611806: EVENT: PV#Hz=49.95
89611816: EVENT: PV#W=-0.05
89611822: EVENT: PV#Wh=nan
89620878: EASTRON: (1,0) Voltage (V): 224.50
89621041: EASTRON: (1,0) Frequency (Hz): 50.00
89621206: EASTRON: (1,0) Power Factor (cos-phi): -0.04
89621370: EASTRON: (1,0) Current (A): 0.06
89621375: EVENT: PV#V=224.50
89621413: EVENT: PV#Hz=50.00
89621421: EVENT: PV#W=-0.04
89621428: EVENT: PV#Wh=0.06
89625552: EVENT: Clock#Time=Mon,21:21
89625944: WD : Uptime 1494 ConnectFailures 0 FreeMem 11080 WiFiStatus 3
89630889: EASTRON: (1,0) Voltage (V): 224.70
89631052: EASTRON: (1,0) Frequency (Hz): 50.00
89631218: EASTRON: (1,0) Power Factor (cos-phi): -0.04
89631755: EASTRON: (1,0) Current (A): nan
89631760: EVENT: PV#V=224.70
89631797: EVENT: PV#Hz=50.00
89631806: EVENT: PV#W=-0.04
89631813: EVENT: PV#Wh=nan
89641259: EASTRON: (1,0) Voltage (V): nan
89641422: EASTRON: (1,0) Frequency (Hz): 50.00
89641588: EASTRON: (1,0) Power Factor (cos-phi): -0.04
89641752: EASTRON: (1,0) Current (A): 0.06
89641756: EVENT: PV#V=nan
89641794: EVENT: PV#Hz=50.00
89641803: EVENT: PV#W=-0.04
89641809: EVENT: PV#Wh=0.06
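A quick way to quantify how often each value drops out is to tally the EASTRON entries of such a log. The following is a minimal Python sketch and not part of ESPEasy; the regular expression and the `tally_nan` helper are assumptions based only on the log format shown above, and the sample text is a small excerpt of that log.

```python
import re

# Matches e.g. "89611763: EASTRON: (1,0) Current (A): nan"
# (pattern is an assumption derived from the log excerpt, not an ESPEasy spec).
EASTRON_ENTRY = re.compile(
    r"(\d+): EASTRON: \(\d+,\d+\) ([^()]+?) \([^()]*\): (\S+)"
)

def tally_nan(log_text: str) -> dict:
    """Return {measurement name: (total readings, nan readings)}."""
    counts = {}
    for _timestamp, name, value in EASTRON_ENTRY.findall(log_text):
        total, nans = counts.get(name, (0, 0))
        counts[name] = (total + 1, nans + int(value.lower() == "nan"))
    return counts

# Excerpt of the log above; works on one-entry-per-line or run-together text.
SAMPLE = (
    "89610900: EASTRON: (1,0) Voltage (V): 224.60 "
    "89611059: EASTRON: (1,0) Frequency (Hz): 49.95 "
    "89611225: EASTRON: (1,0) Power Factor (cos-phi): -0.05 "
    "89611763: EASTRON: (1,0) Current (A): nan "
    "89611768: EVENT: PV#V=224.60 "
    "89641259: EASTRON: (1,0) Voltage (V): nan"
)

if __name__ == "__main__":
    for name, (total, nans) in tally_nan(SAMPLE).items():
        print(f"{name}: {nans} of {total} readings were nan")
```

EVENT and WD entries are ignored on purpose, since they only repeat the values already reported by the plugin.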

Screenshot from Config:

<img width="937" alt="device-config2" src="https://user-images.githubusercontent.com/43365590/52235934-bdd48280-28c5-11e9-8374-6430a1add2bd.PNG">
<img width="492" alt="device-config" src="https://user-images.githubusercontent.com/43365590/52235937-bdd48280-28c5-11e9-9be8-c917bd20f88e.PNG">
<img width="950" alt="devices" src="https://user-images.githubusercontent.com/43365590/52235938-be6d1900-28c5-11e9-8c38-6efc2299cbcf.PNG">
TD-er commented 5 years ago

I need:

That's the absolute minimum to be able to do something with this issue.

Edit: This reply of mine was a bit premature, since the first post was still being edited while I posted.

D-t-M commented 5 years ago

Sorry for the empty first post. I hit the wrong button and the still-empty post went online. I have tried to fill in more information using the edit function!

D-t-M commented 5 years ago

Hello again,

OK, now all the information should be online. Sorry again for the inconvenience!

I additionally tried using the hardware serial (RX/TX) lines, which produced only a single reading of the first value and then stopped altogether. I did not even see any flickering of the TX LED on my RS485 interface. But this may be a different topic.

Best regards, D-t-M

TD-er commented 5 years ago

I am adding CRC pass/fail notes to other plugins using serial communications too, to get an idea of how often those fail. For the GPS plugin it is nearly impossible to use software serial because of the number of failed messages. Those failure rates are between 50 and 90%, but those messages are longer and the GPS is quite chatty. When using HW serial, the number of lines read with CRC failures is near zero. For example, one of my test nodes has handled 5 million lines with 14 CRC errors using HW serial.

Can you use HW serial using GPIO 13 and 15? (That's "Serial0 swapped") Then you have to disable the serial log to make it work.

D-t-M commented 5 years ago

I tried to use the hardware serial with the swapped pins too, but when rebooting, the node got stuck because there is a pull-up on one of the lines of my RS485 interface.

But maybe I can try connecting the RS485 interface to GPIO 13 and 15 after booting for testing... give me a few minutes, please.

TD-er commented 5 years ago

Hmm, that's right, GPIO15 is a bit tricky at boot.

TD-er commented 5 years ago

If that works, then you could try to add some circuitry that only connects the RX of the Modbus adapter once boot has finished. I'm afraid that will cost you another GPIO pin, but it may be worth it. Another idea could be to extend the ESPeasySerial wrapper to allow a mix of SW serial and HW serial. Then receiving can be done on HW serial and sending on SW serial (or even on Serial1, which can only use a TX pin).

D-t-M commented 5 years ago

OK, I hooked the RS485 interface up to GPIO13 and GPIO15 and reconfigured the device, but I don't get any transmission at all (neither the RX nor the TX LED lights up, all values NAN).

I have already disabled the "serial output" in the Advanced options so that it does not interfere with the Modbus communication.

Somehow it looks like the hardware serial is not working at all in the plugin. It does work for debugging with a USB cable and a terminal running on my laptop.

TD-er commented 5 years ago

Are you sure TX and RX are connected the right way around? GPIO15 should be connected to the RX of the sensor/RS485 adapter. You may also want to have a look at the pull-up configuration. See also our documentation, "GPIO - best pins to use on ESP8266". GPIO15 is a bit strange, since it may be actively pulled down on your board.

You can also try the normal Serial0 pins (GPIO 1 & 3) which do not have this strange config. But then the sensor may see some boot logs and there is no guarantee what will be set/changed on the connected Modbus slave or sensor.

D-t-M commented 5 years ago

Yes, RX of the RS485 interface is connected to GPIO15, and TX of the RS485 interface is connected to GPIO13. The LEDs on the interface show no activity on either line.

When using the RX/TX lines, even if there is some traffic during booting, I at least see the TX LED blink once. With the SoftSerial config, by comparison, I see the TX LED blink 4 times, each blink followed by one blink of the RX LED.

So I think HW Serial 0 sends only one request to the Modbus. I can even see the very first value presented, followed by NANs.

TD-er commented 5 years ago

When the transmission to the sensor is not occurring, then there will also be no data sent the other way around. It is a master/slave protocol, so you have to actively request the data. So that makes sense.
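To illustrate the master/slave exchange, here is a minimal Python sketch of the kind of request a Modbus RTU master has to send before a slave answers. It assumes the meter is read with function code 0x04 (read input registers) and that the voltage is a 2-register value starting at register 0x0000; both are assumptions about the SDM120 register map that are not stated in this thread. `build_read_request` is a hypothetical helper, not the plugin's code.

```python
import struct

def modbus_crc16(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def build_read_request(slave: int, register: int, count: int = 2) -> bytes:
    """Build an RTU 'read input registers' request (function 0x04, assumed)."""
    body = struct.pack(">BBHH", slave, 0x04, register, count)
    # The CRC is appended low byte first, unlike the big-endian payload fields.
    return body + struct.pack("<H", modbus_crc16(body))

if __name__ == "__main__":
    # Slave address 1 and register 0x0000 are illustrative assumptions.
    print(build_read_request(1, 0x0000).hex(" "))
```

If this frame never leaves the TX pin, the slave stays silent, which matches the missing RX activity described above.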

I will have a look at the code for GPIO15 in the ESPeasySerial library. Some user also reported the Nextion plugin was giving issues with these changes, so it is possible there actually is a bug in there.

D-t-M commented 5 years ago

Just tried the RX/TX Pins of HW serial 0 and managed to get a screenshot of the device tab.

device_firstreading

This reading works once after clicking on "submit" in the device configuration. At the next interval everything is NAN and the TX LED does not light up at all.

Could I perhaps try an older build with an older ESPeasySerial library, to compare the behaviour and get more hints?

TD-er commented 5 years ago

You could try the 20181231 build. That's the last one before I added the ESPeasySerial.

D-t-M commented 5 years ago

I flashed ESP_Easy_mega-20181231_test_core_250_beta_ESP8266_4096_VCC.bin. The config stayed unchanged and the RS485 interface is connected to RX/TX (GPIO 1 and 3). All 4 values are showing up. I'll keep it running like this and see whether there are any NAN values. It seems there is some difference between these two builds.

TD-er commented 5 years ago

There are a lot of changes between those builds, since I introduced the SW/HW serial abstraction then. But now that I come to think of it, I only tested with sensors that only read on the HW serial.

D-t-M commented 5 years ago

OK, let me know if I can help with some tests or something like that. Unfortunately my programming skills are very basic and I have not even managed to compile my own build yet. So I guess I can't help on that level :-(

Thank you very much for your work and your support! I discovered ESPeasy 4 weeks ago and became a fan immediately. I am already running 7 nodes with OLED displays, temperature and humidity monitoring, and another energy meter using the S0 pulse counter :-)

TD-er commented 5 years ago

I think I will split those plugins really needing the HW serial into another PR and then revert those changes back to the 20181231 build. That will make a stable ground to work on and try to fix what's apparently wrong with the TX line.

I just opened a new beer, so maybe I can spot the bug later tonight. If not, I will split those and revert the SDMxx and Nextion plugins to 20181231.

D-t-M commented 5 years ago

Enjoy your beer and good luck with bug hunting ;-)

TD-er commented 5 years ago

Probably found the problem. Let's hope I can make some fix and then I will make a new test build first.

TD-er commented 5 years ago

Could you try this test build?

D-t-M commented 5 years ago

Hello, I uploaded the test build and I am getting values from the SDM120 using HW serial 0 (GPIO 1 and 3). No NAN values for 10 minutes now. Using the swapped pins (GPIO 13 and 15), I see no activity on the TX LED at all. I'll have a closer look at this in the evening, once I'm back from the office. Thank you very much for the test build!

TD-er commented 5 years ago

So at least for GPIO 1&3 (Serial0) it is now working, and Serial0_swapped is not yet. For GPIO15 it may be an issue with the pull-down of the pin on the board (it has to be low during boot).

While connected to GPIO 1&3, do you have any issues during boot?

fluppie commented 5 years ago

I use an SDM220-MODBUS on D5 and D6 with the 18 12 2018 build. I'll try to update to the newest and check if it stops working. [image]

fluppie commented 5 years ago

That went wrong quickly :) with 02 02 2018: [image] Looks like I'll revert to 18 12 2018. I also have issues with the newer builds and Senseair S8 sensors. I use 18/12/2018 for them as well.

fluppie commented 5 years ago

With your special test build: [image]

P.S.: Regarding the issues with Senseair S8: https://www.letscontrolit.com/forum/viewtopic.php?f=5&t=3470&start=50

TD-er commented 5 years ago

I am looking into the SenseAir plugin as we speak. One of the issues with it was that it didn't check the CRC of incoming messages.

About the Eastron plugin: the main change in the test build I made last night was to make HW serial usable again (it was broken since 20190101). SW serial is still a bit unreliable, so it is not really strange to see a NaN every now and then. I am looking into the other plugins using the serial port, so I will also have a look at this one.

fluppie commented 5 years ago

I agree, but with these low baud rates of 2400/4800/9600 I think it should work just fine?

TD-er commented 5 years ago

My GPS plugin is also working at 9600 baud, but on SW serial the failure rate is 50 - 90%. Those messages are a lot longer (up to 80 bytes), so there is a higher chance of missing a bit. HW serial is working a lot better (~10 fails in 5 million lines).
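The effect of message length can be sketched with a simple model: if every byte is corrupted independently with the same small probability, a message fails as soon as any one of its bytes is damaged. The Python sketch below uses a 1% per-byte error rate purely as an illustrative assumption (no such figure appears in this thread), and compares an 80-byte GPS line with a 9-byte Modbus reply carrying one float (the 9-byte frame size is likewise an assumption).

```python
def message_failure_probability(byte_error_rate: float, length_bytes: int) -> float:
    """Chance that at least one byte of a message is corrupted.

    Assumes independent, identically distributed byte errors, which is a
    simplification: real SW serial errors tend to come in bursts.
    """
    return 1.0 - (1.0 - byte_error_rate) ** length_bytes

if __name__ == "__main__":
    for length in (9, 80):  # short Modbus reply vs. long GPS sentence
        p = message_failure_probability(0.01, length)
        print(f"{length:2d} bytes: {p:.1%} of messages damaged")
```

Under these assumptions the short frame fails in under 10% of cases while the long one fails in more than half, which is at least in the same range as the failure rates quoted above.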

It also depends on other activity of the node.

I will add an indicator in the Eastron plugin to see how often a CRC error is detected in the Modbus communication.

TD-er commented 5 years ago

I just added the CRC pass/fail stats to the plugin page (not tested). Also, the Modbus call to read a value may now retry up to 3 times if something goes wrong.
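The idea of checking the CRC and retrying can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code from the linked commit: `transact` is a hypothetical callable that sends a request and returns the raw reply, the reply is assumed to be a 9-byte frame holding one big-endian float, and `Stats` merely mimics the pass/fail counters.

```python
import math
import struct
from typing import Callable

def modbus_crc16(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

def crc_ok(frame: bytes) -> bool:
    """True if the trailing two bytes (low byte first) match the frame's CRC."""
    if len(frame) < 4:
        return False
    return modbus_crc16(frame[:-2]) == struct.unpack("<H", frame[-2:])[0]

class Stats:
    """Pass/fail counters, counted per attempt (an assumption)."""
    def __init__(self) -> None:
        self.passed = 0
        self.failed = 0

def read_float(transact: Callable[[bytes], bytes], request: bytes,
               stats: Stats, max_attempts: int = 3) -> float:
    """Return the decoded value, or nan once every attempt has failed."""
    for _ in range(max_attempts):
        reply = transact(request)
        # Assumed reply layout: address, function, byte count, 4 data bytes, CRC.
        if len(reply) == 9 and crc_ok(reply):
            stats.passed += 1
            return struct.unpack(">f", reply[3:7])[0]
        stats.failed += 1
    return math.nan
```

With this structure a single corrupted reply only raises the fail counter; a nan is reported only when all attempts for one value fail in a row.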

See this commit: https://github.com/letscontrolit/ESPEasy/pull/2235/commits/3b1dd8f21f05fd047c35d1bb2753438df64edca4 I am making a test build for this as we speak. It will be ready in about 30 minutes.

TD-er commented 5 years ago

New test build. It includes the commit linked in my previous post.

fluppie commented 5 years ago

OK, it has been running for 1 minute now: Checksum (pass/fail): | 40/7

TD-er commented 5 years ago

I guess the 3 retries now take care of hiding the NaN values? Or did you see them still?

fluppie commented 5 years ago

Looks like I don't see them. At the moment: Checksum (pass/fail): | 936/158
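The counters above can be turned into a rough estimate of why the NaN values disappear. The Python sketch below assumes that the fail counter is incremented once per failed attempt and that attempts fail independently of each other; neither assumption is confirmed in this thread, and the helper names are hypothetical.

```python
def failure_rate(passed: int, failed: int) -> float:
    """Fraction of attempts that ended in a checksum failure."""
    return failed / (passed + failed)

def nan_probability(rate: float, attempts: int = 3) -> float:
    """Chance that every one of `attempts` independent tries fails."""
    return rate ** attempts

if __name__ == "__main__":
    rate = failure_rate(936, 158)  # counters reported above
    print(f"per-attempt failure rate: {rate:.1%}")
    print(f"chance of a NaN after 3 attempts: {nan_probability(rate):.2%}")
```

Under these assumptions roughly 14% of individual attempts fail, but only around 0.3% of readings would still end up as NaN after three attempts.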

D-t-M commented 5 years ago

Hi, I just managed to get the second test build running. The OTA update did not work, the node was down and I had to reset it. After the reset I found out that the build from 20181231 was still on it. So even the first test build from this morning was never on the unit :-( OTA did not work at all, even after deactivating the EASTRON plugin and unplugging the RS485 interface. Now, after flashing via USB cable, test build 2 shows the same behaviour as the latest official build:

The very first value is read correctly and is then followed by checksum errors / nan. This can be observed in the new CRC stats: Checksum (pass/fail): 1/33

On a second node I could flash the test build OTA without problems. But an RS485 interface (without meter) hooked up to the swapped HW serial (GPIO 13, GPIO 15) does not trigger the TX LED at all. Booting gets stuck due to the pull-up of the interface (so I connect it after booting).

The Software Serial setup with the 3 retries on CRC errors is still untested. If that helps, I can reconfigure and test it with the meter.

Looks like the fix is not working yet :-(

TD-er commented 5 years ago

I started to build the same setup on my bench, so tomorrow I can test it myself and also look at the signals with the scope to see what's happening (e.g. maybe a pull-up config is needed or the internal pull-down has to be disabled).

fluppie commented 5 years ago

I see no NaN's in my log :). Checksum (pass/fail): | 6080/1021

TD-er commented 5 years ago

At least the NaN part of this issue will be fixed when I merge my open PR #2235. The HW serial part is still an issue, but that may also be related to the RS485 converter board I used (pull-up/pull-down resistor issue). So if you don't mind, I will close this one for now, and the HW serial will be dealt with in other issues.

D-t-M commented 5 years ago

Hello, closing this issue here is OK for me. 2 questions:

Thank you for your help!

Best regards, D-t-M

D-t-M commented 5 years ago

found the reference after removing the tomatoes from my eyes :-)

Both questions are answered! Thank you!

D-t-M commented 5 years ago

Hello, to give some feedback on the release 20190216 I tested: this release has now been running stable with HW serial on D9/D10 for 21 hours. I had no reboots (as I had with release 20181231) and there were only 6 CRC fails. The read counter has already overflowed; it is now at 800, but it was already at 58000 earlier today. ==> perfect!

So this release seems to have fixed the HW serial configuration :-)

p078_hwserial

Build: 20103 - Mega
Libraries: ESP82xx Core 2.6.0-dev, NONOS SDK 3.0.0-dev(c0f7b44), LWIP: 2.1.2 PUYA support
GIT version: mega-20190216
Plugins: 76 [Normal] [Testing]
Build time: Feb 16 2019 03:24:39
Binary filename: ESP_Easy_mega-20190216_test_core_260_alpha_ESP8266_4096_VCC.bin

Thank you very much and best regards, D-t-M

fluppie commented 5 years ago

I'm running mega-20190216. Checksum (pass/fail): | 6336/0

Only an uptime of 4 hours, but it has been running for 3 days. Boot: | Manual reboot (4) Reset Reason: | Hardware Watchdog