nielsonm236 / NetMod-ServerApp

Reprogramming the Web_Relay_Con V2.0 HW-584 Network Module

devices reset and loses input output parameters #87

Closed atilim07 closed 2 years ago

atilim07 commented 3 years ago

The device resets, the device name and ports are corrupted, and the input/output parameters are lost.

nielsonm236 commented 3 years ago

The only time I've seen this is when the power supply is not adequate to prevent the built-in power reset from triggering intermittently. Intermittent resets, if they occur during writes to the EEPROM, will cause the corruption you describe. So a little more information is needed to help determine the source of the resets.

nielsonm236 commented 3 years ago

Had any luck with the suggestions?

gabortamasko commented 3 years ago

Hi, the same occurred. Please help. I use 4 pins as outputs and actuate a 4-port relay board via N-channel MOSFETs. Power delivery is sufficient; I added a few 1000uF caps. The relay board is a 12V board, and max draw is 200mA when all relays are on.

Still, if I change any output state that would pull in a relay via the N-MOSFET, the board behaves crazily: it pulls those 4 outputs high and forgets pin settings such as pin type (output or input), invert state, boot state, and timer settings. It does not forget any other settings, though. None of the outputs have the "retain" setting, so by my logic there should be no EEPROM write, unless a problem occurred and was logged locally.

BUT, and here is the but: if I set up the system again, there is a 25% chance that from then on it will work fine for days. No switching problems whatsoever.

Here are some logs for you:

Link Error Statistics
31 0000001099
32 0000000435
33 0000000000
35 0000000000

0000000083 Dropped packets at the IP layer
0000000329 Received packets at the IP layer
0000000199 Sent packets at the IP layer
0000000002 Packets dropped due to wrong IP version or header length
0000000000 Packets dropped due to wrong IP length, high byte
0000000000 Packets dropped due to wrong IP length, low byte
0000000000 Packets dropped since they were IP fragments
0000000000 Packets dropped due to IP checksum errors
0000000000 Packets dropped since they were not ICMP or TCP
0000000000 Dropped ICMP packets
0000000000 Received ICMP packets
0000000000 Sent ICMP packets
0000000000 ICMP packets with a wrong type
0000000000 Dropped TCP segments
0000000249 Received TCP segments
0000000202 Sent TCP segments
0000000000 TCP segments with a bad checksum
0000000000 TCP segments with a bad ACK number
0000000001 Received TCP RST (reset) segments
0000000004 Retransmitted TCP segments
0000000000 Dropped SYNs due to too few connections avaliable
0000000000 SYNs for closed ports, triggering a RST

nielsonm236 commented 3 years ago

This is very puzzling. I don't see anything particularly wrong in the Link Error stats except stat 33 should have a "6" in the fifth digit from the right. That digit would indicate the revision level of the ENC28J60. The fact that it is reporting "0" is odd ... not sure how that could happen ... and wondering if it is a clue. Since I haven't seen anyone else report this kind of problem unless there was a power issue, let me ask a few questions:

1) Since the relays run at 12V, you must have a separate 5V power supply for the Network Module. Is that correct?
2) Is the 5V supply derived from the 12V supply used for the relays, perhaps with a 5V regulator?
3) How often do you switch a relay?
4) Are you switching relays with Browser / REST commands? Or are you using MQTT?

Let me know the above and I will think some more on this.

Mike
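The stat 33 check described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the "healthy" sample value assumes every digit other than the fifth from the right is zero, which the thread does not state.

```python
def revision_digit(stat_33: str) -> str:
    """Return the fifth digit from the right of the stat 33 counter string."""
    digits = stat_33.strip()
    if len(digits) < 5 or not digits.isdigit():
        raise ValueError(f"not a usable counter value: {stat_33!r}")
    # Negative indexing counts from the right, so -5 is the fifth digit.
    return digits[-5]


# Value reported in the logs above.
print(revision_digit("0000000000"))
# Hypothetical value with the expected "6" in the fifth position from the right.
print(revision_digit("0000060000"))
```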

nielsonm236 commented 3 years ago

One more question: Please let me know the part number for the n-mosfet you are using. Thanks

gabortamasko commented 3 years ago

BSP308 SOT223 https://asset.conrad.com/media10/add/160267/c1/-/en/000152901DS01/datasheet-152901-infineon-technologies-bsp308-mosfet-1-n-channel-18-w-sot-223.pdf

gabortamasko commented 3 years ago

1) Yes, I use a separate 5V buck converter for the white board, and even use a 3.3V LDO for optocouplers that monitor external 12V voltage sources (whether they are on or off).
2) A simple mini360-style buck converter, preset for 5V.
3) I switch it rarely. But when it is stable I can switch it in any combination. In optimal working conditions, none of the relays are "HIGH" (why let them consume power if I know how it will be used 99.999% of the time?).
4) I use the browser-only firmware.

Thanks.

I am currently still trying to figure out whether my voltage dips for a short time, which would make the board's supply drop below 4V-5V. Will share my findings...

nielsonm236 commented 3 years ago

An initial comment regarding the mosfet. If I'm reading the spec correctly, looking at the typical transfer characteristics on page 6, the drain current will only be 2 to 3 mA with the gate at 2.5 to 3V (the output voltage on the Network Module pin). I don't know the rest of your circuit, for instance if the mosfet drain is directly operating a relay coil. If it is, 2 to 3 mA is very low for a relay coil and may cause the relay to "chatter", generating power supply noise. But again, I don't know what your circuit looks like. If there is some additional coil driver then the mosfet drain current is probably fine.

Next: Is the input to the 5V buck converter the same 12V that runs the relays? If yes, an experiment you could try is to temporarily power the Network Module from a completely separate 5V power source ... perhaps a "wall wart" power supply that provides at least 500mA. That experiment is only to make sure we aren't somehow getting power noise on 5V caused by relay operation creating surge current (and a momentary drop in 12V).

I'm very interested in helping to solve this.

Mike

gabortamasko commented 3 years ago

I'm using this relay board: https://imgaz1.staticbg.com/thumb/large/oaupload/banggood/images/94/75/8657adaa-8547-4c9b-88f1-d0ed7a39e8c4.jpg.webp but a 12V variant. Its inputs need to be pulled to GND, so there is minimal current usage at 3.3V, even for the N-MOSFET.

Regarding the second suggestion, I will try it out, thanks.

nielsonm236 commented 3 years ago

Agreed - the mosfet drain current is not an issue. So I do think we are back to making sure the Network Module power is very clean. The STM8S processor power reset circuit is very sensitive.

My thoughts while power is investigated: Despite my suspicion that there is a power problem I am still puzzled that you are losing pin type, invert, and timer settings. Pin type and invert are stored in EEPROM on the STM8S. Timer settings are stored in Flash on the STM8S. I don't think these would be affected by a simple STM8S reset, even if that reset occurred intermittently. I am wondering if the buck converter is a bit noisy. Sometimes they produce a very high frequency noise on their power output ... a 1000uF cap will help with surges, but typically won't help with high frequency noise. So if we find that a separate 5V power supply works (without the buck converter), then maybe we can look at how to quiet down the output of the buck converter and get you back to your original design.

Mike

gabortamasko commented 3 years ago

Some hours later.

I disconnected all external devices to minimise power fluctuations and added extra caps on 5V. Measured with a scope, the 5V buck has 70mV p2p noise. Since the white board uses a 3.3V LDO, it has good noise suppression for the external 5V, so this should not be an issue.

I just realized, the bug that makes the board go crazy can be reproduced.

I even disconnected the relay board.... yes there is nothing on the output (except the mosfets).

Cheat: By switching back to the configuration tab in the browser, the settings are still there; I just have to press Save again. There is a 75% chance that after a config save the board behaves crazily (needs IOControl page interaction). A software reboot does not help.

In the 25% of cases when the board works great, I can switch around as much as I want.

Here are the stats:

Link Error Statistics
31 0000000015
32 0000000040
33 0000000000
35 0000000000

Network Statistics
Values shown are since last power on or reset

0000000000 Dropped packets at the IP layer
0000000051 Received packets at the IP layer
0000000042 Sent packets at the IP layer
0000000000 Packets dropped due to wrong IP version or header length
0000000000 Packets dropped due to wrong IP length, high byte
0000000000 Packets dropped due to wrong IP length, low byte
0000000000 Packets dropped since they were IP fragments
0000000000 Packets dropped due to IP checksum errors
0000000000 Packets dropped since they were not ICMP or TCP
0000000000 Dropped ICMP packets
0000000000 Received ICMP packets
0000000000 Sent ICMP packets
0000000000 ICMP packets with a wrong type
0000000000 Dropped TCP segments
0000000054 Received TCP segments
0000000045 Sent TCP segments
0000000000 TCP segments with a bad checksum
0000000000 TCP segments with a bad ACK number
0000000000 Received TCP RST (reset) segments
0000000002 Retransmitted TCP segments
0000000000 Dropped SYNs due to too few connections avaliable
0000000000 SYNs for closed ports, triggering a RST
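For comparing statistics dumps like the ones pasted in this thread, a small parser may help. The sketch below assumes the page text has been copied as plain text in which each counter is a 10-digit number followed by its label; the function name is hypothetical and this is not part of the firmware or its tooling.

```python
import re

# A 10-digit counter, then its label, up to the next 10-digit counter or the end.
COUNTER = re.compile(r"(\d{10})\s+(.*?)(?=\s+\d{10}\s|\s*$)", re.S)


def parse_counters(text: str) -> dict:
    """Map each statistic label to its integer value."""
    return {label.strip(): int(value) for value, label in COUNTER.findall(text)}


sample = (
    "0000000083 Dropped packets at the IP layer "
    "0000000329 Received packets at the IP layer "
    "0000000199 Sent packets at the IP layer"
)
for label, value in parse_counters(sample).items():
    print(f"{value:6d}  {label}")
```

The same function works whether the counters are on one line or on separate lines, since the pattern treats any whitespace as a separator.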

gabortamasko commented 3 years ago

Additional test case: After configuration setup, and after a power off/on, the issue did not occur after a fresh bootup. Tried 10+ times.

nielsonm236 commented 3 years ago

So if I understand your comments you are seeing problems when you use two different tabs in the same browser to make changes (a Save causes the change process to run even if there is nothing to save). I must admit I never tried that. In my manual I recommend avoiding using multiple browsers on multiple computers to make and Save changes, as I know this can cause confusion and settings corruption. But I never tried two tabs in the same Browser. I will try this here. In the meantime, I recommend using only one tab and to switch between settings pages within that tab.

Let me know if what I'm saying here makes sense with regard to what you saw. In all my testing I always used one tab in one browser when changing settings, and never experienced the problem you are seeing. As an FYI, when a Save is done the code will restart the controller when needed ... so no power off/on should be required.

Mike

nielsonm236 commented 3 years ago

An additional comment: I will bet you see the issue using two tabs because they are treated as independent browser sessions, and each session is caching the browser state, which includes the javascript variables. Then a Save in one tab overwrites the Save done by the other tab. So you end up with old settings written to the controller.

If this is the case about all I can do is add a caution to the manual not to use multiple tabs for Saving settings. But first I want to try to reproduce here.

Mike

gabortamasko commented 3 years ago

Yes, you are correct. Same browser, 2 tabs. I wanted to point out that with 2 tabs it is easy to reproduce, so you may find the issue. Originally I used only 1 tab. Later I was using 2 tabs, because that way it was easy to reconfigure, since there is no export/import function.

Yet, as I noted earlier, the problem does not always appear when using multiple tabs.

Please try to debug what causes the data loss.

Thanks

nielsonm236 commented 3 years ago

Which OS and Browser are you using?

gabortamasko commented 3 years ago

Win10, Chrome or Firefox. For some reason, when using Firefox I can't save the configuration, but IOControl does work. On Chrome there is no problem. Chrome 95.0.4638.69, Firefox 94.0.1.

nielsonm236 commented 3 years ago

It is the middle of the night where I am (around 3AM), but I did a short test and narrowed the issue a bit. Using Win10 with Chrome I opened two tabs to the same controller. Tab1: Configuration, Tab2: IOControl.

a) I find that if I click Save on the Configuration in Tab1 (regardless of whether I changed the Configuration or not), then go to Tab2 and change the state of Output1 (and Save), then the Configuration is reset to default.
b) If I Save the Configuration in Tab1, then change the state of any other Output in Tab2 (and Save) then the Configuration is NOT lost, and any subsequent change to Output1 does not cause loss of Configuration.
c) If I Save the Configuration in Tab1, then change the state of Output1 and ALSO change the state of any other Output THEN Save, the Configuration is NOT lost.
d) If Configuration is "lost" by a change to Output1 in Tab2, I can return to Tab1 (which still shows the good Configuration) and simply click Save ... and the Configuration is restored.
e) If IO1 is defined as an Input and IO2 is defined as an Output, then the above scenario shifts to IO2, ie, a change to the IOControl state of Output2 causes the problem.
f) The above "loss of Configuration" does not occur if all actions are taken from within the same Tab.

The above explains why it appears to you to be intermittent. It actually isn't intermittent, and is very repeatable. BUT, it requires a change to the IOControl state of the first Output in Tab2 after Saving the Configuration in Tab1.

So, there is something unique about changing the state of the lowest order Output in a 2 Tab scenario, and very likely includes some aspect of caching in the two Tabs. This will be very difficult to track down but I will review the code to see if I can find it.

To make your situation stable I recommend using only 1 Tab in 1 Browser to make Configuration AND IOControl changes. If I can fix it I will, but if I can't find it I can't fix it .... and using 1 Tab in 1 Browser will become a restriction.

Mike

gabortamasko commented 3 years ago

Thank you for your effort!

nielsonm236 commented 3 years ago

After looking at this some more I don't have a solution. I believe I am going to have to list a restriction as follows: "Use a single browser and a single browser tab to make and save changes within the Configuration and IO Control pages. Use of multiple browser sessions or multiple browser tabs to make and save changes in the Configuration and IO Control pages can cause settings to be lost."

The reason this becomes complicated to fix is that the firmware uses the IP Address and Port Number to identify a session. Javascript is used to store the data within a session, then transmit that data when Save is clicked. If more than one session (ie, Browser or Tab within a Browser) is open using the same IP Address and Port number the Javascript in those sessions could store stale information, and when Save is clicked that stale information could update the settings incorrectly. Within the limited resources of the HW-584 I don't have a means of identifying individual browser sessions and storing the variables associated with each browser session.
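The stale-data overwrite described above can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: the class names, the setting names, and the idea that each Save posts the tab's whole cached state. It models the hypothesis in this comment, not the actual firmware or its web pages.

```python
class Controller:
    """Toy stand-in for the device: it stores whatever the last Save sends."""

    def __init__(self):
        self.settings = {"pin_type": "input", "invert": False, "output1": 0}

    def load_page(self):
        return dict(self.settings)  # snapshot handed to a browser tab

    def save(self, posted):
        self.settings.update(posted)  # the last writer wins


class Tab:
    """Toy browser tab that caches the settings it saw when the page loaded."""

    def __init__(self, controller):
        self.controller = controller
        self.cached = controller.load_page()

    def change(self, key, value):
        self.cached[key] = value

    def save(self):
        self.controller.save(self.cached)


def run_scenario():
    controller = Controller()
    tab1, tab2 = Tab(controller), Tab(controller)  # both cache the defaults

    tab1.change("pin_type", "output")
    tab1.save()
    after_tab1 = controller.settings["pin_type"]

    tab2.change("output1", 1)  # tab2 still holds the old pin_type
    tab2.save()
    after_tab2 = controller.settings["pin_type"]

    tab1.save()  # saving again from tab1 puts its cached settings back
    after_resave = controller.settings["pin_type"]
    return after_tab1, after_tab2, after_resave


print(run_scenario())
```

In this model the second tab silently reverts the pin type because it posts values cached before the first tab's Save, and a repeat Save from the first tab restores them.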

FYI you can still have up to 5 Browser sessions (4 if using MQTT) on a single computer or across multiple computers, but only 1 of those sessions should be used to make and Save changes. All the others should be used only for monitoring status.

I will add this restriction to the Manual.

gabortamasko commented 3 years ago

Ok, thanks. Did you find the issue why only some settings are lost?

nielsonm236 commented 3 years ago

Yes - The Configuration page has a broader set of Javascript variables, only some are shared with the IO Control page. So, you can get a mix of updated and not-updated variables when the Save occurs.

nielsonm236 commented 2 years ago

See Issue #91. Might be related. I'm working on it.

nielsonm236 commented 2 years ago

I believe Issue #91 is in fact what you were seeing here, although I still caution that using multiple browser windows to make changes can lead to confusion. I have resolved the firmware issues subject to additional testing.

nielsonm236 commented 2 years ago

Addressed in Code Revision 20220205 1645. You should still avoid using multiple browser windows to Save IOControl or Configuration changes. If you are using one browser window for changes (using the buttons to move between the IOControl and Configuration page) then you can use other browser windows to monitor ... but don't save changes in those other windows.