Closed ArJay60 closed 4 months ago
Hi @ArJay60
Could you supply some logging using port 23?
Check the wiki: https://github.com/rvdbreemen/OTGW-firmware/wiki/How-to-debug-the-OTGW-firmware
That will help analyse your situation.
And yes, you can reset using the GW=R command. But before you go that route, let's try to analyse what's going on.
Thanks Robert
OK. I will do that the next time I have the problem. Currently the logging is filled with the regular stuff because I did the manual reset. Regards Richard
Still waiting for it to fail. Will keep monitoring.
Once you have it, please share it. By the way, did you know there is a new release available? 0.10.3
A whole lot of small things got fixed after a year of collecting small patches.
Robert
I have the fault situation now. Below is the log (but not from the moment it first failed; I don't have that).
If I do nothing, only the 'Minute changed' and 'Uptime seconds' messages appear. I asked once for the PIC firmware through the OTGW web interface, which resulted in the other lines.
I will leave it in the faulty state for now and wait for you, because you might have other things you want to investigate.
Log:
09:01:00.426068 ( 16312| 14104) doTaskMinute( 279): Minute changed:
09:01:04.263059 ( 16312| 14104) sendMQTTupti( 157): Uptime seconds: 1618299
09:02:00.426395 ( 16152| 14104) doTaskMinute( 279): Minute changed:
09:03:00.427296 ( 16120| 14104) doTaskMinute( 279): Minute changed:
09:04:00.426309 ( 16072| 14104) doTaskMinute( 279): Minute changed:
09:05:00.426012 ( 16176| 14104) doTaskMinute( 279): Minute changed:
09:06:00.426629 ( 16152| 14104) doTaskMinute( 279): Minute changed:
09:06:04.264146 ( 16152| 14104) sendMQTTupti( 157): Uptime seconds: 1618598
09:07:00.426776 ( 16088| 14104) doTaskMinute( 279): Minute changed:
09:07:05.122146 ( 16136| 14104) apifirmwaref( 134): API: apifirmwarefilelist()
09:07:05.124318 ( 14768| 13456) apifirmwaref( 141): dirpath=/pic16f1847
09:07:05.143614 ( 14528| 13456) apifirmwaref( 147): dir.fileName()=diagnose.hex
09:07:06.154324 ( 15152| 14104) apifirmwaref( 160): version=2.1
09:07:06.156924 ( 13768| 12808) GetVersion ( 16): GetVersion opening /pic16f1847/diagnose.hex
09:07:06.221590 ( 13808| 12808) apifirmwaref( 164): GetVersion(/pic16f1847/diagnose.hex) returned []
09:07:06.223772 ( 13888| 12808) apifirmwaref( 147): dir.fileName()=diagnose.ver
09:07:06.225142 ( 13888| 12808) apifirmwaref( 147): dir.fileName()=gateway.hex
09:07:06.234684 ( 13808| 12808) apifirmwaref( 160): version=6.5
09:07:06.236940 ( 13768| 12808) GetVersion ( 16): GetVersion opening /pic16f1847/gateway.hex
09:07:06.409213 ( 15152| 14104) apifirmwaref( 164): GetVersion(/pic16f1847/gateway.hex) returned [6.5]
09:07:06.412582 ( 13888| 12808) apifirmwaref( 147): dir.fileName()=gateway.ver
09:07:06.414136 ( 13888| 12808) apifirmwaref( 147): dir.fileName()=interface.hex
09:07:07.424292 ( 15152| 14104) apifirmwaref( 160): version=2.0
09:07:08.426771 ( 13768| 12808) GetVersion ( 16): GetVersion opening /pic16f1847/interface.hex
09:07:08.504889 ( 14480| 12808) apifirmwaref( 164): GetVersion(/pic16f1847/interface.hex) returned []
09:07:08.507818 ( 13888| 12808) apifirmwaref( 147): dir.fileName()=interface.ver
09:07:08.508553 ( 13888| 12808) apifirmwaref( 178): filelist response: [{"name":"diagnose.hex","version":"2.1","size":9196},{"name":"gateway.hex","version":"6.5","size":26124},{"name":"interface.hex","version":"2.0","size":10488}]
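A capture like the one above can also be checked automatically. The following is a minimal Python sketch, not part of the OTGW firmware, that splits raw debug output into entries and reports whether any of them come from processOT, the function that logs the OpenTherm messages elsewhere in this thread. The entry layout is inferred only from the log lines posted here; the thread does not explain the two numbers in parentheses, so the parser skips them. The names parse_entries and pic_is_silent are made up for illustration.

```python
import re
from collections import Counter

# One debug entry as it appears in this thread: timestamp, two numbers in
# parentheses (meaning not stated in the thread, so they are skipped),
# function name, source line number, then the message text.
ENTRY = re.compile(
    r"(\d{1,2}:\d{2}:\d{2}\.\d{6}) \(\s*\d+\|\s*\d+\) (\w+)\s*\(\s*\d+\): "
)


def parse_entries(text):
    """Split raw debug output into (time, function, message) tuples.

    Works whether the entries are one per line or run together on one line,
    because each entry is located by its header rather than by newlines.
    """
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((m.group(1), m.group(2), text[m.end():end].strip()))
    return entries


def pic_is_silent(text):
    """Return True when the capture has entries but none from processOT."""
    functions = Counter(fn for _, fn, _ in parse_entries(text))
    return bool(functions) and functions["processOT"] == 0
```

Calling pic_is_silent on the capture above should return True, since it contains only doTaskMinute, sendMQTTupti, apifirmwaref and GetVersion entries. A capture that is too short to include any traffic would give the same answer, so the result is only meaningful over a window of at least several seconds.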
@ArJay60 Well, the ESP firmware works, but there are no responses or messages from the PIC anymore.
So could you try the following:
1) Log on to port 23 (where you got the logging)
2) Hit 'p' to do a manual reset of the PIC
3) See if the PIC has rebooted.
Let me know the result please.
Thanks Robert
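The three steps above can also be scripted instead of typed into a telnet client. This is a rough Python sketch under stated assumptions: the debug port accepts a plain TCP connection on port 23 and reacts to a single 'p' byte, as described above. The function name send_debug_key and the host address in the usage note are hypothetical, and the sketch does not handle telnet option negotiation.

```python
import socket
import time


def send_debug_key(host, key=b"p", port=23, listen_seconds=5.0):
    """Send one key to the debug port and return the text logged afterwards."""
    chunks = []
    with socket.create_connection((host, port), timeout=5.0) as conn:
        conn.sendall(key)
        deadline = time.monotonic() + listen_seconds
        while time.monotonic() < deadline:
            # Never wait past the deadline for the next chunk of log output.
            conn.settimeout(max(0.05, deadline - time.monotonic()))
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break
            if not data:  # the other side closed the connection
                break
            chunks.append(data)
    # Undecodable bytes are replaced rather than raising an error.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, print(send_debug_key("192.168.1.50")) would use a placeholder address that has to be replaced with the address of your own OTGW. The output can then be searched for the reset messages that the firmware logs.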
It reboots with 'p'. Below is the log directly after the reboot:
8:16:00.899064 ( 16064| 14104) doTaskMinute( 279): Minute changed:
p18:16:09.917746 ( 16064| 14104) sendMQTTupti( 157): Uptime seconds: 1737955
p
18:16:30.595982 ( 15824| 14104) handleDebug ( 22): Manual reset PIC
18:16:30.765409 ( 15744| 14104) detectPIC ( 122): ETX found after reset: Pic detected!
18:16:30.770059 ( 13136| 12160) handleDebug ( 22): Manual reset PIC
18:16:31.937884 ( 15824| 14104) detectPIC ( 122): ETX found after reset: Pic detected!
18:16:32.027417 ( 15904| 14104) fwreportinfo(1967): Callback: fwreportinfo
18:16:32.029910 ( 13216| 12160) fwreportinfo(1970): Current firmware version: 6 .5
18:16:32.031134 ( 13216| 12160) fwreportinfo(1972): Current device id: pic16f18 47
18:16:32.031874 ( 13216| 12160) fwreportinfo(1975): Current firmware type: gate way
18:16:32.032675 ( 13216| 12160) processOT (1701): Current firmware version: 6 .5
18:16:32.033456 ( 13216| 12160) processOT (1703): Current device id: pic16f18 47
18:16:32.034155 ( 13216| 12160) processOT (1705): Current firmware type: gate way
18:16:32.146972 ( 10976| 10296) processOT (1676): Boiler BC0193D00 (9)[MsgID= 25][READ_ACK ]>Tboiler = 61.00 °C
18:16:32.306859 ( 12968| 12160) processOT (1676): Thermostat T101815A3 (9)[MsgID= 24][WRITE_DATA ]>Tr = 21.64 °C
18:16:33.146211 ( 13664| 12808) processOT (1676): Boiler BD01815A3 (9)[MsgID= 24][WRITE_ACK ] Tr = 21.64 °C
18:16:33.311065 ( 12296| 11512) processOT (1676): Thermostat T80000200 (9)[MsgID= 0][READ_DATA ]>Status = Master [-D---W--]
18:16:34.149338 ( 12320| 11512) processOT (1676): Boiler B40000200 (9)[MsgID= 0][READ_ACK ]>Status = Slave [--------]
18:16:34.323290 ( 12968| 12160) processOT (1676): Thermostat T10010600 (9)[MsgID= 1][WRITE_DATA ]>TSet = 6.00 °C
18:16:35.182797 ( 13640| 12808) processOT (1676): Boiler BD0010600 (9)[MsgID= 1][WRITE_ACK ] TSet = 6.00 °C
18:16:35.318330 ( 15848| 14104) processOT (1676): Thermostat T00110000 (9)[MsgID= 17][READ_DATA ] RelModLevel = 0.00 %
18:16:36.153529 ( 14304| 13456) processOT (1676): Boiler BC0110000 (9)[MsgID= 17][READ_ACK ]>RelModLevel = 0.00 %
18:16:36.335057 ( 13640| 12808) processOT (1676): Thermostat T80190000 (9)[MsgID= 25][READ_DATA ] Tboiler = 0.00 °C
18:16:37.178881 ( 12968| 11512) processOT (1676): Boiler B401933CC (9)[MsgID= 25][READ_ACK ]>Tboiler = 51.80 °C
18:16:37.317545 ( 15848| 14104) processOT (1676): Thermostat T00090000 (9)[MsgID= 9][READ_DATA ] TrOverride = 0.00 °C
18:16:37.460332 ( 15176| 14104) processOT (1676): Boiler BF0090000 (9)[MsgID= 9][UNKNOWN_DATA_ID ]-TrOverride = 0.00 °C
Good, that confirms that somehow the PIC needs a reboot. I can add that to the firmware.
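One way such an automatic reset could work is a simple watchdog: remember when the PIC last delivered a message and trigger a reset once it has been silent for too long. The Python sketch below only illustrates that logic; it is not the firmware's actual implementation (the firmware is not written in Python), and the class name PicWatchdog, the reset_pic callback and the 60 second timeout are assumptions. In the log above, thermostat and boiler messages arrive roughly once per second, so any timeout well above that should avoid false alarms.

```python
import time


class PicWatchdog:
    """Request a PIC reset when no message has arrived for too long."""

    def __init__(self, reset_pic, timeout_s=60.0, clock=time.monotonic):
        self._reset_pic = reset_pic  # callable that performs the reset (assumed)
        self._timeout_s = timeout_s  # hypothetical silence threshold in seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = clock()

    def message_received(self):
        """Call this for every message the PIC delivers."""
        self._last_seen = self._clock()

    def poll(self):
        """Call this periodically; returns True when a reset was triggered."""
        if self._clock() - self._last_seen < self._timeout_s:
            return False
        self._reset_pic()
        # Restart the timer so the PIC gets a full timeout period to recover
        # before the next reset is attempted.
        self._last_seen = self._clock()
        return True
```

Restarting the timer after a reset is a deliberate choice: without it, a PIC that stays silent would be reset on every poll.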
When I have the time I will ping you to test it for me.
Robert
Now the question becomes why the PIC needs a reset. I have not heard from other people that they experience the PIC getting stuck.
Great question @hvxl, I was wondering about that too. Any idea how we could find out what is going on in his case?
I see two possible ways to think about this:
@ArJay60 could you possibly run a long-term log to capture the situation that causes the PIC to fail? Another option is to replace the PIC; you can order one at the Nodo shop. That way a hardware issue can be ruled out.
Let me know if you are willing to find the underlying problem.
As I said, the easiest thing to try first is to reflash the PIC firmware. If the problem still occurs after that, you can try some more involved avenues.
Can I reflash the PIC while installed? By which instructions?
That's simple:
It takes a few seconds for the firmware to be installed, after which the screen will refresh. You should then be back on the default OTGW web page, with the PIC running the new firmware.
If you want to observe the upgrade, just log on to port 23 and watch the firmware flash process in the log.
Done that (PIC upgrade was successful). I will monitor the status of the PIC/OTGW for the next couple of days to see if it continues to work properly. I will report back here regularly.
Seems to be solved with a reflash of the PIC. Will close and keep fingers crossed that the problem is gone.
At intervals (usually a day or two), the OTGW firmware seems to lose its connection with the PIC. When this happens, the home page of the OTGW web server shows only the header and button bar, with no rows listing sensor names and values. Values are also no longer passed over MQTT to HA.
While the home page is in this error state, I can still read the PIC firmware versions using Advanced - PIC Firmware of the OTGW web server. So communication with the PIC doesn't seem to be the issue. Values here are:
Firmware name   Version  Size
diagnose.hex    2.1      9196
gateway.hex     6.5      26124
interface.hex   2.0      10488
Resetting the PIC with the reset button on the OTGW PCB (I am using the one sold in the Nodo shop, combined with an ESP8266 for MQTT communication with HA) restores the values within a few seconds, both on the home page of the OTGW web server and in HA.
What goes wrong? Is it possible for the OTGW software to monitor for failures to retrieve sensor values and then reset the PIC? Is it possible to reset the PIC remotely through OTGW (so I can monitor myself whether I still receive values and then do a remote reset from HA)?
Firmware Version: 0.10.2+50c3ed2
PIC Available: true
PIC Firmware Version: 6.5
PIC Device ID: pic16f1847