tuanpmt / espduino

ESP8266 network client (mqtt, restful) for Arduino
http://tuanpm.net/post/espduino
MIT License

ESPDuino hang on ESP:process #43

Open ccorderod opened 8 years ago

ccorderod commented 8 years ago

Hi. I have a fairly complex sketch that uses a Seeed Studio TFT touch screen and a bunch of sensors on a Mega 2560, with the ESP8266 publishing to and subscribing from a mosquitto server via MQTT calls. The code runs for a few seconds or minutes (that is, the Mega publishes to and receives from the broker with no problem) and at some point hangs. I traced the hang to ESP::process: it goes through the default case and writes the value into proto.buf, but does not return. Sometimes, after several minutes, the process continues (i.e., it returns to the main loop and runs as normal), then stops again after a few seconds or minutes.

I have spent many days trying to understand what happens, but I am stuck at this point. Any clue?

Thank you and have a good day. Regards, Carlos

Namphibian commented 8 years ago

Hi Carlos.

The ESP chips can be unreliable at times. I previously had an issue that soldering a 100 nF cap across the terminals sorted out. However, I would suggest getting a watchdog timer in place. The ESP process can hang up the chip. I used the Timer1 library to create a very simple watchdog timer, and when the ESP hangs I reset the whole lot. It works like a charm, with some ESP Arduino boards now running for 8 weeks.

Regards Neil
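
A minimal sketch of the watchdog idea described above, assuming a software timeout that the main loop "feeds" on every pass and that a periodic check (for example a Timer1 interrupt, as Neil mentions) inspects. `SoftWatchdog`, `feed` and `expired` are hypothetical names for illustration, not part of espduino or the Timer1 library; on an Arduino the timestamps would come from `millis()`.

```cpp
#include <stdint.h>

// Hypothetical helper: tracks when the main loop last reported progress.
class SoftWatchdog {
 public:
  explicit SoftWatchdog(uint32_t timeoutMs)
      : timeoutMs_(timeoutMs), lastFedMs_(0) {}

  // Call once per pass through loop(), e.g. right after esp.process().
  void feed(uint32_t nowMs) { lastFedMs_ = nowMs; }

  // Call from the periodic check; true means the loop has been stuck
  // for at least timeoutMs_. Unsigned subtraction keeps the comparison
  // correct when the millisecond counter wraps around.
  bool expired(uint32_t nowMs) const {
    return (uint32_t)(nowMs - lastFedMs_) >= timeoutMs_;
  }

 private:
  uint32_t timeoutMs_;
  uint32_t lastFedMs_;
};
```

When `expired()` returns true, the check would reset the board by whatever means the setup provides. The check has to run outside the main loop, because a loop that is stuck inside `ESP::process` never reaches code placed after it.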

ccorderod commented 8 years ago

Hi Neil. Thank you for the reply. I forgot to mention that the ESP is alive and kicking. I have an FTDI Basic connected to the ESP GPIO2 port, and the ESP is still receiving MQTT subscriptions while the Mega is stuck (I monitor this through a PuTTY session). I am already using a capacitor with the ESP (I found out during my first days playing with it that it is much more stable that way, thank you). What is pissing me off is that the sketch hangs just there; sometimes it continues (the MEGA DOES NOT restart), and sometimes I have to reset the whole thing. I was looking at the WDT as a workaround, but the problem is that I am going to miss some of the subscriptions sent by the broker. What the hell is the Mega waiting for just at the end of ESP::process??? :(. Thank you again. Have a great day. Carlos

Namphibian commented 8 years ago

Hi Carlos.

This is embedded electronics. I am doing exactly what you are doing, but instead of a Mega I use a raw ATmega328-PU, no Arduino anymore. I am not sure why this happens, but there seems to be a communication problem between the ESP and the Arduino. It's like the serial port does not receive the END OF LINE symbols for a long while. If your broker supports it, make sure the delivery options let messages be queued until the device comes back up.

You will need to get the WDT in at some point. Things will go wrong, and it is better to deal with it like this than to get frustrated about things hanging. I am currently reviewing the code and it is pretty good, but it needs to be tested and worked on a bit more.

I will review the code a bit and see if I can find any smoking guns.

Kind Regards Neil

ccorderod commented 8 years ago

Hi again Neil.

I think I found the "bug" and a workaround. I added a few Serial.print calls to espduino.cpp (the debug text "Entramos en" is Spanish for "we enter"):

    void ESP::process() {
      char value;
      while(_serial->available()) {
        Serial.print("Entramos en ESP.process - while - ");
        Serial.print(_proto.dataLen);
        Serial.print(" - ");
        Serial.println(_serial->peek(), HEX);

and

    void ESP::protoCompletedCb(void) {
      Serial.println("Entramos en protoCompletedCb");
      PACKET_CMD *cmd = (PACKET_CMD*)_proto.buf;
      uint16_t crc = 0, argc, len, resp_crc;
      uint8_t *data_ptr;
      argc = cmd->argc;
      data_ptr = (uint8_t*)&cmd->args;
      crc = crc16_data((uint8_t*)&cmd->cmd, 12, crc);
      Serial.print("while(argc--) ");
      Serial.println(argc);

Here is a dump of 2 frames:

    Entramos en ESP.process - while - 38 - 7E
    Entramos en ESP.process - while - 0 - A
    Entramos en ESP.process - while - 1 - 0
    Entramos en ESP.process - while - 2 - D
    Entramos en ESP.process - while - 3 - 9
    Entramos en ESP.process - while - 4 - 0
    Entramos en ESP.process - while - 5 - 0
    Entramos en ESP.process - while - 6 - 0
    Entramos en ESP.process - while - 7 - 0
    Entramos en ESP.process - while - 8 - 0
    Entramos en ESP.process - while - 9 - 0
    Entramos en ESP.process - while - 10 - 2
    Entramos en ESP.process - while - 11 - 0
    Entramos en ESP.process - while - 12 - C
    Entramos en ESP.process - while - 13 - 0
    Entramos en ESP.process - while - 14 - 77 - w
    Entramos en ESP.process - while - 15 - 73 - s
    Entramos en ESP.process - while - 16 - 6F - o
    Entramos en ESP.process - while - 17 - 2F
    Entramos en ESP.process - while - 18 - 68
    Entramos en ESP.process - while - 19 - 75
    Entramos en ESP.process - while - 20 - 6D
    Entramos en ESP.process - while - 21 - 69
    Entramos en ESP.process - while - 22 - 64
    Entramos en ESP.process - while - 23 - 69
    Entramos en ESP.process - while - 24 - 74
    Entramos en ESP.process - while - 25 - 79
    Entramos en ESP.process - while - 26 - 8
    Entramos en ESP.process - while - 27 - 0
    Entramos en ESP.process - while - 28 - 33
    Entramos en ESP.process - while - 29 - 35
    Entramos en ESP.process - while - 30 - 2E
    Entramos en ESP.process - while - 31 - 39
    Entramos en ESP.process - while - 32 - 38
    Entramos en ESP.process - while - 33 - 0
    Entramos en ESP.process - while - 34 - 0
    Entramos en ESP.process - while - 35 - 0
    Entramos en ESP.process - while - 36 - D0
    Entramos en ESP.process - while - 37 - 4E
    Entramos en ESP.process - while - 38 - 7F
    Entramos en protoCompletedCb
    while(argc--) 2
    Received: topic=wso/humidity data=35.98

This frame is OK (it is a subscription from the MQTT broker).

    Entramos en ESP.process - while - 38 - 7E
    Entramos en ESP.process - while - 0 - A
    Entramos en ESP.process - while - 1 - 0
    Entramos en ESP.process - while - 2 - D
    Entramos en ESP.process - while - 3 - 0
    Entramos en ESP.process - while - 4 - 2
    Entramos en ESP.process - while - 5 - 0
    Entramos en ESP.process - while - 6 - C
    Entramos en ESP.process - while - 7 - 0
    Entramos en ESP.process - while - 8 - 77 - w
    Entramos en ESP.process - while - 9 - 73 - s
    Entramos en ESP.process - while - 10 - 6F - o
    Entramos en ESP.process - while - 11 - 2F
    Entramos en ESP.process - while - 12 - 65
    Entramos en ESP.process - while - 13 - 78
    Entramos en ESP.process - while - 14 - 74
    Entramos en ESP.process - while - 15 - 74
    Entramos en ESP.process - while - 16 - 65
    Entramos en ESP.process - while - 17 - 6D
    Entramos en ESP.process - while - 18 - 70
    Entramos en ESP.process - while - 19 - 0
    Entramos en ESP.process - while - 20 - 8
    Entramos en ESP.process - while - 21 - 0
    Entramos en ESP.process - while - 22 - 31
    Entramos en ESP.process - while - 23 - 33
    Entramos en ESP.process - while - 24 - 2E
    Entramos en ESP.process - while - 25 - 31
    Entramos en ESP.process - while - 26 - 34
    Entramos en ESP.process - while - 27 - 0
    Entramos en ESP.process - while - 28 - 0
    Entramos en ESP.process - while - 29 - 0
    Entramos en ESP.process - while - 30 - AA
    Entramos en ESP.process - while - 31 - CA
    Entramos en ESP.process - while - 32 - 7F

This frame is NOT OK (some of the data at the beginning is missing), and the argc value is above 15000! This is why the Mega gets "stuck" for a while and then continues.

The loop in protoCompletedCb:

    while(argc--){
      len = *((uint16_t*)data_ptr);
      crc = crc16_data(data_ptr, 2, crc);
      data_ptr += 2;
      while(len --){
        crc = crc16_data(data_ptr, 1, crc);
        data_ptr ++;
      }
    }

takes forever doing nothing good.

I have updated the espduino.cpp code as follows:

    void ESP::protoCompletedCb(void) {
      Serial.println("Entramos en protoCompletedCb");
      PACKET_CMD *cmd = (PACKET_CMD*)_proto.buf;
      uint16_t crc = 0, argc, len, resp_crc;
      uint8_t *data_ptr;
      argc = cmd->argc;
      data_ptr = (uint8_t*)&cmd->args;
      crc = crc16_data((uint8_t*)&cmd->cmd, 12, crc);
      Serial.print("while(argc--) ");
      Serial.println(argc);
      // check if frame is corrupt and argc does not make sense
      if (argc > 5) {
        INFO("ARDUINO: Invalid frame");
        return;
      }
      while(argc--){
        len = *((uint16_t*)data_ptr);
        crc = crc16_data(data_ptr, 2, crc);
        data_ptr += 2;
        while(len --){
          crc = crc16_data(data_ptr, 1, crc);
          data_ptr ++;
        }
      }
      resp_crc = *(uint16_t*)data_ptr;
      if(crc != resp_crc) {
        INFO("ARDUINO: Invalid CRC");
        return;
      }

No more hangs for the time being ;-). If you create MQTT callbacks with more than 5 params, update the patch as required.

Dear tuan, I do not consider myself good enough to propose this as THE fix, but it may be useful for somebody hitting the same issue as me.

Neil, thanks again for your kindness. Regards, Carlos
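
As a possible alternative to the fixed `argc > 5` limit, the argument list could be checked against the number of bytes actually received, so that the limit does not depend on how many parameters a callback uses. The following is only a sketch under assumptions read off the dump in this thread: a 12-byte header with `argc` at offset 10, little-endian 16-bit lengths, and a 2-byte CRC at the end of the frame. `frameLayoutIsValid` and `readLe16` are hypothetical helpers, not part of espduino, and the CRC itself still has to be verified afterwards.

```cpp
#include <stddef.h>
#include <stdint.h>

// Assumed layout (from the dump): cmd:2, callback:4, return:4, argc:2,
// then argc x (len:2 + data), then crc:2.
static const size_t HEADER_LEN = 12;
static const size_t CRC_LEN = 2;

// Read a little-endian 16-bit value without relying on alignment.
static uint16_t readLe16(const uint8_t *p) {
  return (uint16_t)(p[0] | (p[1] << 8));
}

// Returns true only if every argument announced by the header fits
// inside the dataLen bytes received and exactly the CRC is left over.
bool frameLayoutIsValid(const uint8_t *buf, size_t dataLen) {
  if (dataLen < HEADER_LEN + CRC_LEN) return false;
  uint16_t argc = readLe16(buf + 10);
  size_t offset = HEADER_LEN;
  while (argc--) {
    // Need room for this argument's length field plus the trailing CRC.
    if (dataLen - offset < 2 + CRC_LEN) return false;
    uint16_t len = readLe16(buf + offset);
    offset += 2;
    // A corrupt length that points past the received data is rejected
    // here, before any byte-by-byte walk starts.
    if (dataLen - offset < (size_t)len + CRC_LEN) return false;
    offset += len;
  }
  return dataLen - offset == CRC_LEN;
}
```

With the two frames from the dump, the 38-byte frame passes, while the 32-byte frame is rejected on the first argument because its length field decodes to a value far larger than the remaining data. The call would go at the top of `protoCompletedCb`, with `_proto.buf` and `_proto.dataLen` as arguments.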

Namphibian commented 8 years ago

@Carlos @tuanpmt I have been trying to extend the commands to include access to some of the WIFI API in the SDK. This piece of code is proving to be problematic in a lot of ways. I am doing some work on it over the next few weeks so will get back with some findings but there seems to be some safety checking missing here which will greatly improve the usability of this software.

ccorderod commented 8 years ago

Thank you, Neil. Let me know if I can be of any help. I will be on holiday in a few days and will have time to invest in it. Regards, Carlos

bobcroft commented 8 years ago

Hi Carlos, Neil. I just saw this thread and I am having similar problems. Carlos, has your patch proved successful over a longer period? I am a bit confused by the reference to callback parameters: that doesn't mean subscribing to more than 5 topics, does it? Thanks, I look forward to your feedback. Bob

CosminLazar commented 7 years ago

@ccorderod reviving an old thread, but it's never too late to say "thank you" - I was experiencing the same problems as you, and your fix seems to have dealt with them.

ccorderod commented 7 years ago

@CosminLazar Happy to help :-). Please note espduino has been discontinued by tuan. I am now rewriting my code with espwifi+pubsub libraries, both on github. @bobcroft apologies, I did miss your msg. Same comment as for Cosmin. Have a great day. Rgds.

bobcroft commented 7 years ago

ccorderod, which specific libraries on GitHub are you using/writing, please?

ccorderod commented 7 years ago

Hi @bobcroft.

https://github.com/bportaluri/WiFiEsp. You need to flash the ESP with the AT firmware instead of the espduino firmware. http://pubsubclient.knolleary.net/

Use a Mega to develop and test; it will be much easier ;-).

Cheers.

ccorderod commented 7 years ago

@bobcroft Hi Bob. I'm afraid WiFiEsp + PubSubClient are extremely unreliable together. wifiesp is compatible enough to compile, but I have not found a way to make my setup reliable enough to be in production :-(. Now looking at alternatives; any hint will be appreciated.