milesburton / Arduino-Temperature-Control-Library

Arduino Temperature Library
https://www.milesburton.com/w/index.php/Dallas_Temperature_Control_Library
ds.getTempCByIndex() in ASYNC mode fails when interact with UDP stack #188

Closed pberna67 closed 2 years ago

pberna67 commented 3 years ago

Dear Developer

I found a major problem in your library when it is used together with the UDP stack. In the attached example, an ESP32 client (NINA W102) is connected to a local DS18B20 sensor and, over the WiFi network, to a UDP server (another ESP32 board with a NINA W102) to which DS18B20 sensors are connected. Every 10 seconds the local temperature sensor and a remote temperature sensor are read and stored in the Temperatures[] array. The temperatures are printed on the serial port.

The local temperature measurement is performed asynchronously by configuring the sensor with the function

ds.setWaitForConversion (0);

The local temperature is then read with the function

ds.getTempCByIndex ();

after a fixed delay of 2 seconds, because the function

ds.isConversionComplete ();

was observed to return immediately, without waiting the (maximum) 750 ms for the conversion to finish.

As you can see from the example, if the remote sensors are excluded, the local sensor responds and the local temperature is measured correctly. However, if the temperature of the remote sensors is also read, the local sensor is no longer read correctly: ds.getTempCByIndex () returns -127.
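For readers following along, the non-blocking pattern described above can be sketched off-target as shown here. This is only an illustration, not code from the attached program: `AsyncRead`, `conversionTimeMs` and `isValidReading` are hypothetical helper names. The wait times are the worst-case DS18B20 conversion times rounded up to whole milliseconds, and -127 is the value the library reports (as DEVICE_DISCONNECTED_C) when a device read fails.

```cpp
#include <cassert>
#include <cstdint>

// Sentinel reported by DallasTemperature when a device read fails.
const float kDisconnectedC = -127.0f;

// Worst-case DS18B20 conversion time in milliseconds for a 9..12 bit resolution.
inline uint32_t conversionTimeMs(uint8_t resolutionBits) {
    assert(resolutionBits >= 9 && resolutionBits <= 12);
    switch (resolutionBits) {
        case 9:  return 94;
        case 10: return 188;
        case 11: return 375;
        default: return 750;
    }
}

// Hypothetical helper: remembers when a conversion was requested and tells
// the main loop when it is safe to read the result.
struct AsyncRead {
    uint32_t startedAtMs;
    uint32_t waitMs;

    void start(uint32_t nowMs, uint8_t resolutionBits) {
        startedAtMs = nowMs;
        waitMs = conversionTimeMs(resolutionBits);
    }

    // Unsigned subtraction keeps this correct across a millis() rollover.
    bool ready(uint32_t nowMs) const {
        return static_cast<uint32_t>(nowMs - startedAtMs) >= waitMs;
    }
};

// A reading equal to the sentinel should not be stored as a temperature.
inline bool isValidReading(float celsius) {
    return celsius != kDisconnectedC;
}
```

In a sketch, `start()` would be called with millis() right after ds.requestTemperatures(), and ds.getTempCByIndex() only once `ready(millis())` is true. A -127 result would then be treated as a failed read instead of being stored in Temperatures[].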

To reproduce the problem, launch the program TemperatureMonitor7MQTTClient_crash_1 (a reduced version of the full program) by assigning to the variable

EnableRemoteSensor = true

The console reports the following result:

23:02:04.700 -> ⸮------- Multisensor temperature Acquisition System ----------
23:02:04.700 -> Starting ITimer0 OK, millis() = 30
23:02:04.700 -> DayTimerSec: 0
23:02:04.700 ->
23:02:04.700 -> Number of local temperature devices found...: 1
23:02:04.700 -> DS12B80 address: 282F5D07D6013C62
23:02:04.734 -> Connecting to WiFi network: Vodafone-27586605
23:02:04.870 -> Waiting for WIFI connection...
23:02:05.074 -> WiFi connected On event IP address: 192.168.10.52
23:02:05.074 -> WiFi connected On event IP address: 192.168.10.52
23:02:14.694 -> Start measurements of the temperatures set...
23:02:14.694 -> Local temperature conversion start time: 10030
23:02:14.730 -> Send a request to read remote temperature sensor
23:02:14.730 -> READT command sent from: 192.168.10.52, port 1234
23:02:15.309 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.37
23:02:16.706 -> Local temperature value: -127.00
23:02:16.706 -> Local temperature conversion stop time: 12032
23:02:16.706 -> Local software Timer time: 00:00:12
23:02:16.706 -> TEMP1: -127.00°C <------- Local Temperature (wrong result)
23:02:16.706 -> TEMP2: 19.37°C <------- Remote temperature OK
23:02:24.700 -> Start measurements of the temperatures set...
23:02:24.700 -> Local temperature conversion start time: 20030
23:02:24.700 -> Send a request to read remote temperature sensor
23:02:24.700 -> READT command sent from: 192.168.10.52, port 1234
23:02:25.214 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.37
23:02:26.707 -> Local temperature value: -127.00
23:02:26.707 -> Local temperature conversion stop time: 22032
23:02:26.707 -> Local software Timer time: 00:00:22
23:02:26.707 -> TEMP1: -127.00°C
23:02:26.707 -> TEMP2: 19.37°C

If you instead assign the value false:

EnableRemoteSensor = false

you will see:

23:07:06.571 -> ------- Multisensor temperature Acquisition System ----------
23:07:06.571 -> Starting ITimer0 OK, millis() = 30
23:07:06.571 -> DayTimerSec: 0
23:07:06.571 ->
23:07:06.571 -> Number of local temperature devices found...: 1
23:07:06.571 -> DS12B80 address: 282F5D07D6013C62
23:07:06.605 -> Connecting to WiFi network: Vodafone-27586605
23:07:06.707 -> Waiting for WIFI connection...
23:07:06.944 -> WiFi connected On event IP address: 192.168.10.52
23:07:06.944 -> WiFi connected On event IP address: 192.168.10.52
23:07:16.554 -> Start measurements of the temperatures set...
23:07:16.554 -> Local temperature conversion start time: 10030
23:07:18.605 -> Local temperature value: 20.50 <------- Local Temperature (right result)
23:07:18.605 -> Local temperature conversion stop time: 12058
23:07:26.554 -> Start measurements of the temperatures set...
23:07:26.554 -> Local temperature conversion start time: 20030
23:07:28.590 -> Local temperature value: 26.37 <------- Local Temperature (right result)
23:07:28.590 -> Local temperature conversion stop time: 22058

If you don't have a second board on which to install the remote temperature server, it doesn't matter: you will see a similar result.

More detail: ESP32 board package version 1.0.5-rc4, DallasTemperature lib version 3.9.0.

TemperatureServer.zip TemperatureMonitor7MQTTClient_crash_1.zip

RobTillaart commented 3 years ago

Only had a quick glance at the code and do not pretend to understand all of it. Furthermore I do not have the hardware and time to reconstruct the problem.

That said, there is a loop where you request all temperatures, and I recall that ESPs can have problems with WiFi (among other things) when code blocks for "too long". The solution is to put a call to yield() wherever blocking code is expected. yield() makes sure that waiting processes in the background get the attention they need.

Around line 152 could be a candidate to insert a call to yield(). Can you give it a try?

Possibly there are more similar loops to add yield() to:

T_MT_STATES isTemperatureMeasurementState(void)
{     
  bool completed = true;      
  float t;

    if (LocalMeasureState == MEASURING)
    {            
      // completed = ds.isConversionComplete();      

      if (TIMEOUT_EVENT(2))      
      { 
        for (int i = 0; i<dsDeviceCount; i++)
        {
          yield();  // <<<<<<<<<<<<<<<<<<<<<<
          t = ds.getTempCByIndex(i);    // bug ! it doesn't return the measure
          //t = ds.getTempCByIndex(i); // if read 2 time then work, but if the program runs for hours the return value is frozen,
                                        // to a specific value that doesn't change if the temperature change.
          Temperatures[i] = t;
        }  

        Serial.print("Local temperature value: ");  
        Serial.println(t);
        Serial.print("Local temperature conversion stop time: ");  
        Serial.println(millis());

        LocalMeasureState = COMPLETED;                

      } // if completed
    } // if MeasuringTemperature

    return(LocalMeasureState);
}
pberna67 commented 3 years ago

Dear Rob

I tried inserting the yield() statement at the line you indicated, but there is no improvement. For your information, there are no blocking points or long loops in my program.

I took your suspicions into account and scheduled the reading of the remote sensor (only 1) and the reading of the local sensor (only 1) in two disjoint time windows. In this way, the local temperature sensor is read "exclusively", with no other activity scheduled at the same time. Even with this change there is no benefit.

Before the modification, the program worked as follows:

1) The readings of the local and remote temperature sensors are requested at the same time by invoking the functions

  TemperatureMeasurement_start ();
  RemoteSensorsGroupTemperature_start ();

see section of the program in the main loop:

if (SamplingEvent)
    {
      SamplingEvent = false;
      Serial.println ("Start measurements of the temperatures set ...");
      Serial.print ("Recording status:");
      Serial.println (RecordingState? "Recording": "Stop");
      TemperatureMeasurement_start ();
      RemoteSensorsGroupTemperature_start ();
    }

2) The program waits for the readings of the local and remote sensors to complete (if one of the sensors does not respond, a timeout is triggered). The following functions

isTemperatureMeasurementState ()
isRemoteSensorGroupTemperatureState ()

always return the COMPLETED status, but the temperature is set to -127 °C.

The new program, TemperatureMonitor7MQTTClient_crash_1_2 (attached), instead works as follows:

1) The reading of the local temperature sensor only is started

TemperatureMeasurement_start ();

which invokes `ds.requestTemperatures ();`

2) Then the following function is called periodically (in the main loop)

isTemperatureMeasurementState ()

which returns the COMPLETED state after 2 seconds and then invokes the function

t = ds.getTempCByIndex (i);

to read the temperature from the local sensor, only at the end of the local sensor reading

3) Then the function RemoteSensorsGroupTemperature_start () is called

to read the remote sensor

4) Wait for the response from the UDP server by invoking the function

isRemoteSensorGroupTemperatureState ()

which returns COMPLETED if the remote server replies, or after the two communication attempts have failed (in this case the temperature is set to -127).

Log result (with yield();)

20:41:18.970 -> Number of local temperature devices found...: 1
20:41:19.004 -> DS12B80 address: 282F5D07D6013C62
20:41:19.004 -> Connecting to WiFi network: Vodafone-27586605
20:41:19.106 -> Waiting for WIFI connection...
20:41:19.345 -> WiFi connected On event IP address: 192.168.10.52
20:41:19.345 -> WiFi connected On event IP address: 192.168.10.52
20:41:28.977 -> Start measurements of the temperatures set...
20:41:28.977 -> Local temperature conversion start time: 10030
20:41:30.993 -> Local temperature value: 20.94 <--- first time the local T is OK
20:41:30.993 -> Local temperature conversion stop time: 12058
20:41:30.993 -> Send a request to read remote temperature sensor
20:41:30.993 -> READT command sent from: 192.168.10.52, port 1234
20:41:31.607 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.37
20:41:31.607 -> Local software Timer time: 00:00:12
20:41:31.607 -> TEMP1: 20.94°C <--- first time the local T is OK
20:41:31.607 -> TEMP2: 19.37°C <-- Remote temperature is always OK
20:41:38.967 -> Start measurements of the temperatures set...
20:41:38.967 -> Local temperature conversion start time: 20030
20:41:40.976 -> Local temperature value: -127.00 <--- from now on local T is NOK
20:41:40.976 -> Local temperature conversion stop time: 22032
20:41:40.976 -> Send a request to read remote temperature sensor
20:41:40.976 -> READT command sent from: 192.168.10.52, port 1234
20:41:41.148 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.37
20:41:41.148 -> Local software Timer time: 00:00:22
20:41:41.148 -> TEMP1: -127.00°C <--- from now on local T is NOK
20:41:41.148 -> TEMP2: 19.37°C <-- Remote temperature is always OK
....

If you don't have an ESP32 board, I can send you 1 or 2 NINA boards for free. If you don't have time, I can support you in the analysis and in testing. If you need to understand the program quickly, we can have a Skype meeting. I can further reduce the complexity of the program, because I've noticed that the problem can be triggered with only 1 remote and only 1 local sensor.

The UDP protocol is very simple. I send a READT string to a specific port on the remote server, and the server answers with the last temperature acquired. If the UDP server doesn't answer within REMOTE_SENSOR_TIMEOUT_MS, the READT request is sent again, up to TX_RETRAY_N times. If no answer arrives, Temperatures[] is set to -127°C; otherwise it holds the temperature reported by the server.
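The request/timeout/retry logic just described can be sketched independently of the network stack. This is an assumption-laden illustration, not the code from the attached zip: `Transport` and `readRemoteTemperature` are hypothetical names, and the transport is injected as a callback so the logic can be exercised on a PC without any UDP socket.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <string>

// Value stored when no valid answer was obtained (same sentinel as in the thread).
const float kNoReadingC = -127.0f;

// Hypothetical transport: sends `command`, waits up to `timeoutMs` for an answer.
// Returns true and fills `reply` if an answer arrived in time.
typedef std::function<bool(const std::string& command,
                           unsigned timeoutMs,
                           std::string& reply)> Transport;

// Sends READT up to `maxAttempts` times and parses the first usable reply.
float readRemoteTemperature(const Transport& sendAndWait,
                            unsigned timeoutMs,
                            unsigned maxAttempts) {
    assert(maxAttempts > 0);
    for (unsigned attempt = 0; attempt < maxAttempts; ++attempt) {
        std::string reply;
        if (!sendAndWait("READT", timeoutMs, reply)) {
            continue;  // timed out, try again
        }
        char* end = 0;
        const float value = std::strtof(reply.c_str(), &end);
        if (end != reply.c_str()) {
            return value;  // at least one character was parsed as a number
        }
    }
    return kNoReadingC;  // every attempt timed out or returned garbage
}
```

Keeping the retry loop free of socket calls makes it easy to check that a lost first packet is retried and that a silent server ends in -127 rather than in a stale value.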

A) I would like to know how you manage the DS 1-Wire protocol timings. Do you use software delays or a hardware timer?

B) I use a timer interrupt routine to manage events and software timers, but it is a very simple and short routine. Can this disturb your software loops?

void IRAM_ATTR TimerHandler0(void)
{
    DayTimerSec++;

    if (!SamplingEvent)
      SamplingEvent = (DayTimerSec % TEMPERATURE_SAMPLING_TIME_S) == 0;

    if (SwTimer1 > 0) SwTimer1--;
    if (SwTimer2 > 0) SwTimer2--;
} 

C) Do you think this can create problems for your library?

TemperatureMonitor7MQTTClient_crash_1_2.zip


I have reduced the complexity of the program. In the further example TemperatureMonitor7MQTTClient_crash_2.zip the interrupt routine is removed and the remote sensor management is reduced to 1 sensor. Again, if you set EnableRemoteSensor to "true", the local sensor reading is wrong (now it returns a fixed value even if the temperature changes).

TemperatureMonitor7MQTTClient_crash_2.zip

RobTillaart commented 3 years ago

1) I'm short on time. 2) This is not my library, although I am a big fan of it and a contributor to it.

A) Timing is done by the OneWire library of Paul Stoffregen. Timing in that library is done with delays.

B) It does not look 100% OK. Are the variables declared volatile?

void IRAM_ATTR TimerHandler0(void)
{
    DayTimerSec++;   // <<< no test on end condition / if it is still valid / overflow

    if (!SamplingEvent)
    {
      SamplingEvent = (DayTimerSec % TEMPERATURE_SAMPLING_TIME_S) == 0;
      // <<< if there is a sampling event it misses a % => 0 moment
    }

    if (SwTimer1 > 0) SwTimer1--;
    else SwTimer1 = 0;  // <<< to force state in case it is negative (or is it an unsigned ?)
    if (SwTimer2 > 0) SwTimer2--;
    else SwTimer2 = 0;  // <<< idem
}

==> code below not tested , just a thought...

void IRAM_ATTR TimerHandler0(void)
{
  static uint32_t DTC = TEMPERATURE_SAMPLING_TIME_S;
  if (DTC != 0) 
  {
    DTC--;
  }
  else 
  {
    if (!SamplingEvent) 
    {
      SamplingEvent = true;
      DTC = TEMPERATURE_SAMPLING_TIME_S;
    }
  }

  if (SwTimer1 > 0) SwTimer1--;
  else SwTimer1 = 0;
  if (SwTimer2 > 0) SwTimer2--;
  else SwTimer2 = 0;
} 
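Since the handler above is explicitly untested, here is a variant of the same countdown idea that can be exercised on a PC. It is a sketch under assumptions, not a drop-in replacement: `SamplingTimer`, `onTick` and `consumeSamplingEvent` are hypothetical names, and the counter is reloaded on the tick that raises the event so that events stay exactly `periodTicks` apart. The flag is declared volatile, as asked about in B).

```cpp
#include <cassert>
#include <cstdint>

struct SamplingTimer {
    uint32_t periodTicks;         // ticks between two sampling events
    uint32_t remaining;           // ticks left until the next event
    volatile bool samplingEvent;  // set by the tick handler, cleared by the main loop
};

// Body of the once-per-second tick handler.
inline void onTick(SamplingTimer& t) {
    assert(t.periodTicks > 0);
    if (t.remaining > 1) {
        --t.remaining;
        return;
    }
    t.remaining = t.periodTicks;  // reload on the firing tick, so the period never stretches
    t.samplingEvent = true;       // stays set until the main loop consumes it
}

// Main-loop side: returns true once per event and clears the flag.
inline bool consumeSamplingEvent(SamplingTimer& t) {
    if (!t.samplingEvent) {
        return false;
    }
    t.samplingEvent = false;
    return true;
}
```

With `periodTicks` set to 10 and one tick per second, the event is raised on ticks 10, 20, 30 and so on. If the main loop has not yet consumed an event when the next one is due, the two simply merge into one, which matches the "no event queue" behaviour of the original code.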

C) I do not see how this code could interfere; however, the OneWire code is time-critical and should not be interrupted for too long, I guess.

Some questions about the hardware:
Q1: do the sensors have appropriate pull up resistors?
Q2: are there long wires involved?
Q3: do the sensors work when tested standalone?

pberna67 commented 3 years ago

Dear Rob

In the latest program reduction, TemperatureMonitor7MQTTClient_crash_2.zip, I have eliminated the interrupt service routine to simplify things further. The problem is still present, but it appears in a different form: the value read from the local temperature sensor is frozen at a certain value even if the temperature changes.

I moved the timing management into the main loop:

  if ((currentMillis-lastmils) > 1000)
    {
      lastmils = currentMillis;
      DayTimerSec++;

// if (!SamplingEvent) is not wrong. A new event can be handled only once the old one has been consumed in the main loop.
// There is no event queue here. Because the event happens every 10 seconds I'm sure it is handled correctly;
// in case it weren't, losing it would be acceptable.

      if (!SamplingEvent)    
        SamplingEvent = (DayTimerSec % TEMPERATURE_SAMPLING_TIME_S) == 0;

      if (SwTimer1 > 0) SwTimer1--;
      if (SwTimer2 > 0) SwTimer2--;
    }
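A side note on the tick code above: with `> 1000` and `lastmils = currentMillis`, each "second" lasts at least 1001 ms and any lateness of the loop is carried forward. The conversion start times 10010, 20020 and 30030 in the log that follows may be a symptom of this, although that is an assumption and it is probably unrelated to the sensor problem itself. A minimal drift-free variant, using the hypothetical names `SecondTicker` and `update`, could look like this:

```cpp
#include <cstdint>

struct SecondTicker {
    uint32_t lastTickMs;  // timestamp of the last whole-second boundary handled
    uint32_t seconds;     // whole seconds counted so far
};

// Call from the main loop with the current millis() value.
// Returns how many one-second ticks elapsed since the previous call.
inline uint32_t update(SecondTicker& t, uint32_t nowMs) {
    uint32_t ticks = 0;
    // Unsigned subtraction keeps the comparison valid across a millis() rollover.
    while (static_cast<uint32_t>(nowMs - t.lastTickMs) >= 1000u) {
        t.lastTickMs += 1000u;  // advance by exactly one period, not to nowMs
        ++t.seconds;
        ++ticks;
    }
    return ticks;
}
```

Advancing `lastTickMs` by a fixed 1000 ms means a late loop iteration delays one tick but does not shift all the following ones.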

16:48:39.255 -> Number of local temperature devices found...: 1
16:48:39.255 -> DS12B80 address: 282F5D07D6013C62
16:48:39.255 -> Connecting to WiFi network: Vodafone-27586605
16:48:39.394 -> Waiting for WIFI connection...
16:48:39.602 -> WiFi connected On event IP address: 192.168.10.52
16:48:39.602 -> WiFi connected On event IP address: 192.168.10.52
16:48:49.196 -> Start measurements of the temperatures set...
16:48:49.196 -> Local temperature conversion start time: 10010
16:48:51.208 -> Local temperature value: -127.00 <--- wrong value read from the local sensor
16:48:51.208 -> Local temperature conversion stop time: 12013
16:48:51.208 -> Send a request to read remote temperature sensor
16:48:51.208 -> READT command sent from: 192.168.10.52, port 1234
16:48:51.277 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 18.50
16:48:51.277 -> TEMP1: -127.00°C <--- wrong value read from the local sensor
16:48:51.277 -> TEMP2: 18.50°C
16:48:59.226 -> Start measurements of the temperatures set...
16:48:59.226 -> Local temperature conversion start time: 20020
16:49:01.248 -> Local temperature value: 25.00
16:49:01.248 -> Local temperature conversion stop time: 22049
16:49:01.248 -> Send a request to read remote temperature sensor
16:49:01.248 -> READT command sent from: 192.168.10.52, port 1234
16:49:01.282 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 18.44
16:49:01.282 -> TEMP1: 25.00°C <--- wrong value frozen read from the local sensor
16:49:01.282 -> TEMP2: 18.44°C
16:49:09.218 -> Start measurements of the temperatures set...
16:49:09.218 -> Local temperature conversion start time: 30030
16:49:11.235 -> Local temperature value: 25.00 <--- wrong value frozen read from the local sensor
16:49:11.271 -> Local temperature conversion stop time: 32059
16:49:11.271 -> Send a request to read remote temperature sensor
16:49:11.271 -> READT command sent from: 192.168.10.52, port 1234
16:49:11.546 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 18.50
16:49:11.546 -> TEMP1: 25.00°C <--- wrong value frozen read from the local sensor
16:49:11.546 -> TEMP2: 18.50°C

Q1: do the sensors have appropriate pull up resistors? In the reduced test program I attached there is only one local sensor and only one remote sensor; each of them has a 4.3 kΩ pull-up.

Q2: are there long wires involved? No, there aren't; the wires are 20 cm long.

Q3: do the sensors work when tested stand-alone? Yes, they work. Please consider that the remote server board uses the same sensors and they work perfectly. Even my previous Arduino app version, with no remote sensor communication but with 6 local sensors, had no problems.

pberna67 commented 3 years ago

I've made an additional step in the analysis.

Just changing the order of the include files changes the result (the local temperature measurement) dramatically. With the DS includes first, then the WiFi includes:

include

include

include

include

the local temperature return value is frozen and the first return is wrong

TemperatureMonitor7MQTTClient_crash_4.zip

23:23:41.017 -> ⸮------- Multisensor temperature Acquisition System ----------
23:23:41.017 -> DayTimerSec: 0
23:23:41.017 ->
23:23:41.017 -> Number of local temperature devices found...: 1
23:23:41.017 -> DS12B80 address: 282F5D07D6013C62
23:23:41.017 -> Connecting to WiFi network: Vodafone-27586605
23:23:41.155 -> Waiting for WIFI connection...
23:23:41.423 -> WiFi connected On event IP address: 192.168.10.52
23:23:41.968 -> Software interrupt
23:23:42.956 -> Software interrupt
23:23:43.949 -> Software interrupt
23:23:44.965 -> Software interrupt
23:23:45.983 -> Software interrupt
23:23:46.968 -> Software interrupt
23:23:47.983 -> Software interrupt
23:23:48.965 -> Software interrupt
23:23:49.981 -> Software interrupt
23:23:50.967 -> Software interrupt
23:23:50.967 -> Start measurements of the temperatures set...
23:23:51.001 -> Local temperature conversion start time: 10010
23:23:51.989 -> Software interrupt
23:23:52.975 -> Software interrupt
23:23:52.975 -> Local temperature value: -127.00 <---First read of the local temperature wrong
23:23:52.975 -> Local temperature conversion stop time: 12013
23:23:52.975 -> Send a request to read remote temperature sensor
23:23:52.975 -> READT command sent from: 192.168.10.52, port 1234
23:23:53.315 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.37
23:23:53.315 -> TEMP1: -127.00°C <---First read of the local temperature wrong
23:23:53.315 -> TEMP2: 19.37°C
23:23:53.965 -> Software interrupt
23:23:54.984 -> Software interrupt
23:23:55.965 -> Software interrupt
23:23:56.984 -> Software interrupt
23:23:57.969 -> Software interrupt
23:23:58.991 -> Software interrupt
23:23:59.981 -> Software interrupt
23:24:01.000 -> Software interrupt
23:24:01.000 -> Start measurements of the temperatures set...
23:24:01.000 -> Local temperature conversion start time: 20020
23:24:01.985 -> Software interrupt
23:24:02.971 -> Software interrupt
23:24:03.005 -> Local temperature value: 24.50 <---All next read of the local temperature are frozen to 24.5 degree even if the temperature change
23:24:03.005 -> Local temperature conversion stop time: 22049
23:24:03.005 -> Send a request to read remote temperature sensor
23:24:03.038 -> READT command sent from: 192.168.10.52, port 1234
23:24:03.447 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.44
23:24:03.447 -> TEMP1: 24.50°C <---All next read of the local temperature are frozen to 24.5 degree even if the temperature change
23:24:03.447 -> TEMP2: 19.44°C

but if the order of the #include files is reversed (the WiFi includes first, then the DS includes)

include

include

include

include

TemperatureMonitor7MQTTClient_crash_3.zip

the local temperature is "almost right": the first read is wrong, then the following ones are right

23:42:25.591 -> ⸮------- Multisensor temperature Acquisition System ----------
23:42:25.591 -> DayTimerSec: 0
23:42:25.591 ->
23:42:25.591 -> Number of local temperature devices found...: 1
23:42:25.625 -> DS12B80 address: 282F5D07D6013C62
23:42:25.625 -> Connecting to WiFi network: Vodafone-27586605
23:42:25.761 -> Waiting for WIFI connection...
23:42:25.962 -> WiFi connected On event IP address: 192.168.10.52
23:42:26.567 -> Software interrupt
23:42:27.585 -> Software interrupt
23:42:28.570 -> Software interrupt
23:42:29.582 -> Software interrupt
23:42:30.561 -> Software interrupt
23:42:31.571 -> Software interrupt
23:42:32.552 -> Software interrupt
23:42:33.568 -> Software interrupt
23:42:34.590 -> Software interrupt
23:42:35.579 -> Software interrupt
23:42:35.579 -> Start measurements of the temperatures set...
23:42:35.579 -> Local temperature conversion start time: 10010
23:42:36.567 -> Software interrupt
23:42:37.590 -> Software interrupt
23:42:37.590 -> Local temperature value: -127.00 <---First read of the local temperature wrong
23:42:37.590 -> Local temperature conversion stop time: 12013
23:42:37.590 -> Send a request to read remote temperature sensor
23:42:37.590 -> READT command sent from: 192.168.10.52, port 1234
23:42:37.793 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.19
23:42:37.793 -> TEMP1: -127.00°C <---First read of the local temperature wrong
23:42:37.793 -> TEMP2: 19.19°C
23:42:38.575 -> Software interrupt
23:42:39.560 -> Software interrupt
23:42:40.578 -> Software interrupt
23:42:41.568 -> Software interrupt
23:42:42.591 -> Software interrupt
23:42:43.580 -> Software interrupt
23:42:44.596 -> Software interrupt
23:42:45.577 -> Software interrupt
23:42:45.577 -> Start measurements of the temperatures set...
23:42:45.577 -> Local temperature conversion start time: 20020
23:42:46.597 -> Software interrupt
23:42:47.581 -> Software interrupt
23:42:47.615 -> Local temperature value: 24.12 <---Following local temperature measurement are OK
23:42:47.615 -> Local temperature conversion stop time: 22049
23:42:47.615 -> Send a request to read remote temperature sensor
23:42:47.615 -> READT command sent from: 192.168.10.52, port 1234
23:42:47.818 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.19
23:42:47.818 -> TEMP1: 24.12°C <---Following local temperature measurement are OK
23:42:47.818 -> TEMP2: 19.19°C
23:42:48.564 -> Software interrupt
23:42:49.586 -> Software interrupt
23:42:50.608 -> Software interrupt
23:42:51.599 -> Software interrupt
23:42:52.584 -> Software interrupt
23:42:53.570 -> Software interrupt
23:42:54.593 -> Software interrupt
23:42:55.580 -> Software interrupt
23:42:55.615 -> Start measurements of the temperatures set...
23:42:55.615 -> Local temperature conversion start time: 30030
23:42:56.603 -> Software interrupt
23:42:57.593 -> Software interrupt
23:42:57.627 -> Local temperature value: 31.94 <---Following local temperature measurement are OK
23:42:57.627 -> Local temperature conversion stop time: 32059
23:42:57.627 -> Send a request to read remote temperature sensor
23:42:57.627 -> READT command sent from: 192.168.10.52, port 1234
23:42:57.662 -> Received temperature packet of size: 6 From 192.168.10.43, port 1234, contents is: 19.19
23:42:57.662 -> TEMP1: 31.94°C <---Following local temperature measurement are OK
23:42:57.662 -> TEMP2: 19.19°C

It seems clear now that there is an interaction problem between the UDP lib and the DallasTemperature lib.

RobTillaart commented 3 years ago

(I'm trying to follow this thread but do not have enough time to investigate.)

If the order of the include files changes the behavior, that could be a (guarded) redefinition problem. E.g. if one include defines a type as int16 if not defined, and the other as int32, that could cause trouble. However, I do not know of a scenario in which it could freeze the sensor.
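To make the guarded-redefinition idea concrete, here is a small self-contained illustration. The macro `SENSOR_RAW_T` and the two "headers" are purely hypothetical; nothing here is taken from the WiFi, OneWire or DallasTemperature sources, and it is not a claim that this is what happens in the sketch under discussion.

```cpp
#include <cstddef>
#include <cstdint>

// --- what a hypothetical "first_lib.h" might contain ---
#ifndef SENSOR_RAW_T
#define SENSOR_RAW_T int16_t
#endif

// --- what a hypothetical "second_lib.h" might contain ---
#ifndef SENSOR_RAW_T
#define SENSOR_RAW_T int32_t   // silently skipped: the name is already defined
#endif

// Code compiled after both headers sees whichever definition came first.
typedef SENSOR_RAW_T sensor_raw_t;

inline std::size_t rawValueSize() {
    return sizeof(sensor_raw_t);
}
```

Swapping the two guarded blocks would make `rawValueSize()` return 4 instead of 2 without any compiler warning, which is why include order can change behaviour when two libraries guard the same name.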

Have you tried to "guard" the actual sensor reads by disabling interrupts during the read?

pberna67 commented 3 years ago

I have inserted noInterrupts() / interrupts() before and after ds.requestTemperatures() and ds.getTempCByIndex(i), starting from the example TemperatureMonitor7MQTTClient_crash_4.zip.

No effect, the problem still remains

Any other idea :) ?

T_MT_STATES isTemperatureMeasurementState(void)
{     
  bool completed = true;      
  float t;

    if (LocalMeasureState == MEASURING)
    {            
      // completed = ds.isConversionComplete();      

      if (TIMEOUT_EVENT(2))      
      { 
        for (int i = 0; i<dsDeviceCount; i++)
        {        
          noInterrupts();
          t = ds.getTempCByIndex(i);    // bug ! it doesn't return the measure
          interrupts();
          //t = ds.getTempCByIndex(i);  // if read 2 time then work, but if the program runs for hours the return value is frozen,
                                        // to a specific value that doesn't change if the temperature change.
          Temperatures[i] = t;
        } 

.....

T_MT_STATES TemperatureMeasurement_start(void)
{            
    Serial.print("Local temperature conversion start time: ");
    Serial.println(millis());
    noInterrupts();
    ds.requestTemperatures();             // Send the command to start conversion of temperatures    
    interrupts();
    RELOAD_TIMER(2);
    LocalMeasureState = MEASURING;
    return(LocalMeasureState);
}
pberna67 commented 3 years ago

I've created the same sketch using TCP instead of UDP, and I've encountered the same problem. So the instability of the library appears anyway as the complexity of the program increases.

sebitnt commented 3 years ago

Hi! I had some similar issues. What finally helped was adding ICACHE_RAM_ATTR to the declarations of the DallasTemperature functions I use, and also to the OneWire functions called from them. Without this attribute there seem to be some timing issues on the ESP when the WiFi functions are used while the temperature conversion is in progress.

pberna67 commented 3 years ago

Hi sebitnt

Can you clarify where/how you placed the ICACHE_RAM_ATTR in the code ? Have you completely solved the problem ?

Regards Paolo

sebitnt commented 3 years ago

Can you clarify where/how you placed the ICACHE_RAM_ATTR in the code ?

Sure! But be aware that the following changes will only work if you are using the same calls as I am. In my case these were requestTemperatures() in synchronous mode and getTempC(address). If you are using other functions to get the temperatures, you should trace back exactly which functions (in DallasTemperature and in OneWire) are called from them.

In DallasTemperature.h:

-void blockTillConversionComplete(uint8_t);
+void ICACHE_RAM_ATTR blockTillConversionComplete(uint8_t);

-bool isConversionComplete(void);
+bool ICACHE_RAM_ATTR isConversionComplete(void);

-void requestTemperatures(void);
+void ICACHE_RAM_ATTR requestTemperatures(void);

In OneWire.h:

-uint8_t reset(void);
+uint8_t ICACHE_RAM_ATTR reset(void);

-void skip(void);
+void ICACHE_RAM_ATTR skip(void);

-void write(uint8_t v, uint8_t power = 0);
+void ICACHE_RAM_ATTR write(uint8_t v, uint8_t power = 0);

-uint8_t read(void);
+uint8_t ICACHE_RAM_ATTR read(void);

-void write_bit(uint8_t v);
+void ICACHE_RAM_ATTR write_bit(uint8_t v);

-uint8_t read_bit(void);
+uint8_t ICACHE_RAM_ATTR read_bit(void);

I had some issues with setWaitForConversion(true) while using requestTemperatures(): it often returned too early when data was being sent through the WiFi interface. After starting the ESP it went well for about 15 seconds, and after that the timing issues occurred. After applying the above changes, the problems are completely gone. Let us know if this was helpful for you too!
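For anyone unsure where the attribute goes, here is a minimal sketch of the declaration style used in the changes above. As far as I know, on the ESP Arduino cores the macro asks the toolchain to keep the function in instruction RAM rather than in flash. The fallback #define is an assumption added only so that the snippet also compiles on a PC, and `DemoBus` is a hypothetical stand-in, not the real OneWire class.

```cpp
#include <cstdint>

// Off-target fallback (assumption for this sketch): when not building with an
// ESP core, let the attribute expand to nothing.
#ifndef ICACHE_RAM_ATTR
#define ICACHE_RAM_ATTR
#endif

// Hypothetical stand-in for a bit-banged bus class.
class DemoBus {
public:
    explicit DemoBus(uint8_t value) : shift_(value) {}
    // The attribute sits between the return type and the function name,
    // in the same position as in the declarations listed above.
    uint8_t ICACHE_RAM_ATTR read_bit(void);
    uint8_t ICACHE_RAM_ATTR read(void);
private:
    uint8_t shift_;  // pretend line state, shifted out least significant bit first
};

uint8_t DemoBus::read_bit(void) {
    const uint8_t bit = shift_ & 0x01;
    shift_ >>= 1;
    return bit;
}

uint8_t DemoBus::read(void) {
    uint8_t result = 0;
    // Assemble one byte from eight single-bit reads, LSB first.
    for (uint8_t mask = 0x01; mask != 0; mask <<= 1) {
        if (read_bit()) {
            result |= mask;
        }
    }
    return result;
}
```

Note that `read()` calls `read_bit()`, which is why the advice above is to trace every function on the call path and mark each of them, not only the top-level one.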

pberna67 commented 2 years ago

Sorry for the delay in the feedback. After adding ICACHE_RAM_ATTR to the indicated functions, I'm seeing a very big improvement! It seems to work properly. Thanks.

Dear Developer, can you validate these changes and add them to the official release?

RobTillaart commented 2 years ago

The maintainer cannot add ICACHE_RAM_ATTR to the OneWire.h library, as that is an external library. I do not know whether adding it to the Dallas lib only will work. Personally, I have no time to test/verify this.