espressif / arduino-esp32

Arduino core for the ESP32
GNU Lesser General Public License v2.1
13.34k stars 7.37k forks source link

Problem with ds18b20 #1335

Closed smacyas closed 6 years ago

smacyas commented 6 years ago

After using the latest Arduino-esp32 there is a problem with ds18b20 (library - https://github.com/stickbreaker/OneWire and https://github.com/milesburton/Arduino-Temperature-Control-Library)

No presence device on the bus.

Has anyone encountered a problem?

everslick commented 6 years ago

I can confirm, that it does not work for me as well. Unfortunately, that is all I can say for now, because somebody else tested my code, and my sensor did not arrive yet. I guess it will take another week at least, until I can look into it.

First thing you could look at, is if there is data on the OneWire bus at all. Do you have a scope?

smacyas commented 6 years ago

Yes, i have scope and Saleae Logic Analyser. I see, that ROM command error. but device presense pulse is present.

smacyas commented 6 years ago

ds18b20

smacyas commented 6 years ago

image

smacyas commented 6 years ago

I changed timing in 2 func:

void OneWire::write_bit(uint8_t v) { IO_REG_TYPE mask IO_REG_MASK_ATTR = bitmask; volatile IO_REG_TYPE *reg IO_REG_BASE_ATTR = baseReg; if (v & 1) { noInterrupts(); DIRECT_WRITE_LOW(reg, mask); DIRECT_MODE_OUTPUT(reg, mask); delayMicroseconds(5); //10 DIRECT_WRITE_HIGH(reg, mask); interrupts(); delayMicroseconds(90); //55 } else { noInterrupts(); DIRECT_WRITE_LOW(reg, mask); DIRECT_MODE_OUTPUT(reg, mask); delayMicroseconds(90); //65 DIRECT_WRITE_HIGH(reg, mask); interrupts(); delayMicroseconds(5); } }

// // Read a bit. Port and bit is used to cut lookup time and provide // more certain timing. // uint8_t OneWire::read_bit(void) { IO_REG_TYPE mask IO_REG_MASK_ATTR = bitmask; volatile IO_REG_TYPE *reg IO_REG_BASE_ATTR = baseReg; uint8_t r;

noInterrupts(); DIRECT_MODE_OUTPUT(reg, mask); DIRECT_WRITE_LOW(reg, mask); delayMicroseconds(2); //3 DIRECT_MODE_INPUT(reg, mask); // let pin float, pull up will raise delayMicroseconds(8); //10 r = DIRECT_READ(reg, mask); interrupts(); delayMicroseconds(80); //53 return r; }

it is works for me

stickbreaker commented 6 years ago

@smacyas Glad that you have found a solution, I have not had time to chase down this problem.

Before anyone jumps on the interrupts(); noInterrupts()

// at start of OneWire.cpp
#ifdef ARDUINO_ARCH_ESP32
#define noInterrupts() {portMUX_TYPE mux = portMUX_INITIALIZER_UNLOCKED;portENTER_CRITICAL(&mux)
#define interrupts() portEXIT_CRITICAL(&mux);}
#endif

// at end of OneWire.cpp
#ifdef ARDUINO_ARCH_ESP32
#undef noInterrupts()
#undef interrupts()
#endif

to keep code compatibility with the Original OneWire, I had to create a nested local MUTEX and unbalanced braces.

Chuck.

everslick commented 6 years ago

I think I have the very same issue like @smacyas . even though I made the exact same changes to OneWire. screenshot_2018-04-28_20-55-56

@stickbreaker at a quick glance, do you see anything suspicious?

i guess i have to read up on onewire a little bit...

stickbreaker commented 6 years ago

This might be a problem with microSecond() The OneWire protocol defines a '1' as a low going pulse of between 1 to 15us and a '0' as a low going pulse of 60us. I submitted a fix to microSecond() consistency problems between multiple tasks. I had to use Thread local storage. It might take too long for the 5us '1' pulses.

@everslick I can't tell with your capture what the actual pulse widths are. Increase the resolution the '1's should be 5us and the '0' should be 60us or 12 times longer. It doesn't look correct. Might need to create a high resolution microSeconds() or use the current task's cycle count

portMUX_TYPE microsMux = portMUX_INITIALIZER_UNLOCKED;
static unsigned long lastMicros=0;
static unsigned long overFlowMicros = 0;

unsigned long fastMicros(){
  unsigned long ccount;

    portENTER_CRITICAL_ISR(&microsMux);
    __asm__ __volatile__ ( "rsr     %0, ccount" : "=a" (ccount) ); //get cycle count
    if(ccount < lastMicros) { // overflow occurred
        overFlowMicros += UINT32_MAX / CONFIG_ESP32_DEFAULT_CPU_FREQ_MHZ;
    }

    lastMicros = ccount;
    portEXIT_CRITICAL_ISR(&microsMux);
    return (overFlowMicros + ccount / CONFIG_ESP32_DEFAULT_CPU_FREQ_MHZ);
}

void fastDelayMicros(unsigned long us){
{
    uint32_t m = fastMicros();
    if(us){
        uint32_t e = (m + us);
        if(m > e){ //overflow
            while(fastMicros() > e){
                NOP();
            }
        }
        while(fastMicros() < e){
            NOP();
        }
    }
}

Or, something else to speed up microSeconds() My original microSeconds() Solution solved the consistency problem when microSeconds() was called between different task. Each task has a separate cyclecount. OneWire Does not care about absolute microseconds, just relative values for delaying. Maybe change microSeconds() so that you can choose either absolute (slow) or relative(fast). Fast uses static local variables for rollover checking.

How about two microSeconds(): The standard that uses thread local storage for rollover, the other has as var parameter.

portMUX_TYPE microsMux = portMUX_INITIALIZER_UNLOCKED;

unsigned long microSeconds(unsigned long * lastCount, unsigned long * overFlowMicros){

  if(!(lastCount && overFlowMicros)) return 0; // have to have storage

  unsigned long ccount;

    portENTER_CRITICAL_ISR(&microsMux);
    __asm__ __volatile__ ( "rsr     %0, ccount" : "=a" (ccount) ); //get cycle count
    if(ccount < *lastCount) { // overflow occurred
        *overFlowMicros += UINT32_MAX / CONFIG_ESP32_DEFAULT_CPU_FREQ_MHZ;
    }

    *lastCount = ccount;
    portEXIT_CRITICAL_ISR(&microsMux);
    return (*overFlowMicros + ccount / CONFIG_ESP32_DEFAULT_CPU_FREQ_MHZ);
}

void fastDelayMicros(unsigned long us){
{
  uint32_t lastCount=0;
  uint32_t overFlowMicros = 0;
    uint32_t m = microSeconds(&lastCount,&overFlowMicros);
    if(us){
        uint32_t e = (m + us);
        if(m > e){ //overflow
            while(microSeconds(&lastCount,&overFlowMicros) > e){
                NOP();
            }
        }
        while(microSeconds(&lastCount,&overFlowMicros) < e){
            NOP();
        }
    }
}

What do you think?

Chuck.

everslick commented 6 years ago

@stickbreaker Thank you very much for having taken the time to look into this and for your, as usual, great comment. I agree to your analysis of the problem.

For all further tests I reverted all delays to the original values in OneWire.

sigrok1

The 5th bit (blue cursor) should probably be 1 but because the low pulse get stretched (to 36us) due to jitter for a task switch, it becomes a 0.

I will test your different delay methods from you comment above now...

everslick commented 6 years ago

I added fastDelayMicros(unsigned long us) and microSeconds(unsigned long * lastCount, unsigned long * overFlowMicros) to OneWire.cpp plus a macro #define delayMicroseconds(_US_) fastDelayMicros(_US_)

but the capture looks identical:

sigrok2

What I do not really get is why locking the mutex is supposed to prevent a taskswitch. I have seen the same in other places in the Arduino Core, but it is not confirming to my understanding of mutexes. A mutex is protecting a shared resource from access by two different tasks/processes. Here we only have one task locking the mutex. Maybe locking a mutex is exploiting a platform specific side-effect of preventing a taskswitch here?

everslick commented 6 years ago

Reading your comment again, I think I misunderstood your solution. The mutex does indeed only protect the microSeconds counter from being messed up when called from different tasks. But to me it looks more like that the task that is currently bitbanging the onewire is prevented from running by a task with higher priority. But why would it always happen at this 5th bit? Does not make sense. I'm clearly missing something here.

stickbreaker commented 6 years ago

@everslick Try stripping out the Enter/Exit critical in fastMicros() all of the time sensitive delays are already inside a criticalSection.

It is my understanding the from Read the Docs/ Critical-Sections the Mutex is to fence out the other core, In this usage, we don't need to fence out the other core, just keep this task active. So, the Mutex is just baggage. portEnter_Critical() disables task switching on timer interrupts. My blodge of redefining interrupts/noInterrupts was to keep code compatibility. It might be better to optimize OneWire for the ESP32, but, I don't want to maintain it.

everslick commented 6 years ago

hmm, after some more thinking it dawned on me, it could be something else. I realized, that it is always the first bit that is 1 that suffers from the jitter. That means a code path taken the first time. Maybe a cache miss, that delays execution? So I put the write_bit() into IRAM...

void IRAM_ATTR OneWire::write_bit(uint8_t v)

sigrok3

I even removed your noInterrupt() and interrupt() macros.

stickbreaker commented 6 years ago

@everslick That looks good!
What are you using for delayMicros()? The origional code or that fastMicros()?

I totally forgot about the Cache delay issues, Good Catch!

If you expand your capture, what do the raw pulses look like? How close are they to the 10 and 60us delays?

Chuck.

everslick commented 6 years ago

The 1's are pretty stable at 12us (with delay(10)) with the original delayMicroseconds(). The 0's are 66us with delay(65).

So far no jitter.

stickbreaker commented 6 years ago

@everslick How complex is your test code? I am worried about removing the interrupts/noInterrupts macro in code that uses WiFi or other interrupt functions.

Does the OneWire code function correctly with just the one change, IRAM_ATTR on writeBit()?

Chuck.

everslick commented 6 years ago

I guess my test code is as complex as it gets. The firmware has 1280KB binary size. It runs data logging to SPIFFS/SDFS (dependent on if a SD card is present), MQTT over TLS, SPI (bitbanging) ADE7758 protocol with IRQ handling task, NTP, webserver, websockets, telnet, serial console, I2C (DS3231, EEPROM,...) ... but i'm not really stressing the OneWire bus, yet. I just read the temp sensor once in a while.

Yes, the only change I made to OneWire.cpp is removing the macros and putting write_bit() and read_bit() into IRAM.

Tomorrow I will try reading the temp like crazy and see if it breaks. ;-)

cnoork commented 6 years ago

I have also a problem with reading the ds18b20 on the esp32 while it the same sensor is working perfectly on a esp8266. It get many times wrong temperature but also correct ones. Below an example from the DallasTemperature/Simple (there is a 4k8 pull-up resistor). With another sketch I see that the data from the ds18b20 has one byte more when the temperature is not correct, is this the same issue with timing.

========== Requesting temperatures...DONE Temperature for the device 1 (index 0) is: 20.50 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: -127.00 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: 20.50 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: -127.00 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: -127.00 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: 20.50 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: -127.00 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: 20.56 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: 20.56 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: 20.56 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: -127.00 Requesting temperatures...DONE Temperature for the device 1 (index 0) is: -127.00

everslick commented 6 years ago

@cnoork have you added IRAM_ATTR to write_bit() and read_bit() in OneWire.cpp? do you use the stickbreaker enhanced OneWire library?

stickbreaker commented 6 years ago

@everslick just applied your IRAM_ATTR to my fork. I don't have a testbed running to test it, Could you verify I did not screw it up.

Chuck.

everslick commented 6 years ago

@stickbreaker I'm currently running your fork, and it looks good. I have sometimes wrong readings (-127.0) regardless of using your noInterrupt() macro or not. I'm still not sure if they improve stability. Anyway, your fork works! :+1:

cnoork commented 6 years ago

@stickbreaker I have placed the new library and it is working for me now, till now no strange values seen anymore. Thnx for the fast modification. I will now implement it in my "bigger" project.

everslick commented 6 years ago

I will close this issue for now. If people still have issues, feel free to reopen it.

OmarAlkassab commented 4 years ago

@everslick I’m facing same issue on ESP32 with latest core version and using @stickbreaker One Wire library, when running the Dallas Temperature library on Core 0 and using both Wire (for printing on LCD and controlling PCA9555 IO expander) and Wire1 (for reading DS1307 RTC). Many times I got the Value -127 (about 3 time for 10 readings) Is there any improvement of the library for ESP32? Many thanks in advance.

everslick commented 4 years ago

I don't think there have been any changes here. Have you tried to simplify your setup, so you can rule out any side effects from other code?

OmarAlkassab commented 4 years ago

I tries to assign the task that read the temperature to the core 1, and this slightly reduce the error rate. Is this logical?

everslick commented 4 years ago

I think each Wire instance should only be accessed from one RTOS task. Is that the case in your code?

OmarAlkassab commented 4 years ago

I created a task that reads every one second (I use delay) 7 sensors (4 analog sensors, 2 DS18B20 waterproof version each one on different pin, one DHT11 humidity and temperature sensor) all from the same task. I will post the code if there’s anything not clear.

stickbreaker commented 4 years ago

@OmarAlkassab The OneWire protocol is very strict with its timing. Depending on how many other higher priority processes are running you will see comm failures. @everslick and I have worked to provide a stable library that works for us. The problem is that OneWire protocol has 15us and 60us bit timing, and the complete transactions can exceed 10ms. If you have multiple tasks and interrupts(I2C, Serial, WiFi) it can mess them up. With the 240MHz processor clock, a 1ms period equates to 240,000 instructions. The FreeRtos environment is designed to use this processor speed to support multiple tasks. The only way to produce 100% successful OneWire would be to exclusively use a processor, perhaps a dedicated ULP task.

I would be interested in helping you enhance the library. I would need additional work on your part, specifically some logic analyzer captures of the OneWire channel. Captures of both working and failing OneWire transactions. Sparkfun sells a cheap 24MHz 8 channel USB Logic Analyzer, it uses PulseView software. You can also find them on Ebay, Amazon, or BangGood.

To effectively debug the problem you would need additional output pins on the ESP32 for synchronization, Probably 2 pins, one for bit timing and the other for transaction boundaries.

If you are interested lets take this converstation over to my OneWire Library Repo Stickbreaker/OneWire#3

Chuck.

deg026 commented 4 years ago

tnx guys! i have tried to solve my problems at least one week... until i have founded your fix! my ESP32 + dallas now working wine! big tnx to @stickbreaker

osenbruggen commented 3 years ago

Thank you very much for this! I had been struggling with inconsistency as well.

PragunJaswal commented 1 year ago

I changed timing in 2 func:

void OneWire::write_bit(uint8_t v) { IO_REG_TYPE mask IO_REG_MASK_ATTR = bitmask; volatile IO_REG_TYPE *reg IO_REG_BASE_ATTR = baseReg; if (v & 1) { noInterrupts(); DIRECT_WRITE_LOW(reg, mask); DIRECT_MODE_OUTPUT(reg, mask); delayMicroseconds(5); //10 DIRECT_WRITE_HIGH(reg, mask); interrupts(); delayMicroseconds(90); //55 } else { noInterrupts(); DIRECT_WRITE_LOW(reg, mask); DIRECT_MODE_OUTPUT(reg, mask); delayMicroseconds(90); //65 DIRECT_WRITE_HIGH(reg, mask); interrupts(); delayMicroseconds(5); } }

// // Read a bit. Port and bit is used to cut lookup time and provide // more certain timing. // uint8_t OneWire::read_bit(void) { IO_REG_TYPE mask IO_REG_MASK_ATTR = bitmask; volatile IO_REG_TYPE *reg IO_REG_BASE_ATTR = baseReg; uint8_t r;

noInterrupts(); DIRECT_MODE_OUTPUT(reg, mask); DIRECT_WRITE_LOW(reg, mask); delayMicroseconds(2); //3 DIRECT_MODE_INPUT(reg, mask); // let pin float, pull up will raise delayMicroseconds(8); //10 r = DIRECT_READ(reg, mask); interrupts(); delayMicroseconds(80); //53 return r; }

it is works for me

it works for me also with dallas lib. :)