RobTillaart / DHTNew

Arduino library for DHT11 and DHT22 with automatic sensor recognition
MIT License

error reading am2302 over longer wire #29

Closed Jurand2020 closed 4 years ago

Jurand2020 commented 4 years ago

When using a longer wire for the AM2302 (about 1-5 m), there is always an error on read. I checked with an oscilloscope and the signal looks fine. The Grove_Temperature_And_Humidity_Sensor library had no problem with the same hardware setup. I used an ESP32 DevKit V1.

RobTillaart commented 4 years ago

Grove uses the Adafruit library if I'm right. I need to compare what they do differently to find a root cause, and I do not have time to dive into that right now. But some questions to get the analysis started. From memory, I recall that Adafruit measures the timing of the pulses differently than DHTNEW does.

Thanks,

Jurand2020 commented 4 years ago

Over short wire (10cm) all works perfect (my app / examples). On 1m wire in my app first read works, then no-go. Test result as follow:

ets Jun 8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:8896
load:0x40080400,len:5816
entry 0x400806ac
dhtnew_test.ino
LIBRARY VERSION: 0.3.2

  1. Type detection test, first run might take longer to determine type
     STAT HUMI TEMP TIME TYPE
     OK, 48.7, 26.8, 5417, 22
     OK, 48.7, 26.8, 2, 22
     OK, 48.7, 26.8, 2, 22
     OK, 48.7, 26.8, 2, 22

  2. Humidity offset test
     STAT HUMI TEMP TIME TYPE
     Bit shift error, -999.0, -999.0, 5337, 22
     OK, -999.0, -999.0, 2, 22
     OK, -999.0, -999.0, 2, 22
     OK, -999.0, -999.0, 2, 22

  3. Temperature offset test
     STAT HUMI TEMP TIME TYPE
     Bit shift error, -999.0, -999.0, 5238, 22
     OK, -999.0, -999.0, 2, 22
     OK, -999.0, -999.0, 2, 22
     OK, -999.0, -999.0, 2, 22

  4. LastRead test -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0

Done...


Tried with Threshold 40 and 65 - result similar, e.g. for 65:

ets Jun 8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:8896
load:0x40080400,len:5816
entry 0x400806ac
dhtnew_test.ino
LIBRARY VERSION: 0.3.2

  1. Type detection test, first run might take longer to determine type
     STAT HUMI TEMP TIME TYPE
     OK, 47.2, 27.4, 5274, 22
     OK, 47.2, 27.4, 3, 22
     OK, 47.2, 27.4, 3, 22
     OK, 47.2, 27.4, 3, 22

  2. Humidity offset test
     STAT HUMI TEMP TIME TYPE
     Bit shift error, -999.0, -999.0, 5240, 22
     OK, -999.0, -999.0, 3, 22
     OK, -999.0, -999.0, 2, 22
     OK, -999.0, -999.0, 3, 22

  3. Temperature offset test
     STAT HUMI TEMP TIME TYPE
     Bit shift error, -999.0, -999.0, 5190, 22
     OK, -999.0, -999.0, 3, 22
     OK, -999.0, -999.0, 3, 22
     OK, -999.0, -999.0, 3, 22

  4. LastRead test -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 actual read -999.0, -999.0 -999.0, -999.0 -999.0, -999.0 -999.0, -999.0

Done...

RobTillaart commented 4 years ago

(still little time, I'll try to spend some every day)

The bit shift error was investigated in https://github.com/RobTillaart/DHTNew/issues/11. I had hoped it was solved, as the problem did not occur anymore.


> Over short wire (10cm) all works perfect (my app / examples). On 1m wire in my app first read works, then no-go. Test result as follow:

So when the line is short it works perfectly; when the line is longer it starts to fail. All that changes is the capacitance of the wire (which increases linearly with the length). I assume the connections themselves are OK, as the Adafruit lib does work.

What pull up are you using? (10K ?) Can you try a smaller one e.g. 4K7, 2K2 or even 1K?


test 2: can you add this line in setup() of the test sketch to see the values you get....

DHT.setSuppressError(true);
Jurand2020 commented 4 years ago

The sensor has built-in 4.7k pull up. If it is of any help - this is part of the transmission (near end) 20200814_011416

Jurand2020 commented 4 years ago

here in excel, one row=2us

SGNAL002.zip

Mr-HaleYa commented 4 years ago

If you are using the one from the Adafruit website, they say

"There is a 5.1K resistor inside the sensor connecting VCC and DATA so you do not need any additional pullup resistors"

I have no idea where you got yours, because you posted no pictures and did not say where it came from.

They also say

"DHT22 and AM2302 often have a pullup already inside, but it doesn't hurt to add another one!"

RobTillaart commented 4 years ago

Quick feedback
At first sight the signal looks OK; I will check the Excel file this evening if time permits.

Adding an extra pull up resistor will lower the total R which would give sharper edges.
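To put rough numbers on the effect of a smaller pull-up, the rise of the released data line can be modelled as a simple RC charge. This is only a back-of-the-envelope sketch: the function names are hypothetical, and the cable capacitance of about 100 pF per metre and the 70 % logic-HIGH threshold are assumptions, not measurements from this setup.

```cpp
#include <cmath>

// Equivalent resistance of two pull-up resistors in parallel (ohms).
double parallelOhms(double r1, double r2) {
  return (r1 * r2) / (r1 + r2);
}

// Time (seconds) for an RC-charged line to rise from 0 V to `fraction`
// of the supply voltage: t = -R * C * ln(1 - fraction).
double riseTimeSeconds(double ohms, double farads, double fraction) {
  return -ohms * farads * std::log(1.0 - fraction);
}

// Assumed example values (not measured): 10 m of cable at ~100 pF/m,
// and a logic HIGH recognised at 70 % of the supply voltage.
const double kCableFarads = 10 * 100e-12;
const double kThresholdFraction = 0.7;
```

With these assumed values, a single 4.7K pull-up gives `riseTimeSeconds(4700, kCableFarads, kThresholdFraction)` of roughly 5.7 us. An extra external 4.7K halves the resistance (`parallelOhms(4700, 4700)` is 2350) and therefore halves the rise time.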

Another point of attention: using twisted pair for long wires improves signal quality (think ethernet).

Mr-HaleYa commented 4 years ago

> use twisted pair for long wires improves signal quality (think ethernet).

Yes, I can agree with that... It's crazy how much a difference it can make over long distances

"Twisted pair cabling is a type of wiring in which two conductors of a single circuit are twisted together for the purposes of improving electromagnetic compatibility. Compared to a single conductor or an untwisted balanced pair, a twisted pair reduces electromagnetic radiation from the pair and crosstalk between neighboring pairs and improves rejection of external electromagnetic interference"

Read more here https://en.wikipedia.org/wiki/Twisted_pair It's very interesting

RobTillaart commented 4 years ago

Voltage converter
Another way to improve the signal is to put a bidirectional voltage converter between the ESP and the sensor (close to the ESP). That gives about 50% more voltage (5 V instead of 3.3 V) to get a proper / strong signal (e.g. https://www.adafruit.com/product/757).

Scope image remark
From the scope image I see 3.40 Volt (3.26 + 140mV); that is just within specs, so it should be enough.

from the datasheet

Model                      DHT22
Power supply               3.3 - 6V DC
Output signal              digital signal via single-bus
Sensing element            Polymer capacitor
Operating range            humidity 0-100% RH; temperature -40~80 Celsius
Accuracy                   humidity +-2%RH(Max +-5% RH); temperature <+-0.5 Celsius
Resolution or sensitivity  humidity 0.1% RH; temperature 0.1 Celsius
Repeatability              humidity +-1% RH; temperature +-0.2 Celsius
Humidity hysteresis        +-0.3% RH
Long-term Stability        +-0.5% RH/year
Sensing period             Average: 2s
Interchangeability         fully interchangeable
Dimensions                 small size 14 x 18 x 5.5 mm; big size 22 x 28 x 5 mm

Jurand2020 commented 4 years ago

I tried different cabling before switching to "Grove" lib - no difference. At first I powered the sensor with 5V, now 3.3V - I wanted to avoid level converter. I've confirmed the pull-up presence with multimeter (sensor looks like https://www.adafruit.com/product/393) One more interesting thing - after ESP reset button (and test restart) it reads properly at "Type detection test".

RobTillaart commented 4 years ago

So the type detection is OK

> Humidity offset test
> STAT HUMI TEMP TIME TYPE
> Bit shift error, -999.0, -999.0, 5240, 22
> OK, -999.0, -999.0, 3, 22
> OK, -999.0, -999.0, 2, 22
> OK, -999.0, -999.0, 3, 22

I see that the actual read takes 5240 microseconds, which is a normal time, indicating that all 40+ bits are read as they should be. There is no premature timeout. That is a good sign, as it means the library gets enough bits and sees 40+ pulses.

The three reads that say OK are cached values, as confirmed by the response time of ~3 microseconds. If the actual read fails, the cached reads return the same failed values. Caching is done to avoid sending too many requests to the processor within the sensor. The datasheet indicates two seconds between actual reads.


ANALYSIS EXCEL FILE

image

Note the first pulse has a small dip in its second half; the remaining pulses look normal. As the voltage is 3.4V, near the 3.3V lower limit of the supply range, this might matter.

The Excel file shows 42 pulses; the first 2 are start bits.
temp bytes 0x01 0xEE = 256 + 238 = 494 / 10 = 49.4, hot but in line with the good read
hum bytes 0x01 0x18 = 256 + 24 = 280 / 10 = 28.0, in line with the good read
CRC byte 0x08; the bytes = 0x01 + 0xEE + 0x01 + 0x18 = 0x0108 ==> 08, so the CRC is correct.

So the scope recording looks OK.
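The byte arithmetic above can be checked with a short host-side sketch. It is only an illustration, not code from the DHTNEW library: `Dht22Frame` and `decodeDht22` are hypothetical names, and the sketch assumes the usual DHT22 frame layout (two humidity bytes, two temperature bytes with the top bit as sign, then a checksum equal to the low byte of the sum).

```cpp
#include <cstdint>

// Hypothetical helper type, not part of DHTNEW.
struct Dht22Frame {
  int humidityTenths;     // relative humidity in 0.1 % steps
  int temperatureTenths;  // temperature in 0.1 degree Celsius steps
  bool checksumOk;        // true when byte 4 equals the low byte of the sum
};

// Decodes the 5 bytes of a DHT22 frame (assumed layout: humidity high,
// humidity low, temperature high, temperature low, checksum).
Dht22Frame decodeDht22(const uint8_t bytes[5]) {
  Dht22Frame f;
  f.humidityTenths = (bytes[0] << 8) | bytes[1];
  f.temperatureTenths = ((bytes[2] & 0x7F) << 8) | bytes[3];
  if (bytes[2] & 0x80) f.temperatureTenths = -f.temperatureTenths;  // sign bit
  const uint8_t sum =
      static_cast<uint8_t>(bytes[0] + bytes[1] + bytes[2] + bytes[3]);
  f.checksumOk = (sum == bytes[4]);
  return f;
}
```

For the recorded bytes 0x01 0xEE 0x01 0x18 0x08, this layout yields 494 tenths for the first pair and 280 tenths for the second, with a matching checksum. Under the assumed byte order the first pair is the humidity and the second the temperature.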

The Excel file has no timing info. Q: Do you know the time between samples? 1 microsecond?


TEST 03

The BIT SHIFT ERROR is detected at the end of _readSensor().

about line 291   if (_bits[0] & 0x80) return DHTLIB_ERROR_BIT_SHIFT;

Can you please comment out this line in the library, run the same test and show the output?

Possible scenarios

  1. it shows good values and returns OK
  2. it shows faulty values and returns OK
  3. it shows a CRC error
  4. it shows another error
  5. something else

I can reverse engineer the bit pattern from the temp / hum values to try to understand exactly what is happening.

Scenario 1, would indicate the bit shift detection is incorrect.

Scenario 2 would indicate a new unknown problem. It is known that multiple bit errors may cancel each other's CRC error. The chances for this are 1 in 256 in theory, so not zero.

Scenario 3 would need analysis of the values.
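Scenario 2 is possible because the DHT checksum is a plain 8-bit sum: two bit errors that change the sum by +1 and -1 leave it unchanged. A minimal host-side sketch of that effect, using a hypothetical helper name (`dhtChecksum`) and example bytes chosen only for illustration:

```cpp
#include <cstdint>

// Low byte of the sum of the four data bytes (assumed DHT checksum rule).
uint8_t dhtChecksum(const uint8_t data[4]) {
  return static_cast<uint8_t>(data[0] + data[1] + data[2] + data[3]);
}

// Example data bytes and a copy with two single-bit errors:
// bit 0 of byte 1 flipped 0 -> 1 (+1), bit 0 of byte 3 flipped 1 -> 0 (-1).
const uint8_t kGood[4] = {0x02, 0x32, 0x01, 0x07};
const uint8_t kCorrupted[4] = {0x02, 0x33, 0x01, 0x06};
```

Both arrays produce the same checksum, so a frame corrupted this way would be accepted with slightly wrong humidity and temperature values.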


TEST 02

can you add this line in setup() of the test sketch to see the values you get....

DHT.setSuppressError(true);

TEST 01 Add an (extra) pull up resistor between data line and VCC

Jurand2020 commented 4 years ago

About the timing: my oscilloscope says it is 50us / div, but that is only true on the display. I think it is 2us a row. The photo and the recording are from the very same run. I've also noted the dip. No idea where it comes from. Your decoding seems ok if the 28 is the temperature. Look at the readings in the first of the tests.

Jurand2020 commented 4 years ago

TEST 02 - added DHT.setSuppressError(true); Temp. in my room is between 25 and 26 ;) Result:

ets Jun 8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:8896
load:0x40080400,len:5816
entry 0x400806ac
dhtnew_test.ino
LIBRARY VERSION: 0.3.2

  1. Type detection test, first run might take longer to determine type
     STAT HUMI TEMP TIME TYPE
     OK, 57.9, 25.4, 5318, 22
     OK, 57.9, 25.4, 2, 22
     OK, 57.9, 25.4, 2, 22
     OK, 57.9, 25.4, 2, 22

  2. Humidity offset test
     STAT HUMI TEMP TIME TYPE
     Bit shift error, 57.9, 25.4, 5193, 22
     OK, 57.9, 25.4, 2, 22
     OK, 57.9, 25.4, 2, 22
     OK, 57.9, 25.4, 2, 22

  3. Temperature offset test
     STAT HUMI TEMP TIME TYPE
     Bit shift error, 57.9, 25.4, 5095, 22
     OK, 57.9, 25.4, 2, 22
     OK, 57.9, 25.4, 2, 22
     OK, 57.9, 25.4, 2, 22

  4. LastRead test 57.9, 25.4 57.9, 25.4 57.9, 25.4 57.9, 25.4 57.9, 25.4 actual read 57.9, 25.4 actual read 57.9, 25.4 actual read 57.9, 25.4 actual read 57.9, 25.4 57.9, 25.4 57.9, 25.4 57.9, 25.4 57.9, 25.4 actual read 57.9, 25.4 actual read 57.9, 25.4 actual read 57.9, 25.4 actual read 57.9, 25.4 57.9, 25.4 57.9, 25.4 57.9, 25.4

Done...

Jurand2020 commented 4 years ago

Test 03: Commented out line 291 if (_bits[0] & 0x80) return DHTLIB_ERROR_BIT_SHIFT;

Result:

ets Jun 8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:8896
load:0x40080400,len:5816
entry 0x400806ac
dhtnew_test.ino
LIBRARY VERSION: 0.3.2

  1. Type detection test, first run might take longer to determine type
     STAT HUMI TEMP TIME TYPE
     OK, 57.5, 25.9, 5273, 22
     OK, 57.5, 25.9, 2, 22
     OK, 57.5, 25.9, 2, 22
     OK, 57.5, 25.9, 2, 22

  2. Humidity offset test
     STAT HUMI TEMP TIME TYPE
     Checksum error, 100.0, -12.9, 5100, 22
     OK, 100.0, -12.9, 2, 22
     OK, 100.0, -12.9, 2, 22
     OK, 100.0, -12.9, 2, 22

  3. Temperature offset test
     STAT HUMI TEMP TIME TYPE
     Checksum error, 100.0, -10.4, 5100, 22
     OK, 100.0, -10.4, 2, 22
     OK, 100.0, -10.4, 2, 22
     OK, 100.0, -10.4, 2, 22

  4. LastRead test 100.0, -12.9 100.0, -12.9 100.0, -12.9 100.0, -12.9 100.0, -12.9 actual read 100.0, -12.9 actual read 100.0, -12.9 actual read 100.0, -12.9 actual read 100.0, -12.9 100.0, -12.9 100.0, -12.9 100.0, -12.9 100.0, -12.9 actual read 100.0, -12.9 actual read 100.0, -12.9 actual read 100.0, -12.9 actual read 100.0, -12.9 100.0, -12.9 100.0, -12.9 100.0, -12.9

Done...

Jurand2020 commented 4 years ago

I've made another recording - during test 03 after section 1 completed: signal003.zip

I've changed the measurement point, but I can't tell any difference because of that.

Regarding adding another pull-up: cannot do it right now, will do next week.

RobTillaart commented 4 years ago

> Your decoding seems ok if the 28 is the temperature. Look at the readings in the first of the tests.

OK, I swapped hum and temp, but the signal pulses are quite OK.

timing
2 us / sample; I need to check the sheet, maybe I can reverse calculate the timing given the 'normalized' pulse length for zero and 1.

new run
The new run makes complete sense with the CRC error. I see the same patterns as when the BIT SHIFT error was found.
Good news is that the 40 samples are there.
Bad news is that the lib sees one too many at the beginning and therefore misses the last real bit ...
This effect is caused by the cable length, as that is the only thing that changed.
Good news is that it is highly repeatable and fails much faster than previously.

Question is: how to detect the BIT SHIFT ERROR at runtime and read that extra bit? An ESP at 240 MHz can, no doubt; however, the solution should not 'crash' the handshake on a 16 MHz UNO. I can shift all bits afterwards and ignore/patch the CRC (wrote that earlier), but that is a last resort workaround, not a solution.

Need time to think this over (again), might take a couple of days. Hope you find time to do the pull-up test

Rob

Jurand2020 commented 4 years ago

I'm not confident about the timing in the recording (per row). I need to report this problem to my oscilloscope firmware authors ;) . I've measured the timing for '1' and '0' via OSD and got ~75-78 us vs ~24-27 us.

Jurand2020 commented 4 years ago

I can't see which bit is the problem. In signal003 I found 00000010 00110010 00000001 00000111 CRC 00111100. This is a correct signal, isn't it?

RobTillaart commented 4 years ago

Good morning.
Hum bits: 512 + 50 = 562, makes 56.2
Temp bits: 256 + 7 = 263, makes 26.3
CRC adds up to 60
So the pulses represent a valid pattern.

RobTillaart commented 4 years ago

Pulses of 24-27 and 75-78 can very well be discriminated. Even the bits in the failing read are recognized well except for the start bits. The root cause is somewhere there. To be continued.

RobTillaart commented 4 years ago

> I can't see which bit is the problem.

If you look at the earlier image, there are 42 pulses. The 5 bytes that contain the information are the last 40. The problem is somehow in the first two pulses. The decoding sees the second pulse as the first pulse of the 5 bytes, while it should start collecting on the 3rd pulse.

observations
If the wire is short it does so correctly; if the wire is long it fails.

image

I'm going to reread the datasheet (again) about these 2 pulses..

RobTillaart commented 4 years ago

Small sketch that measures timing of the steps.

dhtnew_pulse_diag.zip

Output should look like this (ESP32). It shows the durations of the LOW and HIGH periods; the first value, 1102, is the wakeup LOW. The values of 22 and 72 nicely show the 0 and 1 bits.

RUN:    76
IDX:    89
WAKEUP
    1102    1   21  75  79  1
HUM
    55  22  54  22  54  22  54  22  54  22  54  22  54  72  54  22
    54  71  55  21  54  22  54  72  54  72  54  72  54  72  54  22

TEMP
    60  22  54  22  54  22  54  21  55  21  54  22  54  22  54  72
    54  22  54  22  54  22  54  72  54  22  54  22  54  22  54  22

CRC
    60  72  53  22  54  72  54  72
    54  22  54  22  54  22  54  72

BYE
    52  1
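
The LOW/HIGH durations printed above can be turned back into bytes by classifying each HIGH period as a 0 or a 1. A minimal host-side sketch of that step follows; the function name and the 50 us threshold are assumptions chosen for illustration (the output only shows that short HIGHs are about 22 us and long ones about 72 us), not the library's actual decoding code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed decision threshold between a short HIGH (bit 0) and a long HIGH (bit 1).
constexpr uint16_t kThresholdUs = 50;

// Takes alternating LOW, HIGH durations in microseconds (as printed by the
// diagnostic sketch) and packs one bit per HIGH duration, MSB first.
std::vector<uint8_t> pulsesToBytes(const std::vector<uint16_t>& lowHighPairs) {
  std::vector<uint8_t> bytes;
  uint8_t current = 0;
  int bitCount = 0;
  // Odd indices hold the HIGH durations; the LOW durations only separate bits.
  for (std::size_t i = 1; i < lowHighPairs.size(); i += 2) {
    const uint8_t bit = (lowHighPairs[i] > kThresholdUs) ? 1 : 0;
    current = static_cast<uint8_t>((current << 1) | bit);
    if (++bitCount == 8) {
      bytes.push_back(current);
      current = 0;
      bitCount = 0;
    }
  }
  return bytes;
}
```

Applied to the two HUM rows above this gives 0x02 and 0x9E, i.e. 670 tenths or 67.0 % humidity. The TEMP rows give 0x01 0x10 and the CRC rows 0xB1, which equals the low byte of 0x02 + 0x9E + 0x01 + 0x10.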
RobTillaart commented 4 years ago

Hypothesis: The DHTNEW lib polls the start bits for LOW and HIGH status at "max speed" (lines 224 - 243 in the cpp file). If there is a glitch during this tight loop, it falls through and starts reading the 40 data bits prematurely. Then a start bit could be seen as the first data bit.

Will make a branch that handles the start bits more robustly..

RobTillaart commented 4 years ago

TEST 04
Created a branch test_29 (issue nr) which handles the start bits with less polling. Can you test whether it works with your long cables? I verified it works with short cables on a breadboard.

If this works for the long cables, we're closing in on the root cause. It will not be the final fix, as this code mimics the datasheet less than before.

Jurand2020 commented 4 years ago

pulse_diag result:

RUN:    7
IDX:    89
WAKEUP
    1107    3   25  78  82  0
HUM
    54  26  54  26  54  26  54  26  54  26  54  26  54  74  54  25
    67  26  54  74  54  26  54  73  54  26  54  74  54  26  54  73

TEMP
    67  26  54  26  54  26  54  26  54  26  54  26  54  26  54  73
    67  26  55  26  53  27  54  26  54  79  48  26  54  26  54  25

CRC
    65  26  54  73  54  74  54  26
    54  26  54  26  54  26  54  25

BYE
    47  1

Jurand2020 commented 4 years ago

test_29 @a2afb04 result:

ets Jun 8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:8896
load:0x40080400,len:5816
entry 0x400806ac
dhtnew_test.ino
LIBRARY VERSION: 0.3.2

  1. Type detection test, first run might take longer to determine type
     STAT HUMI TEMP TIME TYPE
     OK, 60.3, 26.5, 5312, 22
     OK, 60.3, 26.5, 2, 22
     OK, 60.3, 26.5, 2, 22
     OK, 60.3, 26.5, 2, 22

  2. Humidity offset test
     STAT HUMI TEMP TIME TYPE
     OK, 62.8, 26.5, 5269, 22
     OK, 62.8, 26.5, 2, 22
     OK, 62.8, 26.5, 2, 22
     OK, 62.8, 26.5, 2, 22

  3. Temperature offset test
     STAT HUMI TEMP TIME TYPE
     OK, 60.3, 29.0, 5265, 22
     OK, 60.3, 29.0, 2, 22
     OK, 60.3, 29.0, 2, 22
     OK, 60.3, 29.0, 2, 22

  4. LastRead test 60.3, 26.5 60.3, 26.5 60.3, 26.5 60.3, 26.5 60.3, 26.5 actual read 60.3, 26.5 actual read 60.3, 26.5 actual read 60.3, 26.5 actual read 60.3, 26.5 60.3, 26.5 60.3, 26.5 60.3, 26.5 60.3, 26.5 actual read 60.3, 26.5 actual read 60.3, 26.5 actual read 60.3, 26.5 actual read 60.3, 26.5 60.3, 26.5 60.3, 26.5 60.3, 26.5

Done...

Jurand2020 commented 4 years ago

tested test_29 with even longer line, result the same as above

RobTillaart commented 4 years ago

Looks to me like the patch with explicit delays, without polling the state of the data pin (lines 225 & 234), works quite well. The explicit delays have no polling, and no polling means no glitch can be picked up.

In the loops that clock in the bits this mechanism is harder to use, as the length of a 0 and a 1 bit differs substantially, implying that one can delay for at most 70% of the time of a 0 (28us) ==> 20 us, leaving 8 us of polling for a 0 and 70 us of polling for a 1. In fact that part works robustly, so no need to change it.

idea
Currently the loops that wait for HIGH or LOW have a timeout mechanism based upon decreasing a counter. Although it works very well, it is not possible to optimize / tune. So I'm thinking of redesigning these polling loops more like this (in pseudo code)

if (waitfor(state, timeout) >= timeout) return SOME_ERROR_CODE;

// returns the elapsed time in microseconds, or timeout if the state was not seen.
// pin is the data pin of the sensor.
uint32_t waitfor(uint8_t state, uint32_t timeout)
{
  uint32_t start = micros();
  while (micros() - start < timeout)
  {
    delayMicroseconds(1);             // polling once per usec is enough.
    if (digitalRead(pin) == state) return micros() - start;
  }
  return timeout;
}

returning a boolean is equally effective and slightly simpler

if ( waitfor(state, timeout) ) return SOME_ERROR_CODE;

// returns true on timeout, false as soon as the state is seen.
bool waitfor(uint8_t state, uint32_t timeout)
{
  uint32_t start = micros();
  while (micros() - start < timeout)
  {
    delayMicroseconds(1);
    if (digitalRead(pin) == state) return false;  // no timeout
  }
  return true;  // timeout triggered
}
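
The boolean variant can be exercised off-target by replacing the clock and the pin with a small simulation. The sketch below is an assumption-laden illustration, not library code: `FakeLine` is a hypothetical stand-in for `micros()` and `digitalRead()`, and every call to its `micros()` simply advances the simulated time by 1 us.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Arduino clock and data pin.
// The simulated line is HIGH until time `goesLowAt`, then LOW.
struct FakeLine {
  uint32_t now;        // simulated time in microseconds
  uint32_t goesLowAt;  // moment the sensor pulls the line LOW
  explicit FakeLine(uint32_t lowAt) : now(0), goesLowAt(lowAt) {}
  uint32_t micros() { return ++now; }                  // each call costs 1 us
  int read() const { return now >= goesLowAt ? 0 : 1; }
};

// Returns true on timeout, false as soon as `state` is seen on the line.
bool waitFor(FakeLine& line, int state, uint32_t timeoutUs) {
  const uint32_t start = line.micros();
  while (line.micros() - start < timeoutUs) {
    if (line.read() == state) return false;  // state reached in time
  }
  return true;  // timeout
}
```

With a line that drops after 10 us, `waitFor(line, 0, 30)` returns false; with a line that never drops within the window it returns true, which is the case the caller maps to an error code.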

This should work as

Will implement this in a new test branch asap

RobTillaart commented 4 years ago

> tested test_29 with even longer line, result the same as above

👍 How long is the longest wire you tested with? (approx)

RobTillaart commented 4 years ago

> pulse_diag result:

Looks perfect, easy to see the bit patterns - I will add this sketch to next release too.

Jurand2020 commented 4 years ago

I've tested with approx. 1m and 10m wires. For me the problem is that pull up does not pull up fast enough and the code confused it with ACK - is that right? So I've made 2 more tests:

  1. just with the 20us delay - works OK
  2. writing pin to HIGH before going to PULLUP - also works OK
Jurand2020 commented 4 years ago

Both tests made on master branch version

Jurand2020 commented 4 years ago

Another solution would be to actively wait for HIGH for e.g. 20us instead of the hardcoded delay, and proceed as soon as it is fulfilled.

RobTillaart commented 4 years ago

> For me the problem is that pull up does not pull up fast enough and the code confused it with ACK - is that right?

That seems to be the case.

The Adafruit library ignores the ACK if I recall correctly, so that is why it did work. However, if there is no ACK the sensor is offline, and that won't be detected.

to be continued tomorrow

Jurand2020 commented 4 years ago

hm, grove is not very nice at this, but it sets pin to HIGH before listening:

  // now pull it low for ~20 milliseconds
  pinMode(_pin, OUTPUT);
  digitalWrite(_pin, LOW);
  delay(20);
  //cli();
  digitalWrite(_pin, HIGH);
  delayMicroseconds(40);
  pinMode(_pin, INPUT);

RobTillaart commented 4 years ago

I have in my code pinMode(_pin, INPUT_PULLUP) which internally does a digitalWrite(pin, HIGH)

but Grove does that 40 us earlier. I can have a look whether this fits the datasheet and can be in my code. Given that the test run with test_29 worked well, it is strictly speaking not needed.

Jurand2020 commented 4 years ago

I tried the following sequence, which also fixed the problem:

  // REQUEST SAMPLE - SEND WAKEUP TO SENSOR
  pinMode(_dataPin, OUTPUT);
  digitalWrite(_dataPin, LOW);
  // add 10% extra for timing inaccuracies in sensor.
  delayMicroseconds(_wakeupDelay * 1100UL);

  digitalWrite(_dataPin, HIGH);

  // HOST GIVES CONTROL TO SENSOR
  pinMode(_dataPin, INPUT_PULLUP);

Jurand2020 commented 4 years ago

> but Grove does that 40 us earlier. I can have a look whether this fits the datasheet and can be in my code. Given that the test run with test_29 worked well, it is strictly speaking not needed.

That's true. It's only a question of timing. On the scope a difference of about 2-3us can be seen, but additionally the edge is much steeper when setting the pin to high:

RobTillaart commented 4 years ago

The steepness depends on moment of sampling by the scope. Far more important is that the polling for LOW starts a few us later.

I tackle this by a "huge" delayMicroseconds(20) after the PULLUP.


  // REQUEST SAMPLE - SEND WAKEUP TO SENSOR
  pinMode(_dataPin, OUTPUT);
  digitalWrite(_dataPin, LOW);
  // add 10% extra for timing inaccuracies in sensor.
  delayMicroseconds(_wakeupDelay * 1100UL);

  // HOST GIVES CONTROL TO SENSOR
  pinMode(_dataPin, INPUT_PULLUP);

  // DISABLE INTERRUPTS when clock in the bits
  noInterrupts();

  // SENSOR PULLS LOW after 20-40 us  => if stays HIGH ==> device not ready
  delayMicroseconds(20);     // makes sure line is HIGH
  // timeout is 20 us less due to delay() above
  if (_waitFor(LOW, 30)) return DHTLIB_ERROR_SENSOR_NOT_READY;

  // SENSOR STAYS LOW for ~80 us => or TIMEOUT
  if (_waitFor(HIGH, 90)) return DHTLIB_ERROR_TIMEOUT_A;

  // SENSOR STAYS HIGH for ~80 us => or TIMEOUT
  if (_waitFor(LOW, 90)) return DHTLIB_ERROR_TIMEOUT_B;

  // SENSOR HAS NOW SENT ACKNOWLEDGE ON WAKEUP
  // NOW IT SENDS THE BITS

  // READ THE OUTPUT - 40 BITS => 5 BYTES

Will push new version of the test_29 stream soon, busy with final review and add all things to prep new release.

RobTillaart commented 4 years ago

@Jurand2020 pushed new version to branch test_29 - if this works for you I will merge it into the master branch.

main change is timing around the acknowledge, and the handling of timeout is now more explicit.

Furthermore it includes 1) the diagnostics example to visualize the timing of the sensor 2) enhancement in the endless example to print the count per error type.

RobTillaart commented 4 years ago

understanding so far
I looked into the source code of pinMode(pin, INPUT_PULLUP), although the AVR version. Setting the pin HIGH is the last action before the return, so when pinMode() is directly followed by a digitalRead(pin) it is very well possible that the long line is not yet "charged". Especially on the faster ESP the time can be very short.

Doing a digitalWrite(HIGH) before the pinMode() gives just the extra microseconds needed to stabilize. Furthermore, the code now explicitly waits 20 us before polling the pin. Also, the polling itself is done with a 1 us interval, to sample often enough to react fast but not faster than needed.

The investigations described in this issue explain the observed problem very well, although it is not clear for me why the first read did work in the testrun. Need to investigate the boot behavior of the sensor to understand that.

Jurand2020 commented 4 years ago

current version gives:

dhtnew_test.ino
LIBRARY VERSION: 0.3.3

  1. Type detection test, first run might take longer to determine type
     STAT HUMI TEMP TIME TYPE
     Sensor not ready, -999.0, -999.0, 21058, 0
     Sensor not ready, -999.0, -999.0, 21024, 0
     Sensor not ready, -999.0, -999.0, 21024, 0
     Sensor not ready, -999.0, -999.0, 21024, 0

and so on

RobTillaart commented 4 years ago

that is completely unexpected. it takes 21000 to read instead of 5000-ish. Had tested with short wires and that went well... At least a clear failure.

Jurand2020 commented 4 years ago

yes, it seems first _waitFor returned true, but why after over 21000?

RobTillaart commented 4 years ago

A run with small wires here still works... if you have time could we do some interactive testing?

1) Can you comment out line 204 and retry?

  digitalWrite(_dataPin, HIGH); // make dataline HIGH before switch to INPUT

Jurand2020 commented 4 years ago

btw, while ((micros() - start) < timeout) is a little dangerous as micros overflow every ~71 minutes. In bad case it will wait for over hour.. ;)

RobTillaart commented 4 years ago

> yes, it seems first _waitFor returned true, but why after over 21000?

analysis
The initial run cannot determine the type (type == 0), so it tries both as a DHT11 and as a DHT22, and both fail? The wake-up delays add up to 19000 + some extra makes 21000.

Jurand2020 commented 4 years ago

line 204 make no difference - still Sensor not ready, -999.0, -999.0, 21012, 0

RobTillaart commented 4 years ago

> btw, while ((micros() - start) < timeout) is a little dangerous as micros overflow every ~71 minutes. In bad case it will wait for over hour.. ;)

The subtraction prevents problems here.

Assume start = 2^32 - 10 and micros() = 20 (has wrapped around); then 20 - (2^32 - 10) = 20 - 2^32 + 10 = 30 (modulo 2^32).

You get problems if you would do while (micros() < (start + timeout))
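
The wrap-around argument can be checked on any machine with 32-bit unsigned arithmetic. A small sketch with hypothetical function names, comparing the subtraction form with the deadline form:

```cpp
#include <cstdint>

// Elapsed microseconds since `start`; unsigned subtraction is modulo 2^32,
// so the result stays correct when the counter has wrapped once.
uint32_t elapsedMicros(uint32_t now, uint32_t start) {
  return now - start;
}

// Safe form: compare the elapsed time with the timeout.
bool stillWaitingSafe(uint32_t now, uint32_t start, uint32_t timeout) {
  return now - start < timeout;
}

// Risky form: the deadline `start + timeout` itself can wrap around.
bool stillWaitingNaive(uint32_t now, uint32_t start, uint32_t timeout) {
  return now < start + timeout;
}
```

With start = 2^32 - 10 and now = 20, `elapsedMicros` returns 30, as in the calculation above. With the same start, a timeout of 100 and now = 2^32 - 5 (only 5 us later), the safe form keeps waiting, while the naive form computes a wrapped deadline of 90 and gives up immediately.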

RobTillaart commented 4 years ago

> line 204 make no difference - still Sensor not ready, -999.0, -999.0, 21012, 0

OK

line 211 - delayMicroseconds(20); // makes sure line is HIGH
change 20 into 10

line 212 - if (_waitFor(LOW, 30)) return DHTLIB_ERROR_SENSOR_NOT_READY;
change 30 into 40