RobTillaart / MAX31855_RT

Arduino library for MAX31855 chip for K type thermocouple
MIT License

Random reads on temps #21

Closed jpsminix closed 2 years ago

jpsminix commented 2 years ago

I have exactly the same problem as in https://github.com/RobTillaart/MAX31855_RT/issues/19. All readings of the internal temp and the thermocouple temp are random, BUT the fault bit flags work OK.

(Attached images: photo_2021-12-05_11-57-50, photo_2021-12-05_11-54-54)

NOTE: In this image there is a flag set, but when I say the chip gives random temp data, all flags are 0.

NodeMCU ESP32 <SPI (sw), 8 cm wire length> MAX31855 (clone from eBay, connection between T- and GND cut)
Arduino UNO <SPI (sw), 8 cm wire length> MAX31855 (clone from eBay, connection between T- and GND cut)

I have an oscilloscope to check the (software) SPI readings and I see good comms. I have tested that the fault bit flags for thermocouple errors (short to GND, short to VCC, no thermocouple) work OK. When I compare the bits of the raw data with the oscilloscope, the data match.

The ESP32 does not work for me with HW SPI, on neither VSPI nor HSPI.

Oh, I have tested with 3 clones from eBay and 1 from AliExpress. All have the same problem.

I have questions:

Thank you for your time :)

RobTillaart commented 2 years ago

@jpsminix I see only one explicit question: without the thermocouple connected, does the internal temp still work?

The answer is that I never tried it; as I have no hardware setup / time right now, I cannot test it.

From your description I understand that

- software SPI is working for UNO
- software SPI is working for ESP32
- hardware SPI is working for UNO
- hardware SPI is failing for ESP32 (both HSPI and VSPI)

Can you confirm?

(It will be at least the weekend before I have time to dive into this.)

jpsminix commented 2 years ago

Hi, thank you so much for helping me.

- software SPI is working for UNO
- software SPI is working for ESP32
- hardware SPI is failing for UNO (sorry for not writing this before)
- hardware SPI is failing for ESP32 (both HSPI and VSPI)

I have the same setup on the Arduino for SW and HW SPI:

const int csPin   = 10;
const int clkPin  = 13;
const int dataPin = 12;

uint32_t start, stop;

MAX31855 tc(csPin);
//MAX31855 tc(clkPin, csPin, dataPin);  // sw SPI

And I only change the last two lines to this to make SW SPI work:

//MAX31855 tc(csPin);
MAX31855 tc(clkPin, csPin, dataPin);  // sw SPI

Here is some log output with HW SPI:
16:51:59.675 -> time: 480
16:51:59.675 -> stat: 129
16:51:59.675 -> raw: 1111 1111 1111 1111 1111 1111 1111 1111
16:51:59.721 -> internal: 0.000
16:51:59.721 -> temperature: nan
16:52:00.691 ->
16:52:00.691 -> time: 484
16:52:00.691 -> stat: 129
16:52:00.691 -> raw: 1111 1111 1111 1111 1111 1111 1111 1111
16:52:00.691 -> internal: 0.000
16:52:00.691 -> temperature: nan
16:52:01.705 ->

Here is some log output with SW SPI:
16:52:42.680 -> time: 508
16:52:42.680 -> stat: 0
16:52:42.680 -> raw: 0111 1000 0011 1000 1001 1101 1101 0000
16:52:42.726 -> internal: -98.188
16:52:42.726 -> temperature: 1923.500
16:52:43.699 ->
16:52:43.699 -> time: 508
16:52:43.699 -> stat: 0
16:52:43.699 -> raw: 0111 1100 0011 1100 1001 1101 1100 0000
16:52:43.699 -> internal: -98.250
16:52:43.699 -> temperature: 1987.750

I have a pull-up resistor (~2k) between 5V and DO (MISO).

The MAX31855 board is connected to 5V (the board has a regulator to 3.3V).

This is an oscilloscope capture of the Arduino UNO with SW SPI (2k pull-up resistor on MISO only) and the MAX31855 (with LDO to 3.3V): Captura de pantalla 2021-12-07 165416

(Added syntax highlighting in the code part; choose edit to see how.)

RobTillaart commented 2 years ago

OK, which time zone are you in, EU or USA?
It helps to know the moments we are both online; I am from the Netherlands (EU).

The first raw value, 1111 1111 1111 1111 1111 1111 1111 1111, is definitely wrong.

Are the SW SPI values correct? (Just asking, as 1923° seems quite high and -98° seems very low.)

16:52:42.680 -> raw: 0111 1000 0011 1000 1001 1101 1101 0000
16:52:42.726 -> internal: -98.188
16:52:42.726 -> temperature: 1923.500
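
For reference, a minimal sketch of how such a 32-bit raw word maps to the two temperatures, assuming the bit layout from the MAX31855 datasheet (bits 31..18 = thermocouple in 0.25 °C steps, bits 15..4 = internal in 0.0625 °C steps, both signed). The names `Max31855Temps`, `signExtend` and `decodeTemps` are made up for illustration and are not part of the library.

```cpp
#include <cstdint>

// Hypothetical helper type, not part of MAX31855_RT.
struct Max31855Temps {
    double thermocoupleC;  // bits 31..18, signed, 0.25 degC per LSB
    double internalC;      // bits 15..4,  signed, 0.0625 degC per LSB
};

// Sign-extend the low `bits` bits of `value` (two's complement).
static int32_t signExtend(uint32_t value, unsigned bits) {
    const uint32_t signBit = 1u << (bits - 1);
    const uint32_t mask = (1u << bits) - 1u;
    value &= mask;
    return static_cast<int32_t>(value ^ signBit) - static_cast<int32_t>(signBit);
}

// Split a raw 32-bit frame into the two temperature fields.
Max31855Temps decodeTemps(uint32_t raw) {
    Max31855Temps t;
    t.thermocoupleC = signExtend(raw >> 18, 14) * 0.25;
    t.internalC     = signExtend(raw >> 4, 12) * 0.0625;
    return t;
}
```

Fed with the raw word from the log above (0x78389DD0), this gives 1923.5 and -98.1875, in line with the printed values. So under this assumption the odd numbers are already present in the bits the chip sends; they are not produced by the conversion.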
jpsminix commented 2 years ago

I am from Spain (EU).

The first raw value is like that because HW SPI is not working (I don't see anything on the pins CS/CLK/DO).

The SW SPI values are not correct; they are completely random. The internal temp can jump from -98 to 127 and then go to -127. Same with the thermocouple. BUT the flag bits are correct: if I short T- to GND or T+ to VCC, the flags are correct.

There is a big possibility that the 4 MAX31855 boards I have, all with the same results, are faulty (all clones). But I would like to make it work with HW SPI. Maybe SW SPI is slow (I don't think so, because SW SPI on the ESP32 is pretty fast), but I don't know what to check.

RobTillaart commented 2 years ago

Have you tried e.g. Adafruit's library? It might help to check whether the problem is in the hardware / wiring or in the software. If I find some time tomorrow I will set up a test environment; I hope I have all the HW needed.

jpsminix commented 2 years ago

I just installed the Adafruit MAX31855 library v1.3 and configured it for HW SPI with CS=10, with all the same wires (nothing changed).

// Example creating a thermocouple instance with software SPI on any three
// digital IO pins.
//#define MAXDO   3
//#define MAXCS   4
//#define MAXCLK  5

// initialize the Thermocouple
//Adafruit_MAX31855 thermocouple(MAXCLK, MAXCS, MAXDO);

// Example creating a thermocouple instance with hardware SPI
// on a given CS pin.
#define MAXCS   10
Adafruit_MAX31855 thermocouple(MAXCS);

// Example creating a thermocouple instance with hardware SPI
// on SPI1 using specified CS pin.
//#define MAXCS   10
//Adafruit_MAX31855 thermocouple(MAXCS, SPI1);

The logs:
12:18:31.024 -> Internal Temp = 127.94
12:18:31.024 -> C = 639.00
12:18:31.992 -> Internal Temp = 127.94
12:18:32.038 -> C = 639.00
12:18:33.009 -> Internal Temp = 127.94
12:18:33.055 -> C = 642.25
12:18:34.024 -> Internal Temp = 127.94
12:18:34.024 -> C = 580.50
12:18:34.997 -> Internal Temp = 127.94
12:18:35.043 -> C = 646.50
12:18:36.013 -> Internal Temp = 127.94
12:18:36.059 -> C = 646.50

The oscilloscope view confuses me a little (so much space at the end of DO): Captura de pantalla 2021-12-08 122502

This is the same situation as before, but with T- shorted to GND: Captura de pantalla 2021-12-08 122732

I think the internal temp is broken on all my chips, and it is impossible to calculate the external temp without the internal reference.

Thank you so much!

RobTillaart commented 2 years ago

Is it correct that I see that the clock is 5V and the data line is 3V3?

RobTillaart commented 2 years ago

> I think the internal temp is broken on all my chips, and it is impossible to calculate the external temp without the internal reference.

I need to dive into the datasheet / code to confirm that

RobTillaart commented 2 years ago

Maybe your sensor batch does not work at full frequency. I've heard such stories about other sensors before: second-choice batches are sold cheaply.

(thinking out loud) In my library you can patch the software timing by adding a delayMicroseconds() call; search for // delayMicroseconds(1); // DUE. In a similar way you can patch the hardware handshake too. Try e.g. 10 us per pulse. Absolutely no guarantee it will work, but at least it can be tested in a few minutes whether the sensor works at a lower frequency.

RobTillaart commented 2 years ago

The gap in the scope picture might be an artefact of the library used. If so, you should see it after every data exchange.

jpsminix commented 2 years ago

> Is it correct that I see that the clock is 5V and the data line is 3V3?

Yes, you are correct, but I don't know why; the pull-up resistor is on the DO pin.

(Image: schematic uno-max31855) Another difference is that the CLK line starts LOW with the Adafruit library (HW SPI).

jpsminix commented 2 years ago

> Maybe your sensor batch does not work at full frequency. I've heard such stories about other sensors before: second-choice batches are sold cheaply.
>
> (thinking out loud) In my library you can patch the software timing by adding a delayMicroseconds() call; search for // delayMicroseconds(1); // DUE. In a similar way you can patch the hardware handshake too. Try e.g. 10 us per pulse. Absolutely no guarantee it will work, but at least it can be tested in a few minutes whether the sensor works at a lower frequency.

I'm going to test it. I will update with news.

jpsminix commented 2 years ago

I have changed the file MAX31855.cpp with this modification:

  else  // Software SPI
  {
    digitalWrite(_select, LOW);
    for (int8_t i = 31; i >= 0; i--)
    {
      _rawData <<= 1;
      digitalWrite(_clock, LOW);
      delayMicroseconds(10);   // DUE MODIFIED
      if ( digitalRead(_miso) ) _rawData++;
      digitalWrite(_clock, HIGH);
      delayMicroseconds(10);  // DUE MODIFIED
    }
    digitalWrite(_select, HIGH);
  }

With this modification I see that the SW SPI clock frequency has changed from ~70 kHz to ~32 kHz, with the same result. The data log is a little more stable for the external temp, but the internal temp is off:
14:27:32.386 -> time: 1056
14:27:32.386 -> stat: 0
14:27:32.386 -> raw: 0011 0100 0100 0000 1000 0000 0000 0000
14:27:32.386 -> internal: -128.000
14:27:32.386 -> temperature: 836.000
14:27:33.360 ->
14:27:33.360 -> time: 1056
14:27:33.360 -> stat: 0
14:27:33.360 -> raw: 0011 0100 0010 1000 1000 0000 0000 0000
14:27:33.360 -> internal: -128.000
14:27:33.360 -> temperature: 834.500
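
A rough cross-check of those scope numbers, assuming the unpatched loop ran at about 70 kHz and that the two added delayMicroseconds(10) calls simply add 20 us to every clock period (digitalWrite/digitalRead overhead unchanged). The function name is hypothetical.

```cpp
// Estimate the bit-bang clock frequency after adding a fixed delay per bit.
// baseHz:  clock frequency measured without the extra delays
// addedUs: total delay added to each clock period, in microseconds
double clockAfterDelayHz(double baseHz, double addedUs) {
    const double basePeriodUs = 1e6 / baseHz;   // about 14.3 us at 70 kHz
    return 1e6 / (basePeriodUs + addedUs);      // new period back to a frequency
}
```

clockAfterDelayHz(70000.0, 20.0) comes out at roughly 29 kHz, which is in the same range as the ~32 kHz seen on the scope, so the patch appears to slow the clock about as much as expected.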

jpsminix commented 2 years ago

My ambient temperature is about 19~24 ºC, and all tests are done at ambient temperature.

RobTillaart commented 2 years ago

I have tried a few setups (no pull-up resistors, very short wires):

UNO + SW SPI works  (max31855_sw_SPI.ino)
time:       540
stat:       0
raw:        0000 0001 0011 1100 0001 0011 1110 0000
internal:   19.875
temperature:    19.750

ESP + SW SPI  works  (max31855_sw_SPI.ino)
time:       17
stat:       0
raw:        0000 0001 0100 0100 0001 0011 1011 0000
internal:   19.687
temperature:    20.250

ESP + HW SPI  fails 
time:       15
stat:       129
raw:        1111 1111 1111 1111 1111 1111 1111 1111
internal:   0.000
temperature:    nan

conclusion 1: my module is working with SW handshake
conclusion 2: similar problems with HW handshake

I hope I have some time this evening to investigate further.

RobTillaart commented 2 years ago

Found a problem when testing with the UNO. I noticed that the constructor that sets the hardware flag does not get executed (anymore?). When moving the code from the constructor to begin(), HW SPI starts to work again, at least on the UNO.

todo

RobTillaart commented 2 years ago

The fix needs a rewrite of the constructor and the begin() function and will be a breaking change. It might take some days to fix (+ test).

RobTillaart commented 2 years ago

@jpsminix Created a branch https://github.com/RobTillaart/MAX31855_RT/tree/fix21 that should solve the issue. Can you please test if this works with UNO SW HW and ESP32 SW and HW?

Note: In your sketch you need to move the parameters from the constructor to the begin().
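
To illustrate the pattern only: a minimal mock of "empty constructor, pins passed to begin()". `ThermoMock` and all its members are hypothetical and are not the library's class; the real signatures should be checked in the fix21 branch.

```cpp
#include <cstdint>

// Hypothetical mock, NOT the MAX31855_RT class.
class ThermoMock {
public:
    ThermoMock() = default;            // the constructor no longer takes pins

    // One pin: hardware SPI, only chip select is needed.
    void begin(uint8_t select) {
        _select = select;
        _hwSPI = true;
    }

    // Three pins: software (bit-banged) SPI.
    void begin(uint8_t clock, uint8_t select, uint8_t miso) {
        _clock = clock;
        _select = select;
        _miso = miso;
        _hwSPI = false;
    }

    bool usesHardwareSPI() const { return _hwSPI; }
    uint8_t selectPin() const { return _select; }

private:
    uint8_t _clock = 255;
    uint8_t _select = 255;
    uint8_t _miso = 255;
    bool _hwSPI = false;
};
```

Under this assumption a sketch would declare the object without arguments and call begin(csPin) or begin(clkPin, csPin, dataPin) in setup(), so the mode flag is set at a well-defined moment instead of inside a constructor.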


update: unit tests need investigation
update 2: unit tests fixed, pull request is ready for merging.

RobTillaart commented 2 years ago

ESP32 via VSPI (240MHz)

time:       15
stat:       0
raw:        0000 0001 0100 1000 0001 0011 1101 0000
internal:   19.812
temperature:    20.500

note that SW SPI is almost equally fast 17 vs 15 us. - that is ~2 bits / usec

good moment to stop :)

jpsminix commented 2 years ago

> @jpsminix Created a branch https://github.com/RobTillaart/MAX31855_RT/tree/fix21 that should solve the issue. Can you please test if this works with UNO SW HW and ESP32 SW and HW?
>
> Note: In your sketch you need to move the parameters from the constructor to the begin().
>
> update: unit tests need investigation
> update 2: unit tests fixed, pull request is ready for merging.

I have tested the Arduino UNO with HW SPI and it is OK. Tomorrow I will test the rest:

RobTillaart commented 2 years ago

update on an open question:

> I think the internal temp is broken on all my chips, and it is impossible to calculate the external temp without the internal reference.

> I need to dive into the datasheet / code to confirm that

From datasheet:

The device senses and corrects for the changes in the reference junction temperature with cold-junction compensation. It does this by first measuring its internal die temperature, which should be held at the same temperature as the reference junction. It then measures the voltage from the thermocouple’s output at the reference junction and converts this to the noncompensated thermocouple temperature value. This value is then added to the device’s die temperature to calculate the thermocouple’s “hot junction” temperature. Note that the “hot junction” temperature can be lower than the cold junction (or reference junction) temperature.

So yes, the internal temperature is needed as a reference. What I understand from the above is that the external reading follows the internal sensor (as the die temperature is added to it). So in theory the two temperatures should follow the same pattern? However, if the internal sensor is broken, anything is possible in practice.
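
A toy model of that paragraph, assuming only the simple "add the die temperature" description from the datasheet quote (no thermocouple linearization). It merely shows why an error in the internal reading would show up one-to-one in the reported thermocouple temperature; the function name is hypothetical.

```cpp
// Reported hot-junction temperature according to the quoted datasheet text:
// noncompensated thermocouple value plus the measured die temperature.
double hotJunctionC(double noncompensatedC, double dieC) {
    return noncompensatedC + dieC;
}
```

With the thermocouple tip at the same temperature as the chip the noncompensated part is 0, so hotJunctionC(0.0, 19.875) returns 19.875; if the die sensor wrongly reported -98.1875 instead, the result would move to -98.1875 as well.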

jpsminix commented 2 years ago

Arduino UNO sw and hw SPI -> OK

ESP32 HW SPI: I don't see anything on the scope. Logs:
17:07:44.881 -> time: 51
17:07:44.881 -> stat: 0
17:07:44.881 -> raw: 0000 0000 0000 0000 0000 0000 0000 0000
17:07:44.881 -> internal: 0.000
17:07:44.881 -> temperature: 0.000
17:07:45.894 ->
17:07:45.894 -> time: 51
17:07:45.894 -> stat: 0
17:07:45.894 -> raw: 0000 0000 0000 0000 0000 0000 0000 0000
17:07:45.894 -> internal: 0.000
17:07:45.894 -> temperature: 0.000

The next test is invalid; I had the wiring wrong.

ESP32 SW SPI: I see through the scope that it is working, but it outputs 0. Logs:
17:05:34.316 -> time:       16
17:05:34.316 -> stat:       0
17:05:34.316 -> raw:        0000 0000 0000 0000 0000 0000 0000 0000
17:05:34.316 -> internal:   0.000
17:05:34.316 -> temperature:    0.000
17:05:35.286 -> 
17:05:35.286 -> time:       16
17:05:35.286 -> stat:       0
17:05:35.286 -> raw:        0000 0000 0000 0000 0000 0000 0000 0000
17:05:35.332 -> internal:   0.000
17:05:35.332 -> temperature:    0.000

Screenshot from scope (ESP32 sw SPI with 2 pullup resistors on CLK and DO):
![Captura de pantalla 2021-12-09 170553](https://user-images.githubusercontent.com/4179434/145432306-ebb4521b-a53c-473b-8d39-e8f070afa80e.jpg)
jpsminix commented 2 years ago

Sorry, bad wiring in the ESP32 test. I have corrected the pinout wiring and ESP32 with SW SPI -> OK. I'm going to test ESP32 HW SPI now.

RobTillaart commented 2 years ago

> Sorry, bad wiring in the ESP32 test. I have corrected the pinout wiring and ESP32 with SW SPI -> OK. I'm going to test ESP32 HW SPI now.

Such 'errors' happen to me daily :)

jpsminix commented 2 years ago

No data with ESP32 HW SPI and 2 pull-up resistors (~2k) on the CLK and DO lines. I don't see anything on the scope. Log output:
17:18:30.257 -> time: 51
17:18:30.257 -> stat: 0
17:18:30.257 -> raw: 0000 0000 0000 0000 0000 0000 0000 0000
17:18:30.303 -> internal: 0.000
17:18:30.303 -> temperature: 0.000
17:18:31.272 ->
17:18:31.272 -> time: 51
17:18:31.272 -> stat: 0
17:18:31.272 -> raw: 0000 0000 0000 0000 0000 0000 0000 0000
17:18:31.272 -> internal: 0.000
17:18:31.272 -> temperature: 0.000

jpsminix commented 2 years ago

Oops, I did not select VSPI in the demo. I am going to check.

jpsminix commented 2 years ago

Yep, it was the VSPI command. My bad.

OK, ESP32 HW SPI is working (with and without pull-up resistors). Logs:
17:23:49.640 -> time: 16
17:23:49.640 -> stat: 0
17:23:49.640 -> raw: 0101 0100 1001 1000 1011 0011 1101 0000
17:23:49.640 -> internal: -76.188
17:23:49.640 -> temperature: 1353.500
17:23:50.660 ->
17:23:50.660 -> time: 16
17:23:50.660 -> stat: 0
17:23:50.660 -> raw: 0100 1111 1111 1000 1011 0110 0001 0000
17:23:50.660 -> internal: -73.938
17:23:50.660 -> temperature: 1279.500

The scope trace is a little strange, but it works (15.6 MHz): Captura de pantalla 2021-12-09 172347

jpsminix commented 2 years ago

> ESP32 via VSPI (240MHz)
>
> time:     15
> stat:     0
> raw:      0000 0001 0100 1000 0001 0011 1101 0000
> internal: 19.812
> temperature:  20.500
>
> note that SW SPI is almost equally fast 17 vs 15 us. - that is ~2 bits / usec
>
> good moment to stop :)

If you disconnect the thermocouple wiring, does the MAX31855 still give you readings of the internal temp?

RobTillaart commented 2 years ago

It looks like you got it working!

> If you disconnect the thermocouple wiring, does the MAX31855 still give you readings of the internal temp?

I will try; it will take a few minutes.

RobTillaart commented 2 years ago

I removed the wire and immediately got:

time:           15
stat:           1
raw:            0111 1111 1111 1101 0001 0111 0001 0001
internal:       0.000
temperature:    0.000

So it looks like the raw data is not completely "crap". I need to check if the read "returns" prematurely.

RobTillaart commented 2 years ago

Made a patch in the .cpp and it now shows a nice internal and a faulty external temperature ==> bits 18-30 are all 1

time:       16
stat:       1
raw:        0111 1111 1111 1101 0001 0011 1011 0001
internal:   19.687
temperature:    2047.750
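
A sketch of reading the fault bits from such a raw word, assuming the datasheet layout (bit 16 = fault, bit 2 = short to VCC, bit 1 = short to GND, bit 0 = open circuit). The struct and function names are made up for illustration; only STATUS_OPEN_CIRCUIT (1) appears in this thread.

```cpp
#include <cstdint>

// Hypothetical helper type, not part of MAX31855_RT.
struct Max31855Faults {
    bool anyFault;      // bit 16
    bool shortToVcc;    // bit 2
    bool shortToGnd;    // bit 1
    bool openCircuit;   // bit 0
};

// Extract the fault flags from a raw 32-bit frame.
Max31855Faults decodeFaults(uint32_t raw) {
    Max31855Faults f;
    f.anyFault    = ((raw >> 16) & 1u) != 0;
    f.shortToVcc  = ((raw >> 2) & 1u) != 0;
    f.shortToGnd  = ((raw >> 1) & 1u) != 0;
    f.openCircuit = (raw & 1u) != 0;
    return f;
}

// Internal (die) temperature: bits 15..4, signed, 0.0625 degC per LSB.
// This field can still be decoded when a thermocouple fault is flagged.
double internalC(uint32_t raw) {
    int32_t v = static_cast<int32_t>((raw >> 4) & 0x0FFFu);
    if (v & 0x0800) v -= 0x1000;   // sign-extend the 12-bit value
    return v * 0.0625;
}
```

For 0x7FFD13B1 (the raw word above) this reports anyFault and openCircuit set, the two short flags clear, and an internal temperature of 19.6875, in line with the 19.687 in the log.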
RobTillaart commented 2 years ago

(thinking out loud) That implies that including this patch in the Pull Request gives better information in case of a STATUS_OPEN_CIRCUIT (1) error. Agree?

RobTillaart commented 2 years ago

@jpsminix I removed the "premature" return from read(), so now it will always fill internal and temperature from the raw data. Can you verify?

RobTillaart commented 2 years ago

> The scope trace is a little strange, but it works (15.6 MHz): Captura de pantalla 2021-12-09 172347

Looks similar to what I saw on my scope yesterday. On the ESP32, SW SPI had no time between the bytes, while HW SPI had faster bursts, as your picture shows. So hardware SPI is in theory maybe twice(++) as fast, but the overhead of the transaction exists. In practice that means the pulses are shorter and probably not a nice square wave, but more a "rounded shark fin". So for stability SW SPI should be preferred.

jpsminix commented 2 years ago

> (thinking out loud) That implies that including this patch in the Pull Request gives better information in case of a STATUS_OPEN_CIRCUIT (1) error. Agree?

Completely agree!

jpsminix commented 2 years ago

> @jpsminix I removed the "premature" return from read(), so now it will always fill internal and temperature from the raw data. Can you verify?

I will try in a while. I will let you know.

jpsminix commented 2 years ago

OK, it's working (my chip is giving random numbers, but your code works).

jpsminix commented 2 years ago

> > The scope trace is a little strange, but it works (15.6 MHz): Captura de pantalla 2021-12-09 172347
>
> Looks similar to what I saw on my scope yesterday. On the ESP32, SW SPI had no time between the bytes, while HW SPI had faster bursts, as your picture shows. So hardware SPI is in theory maybe twice(++) as fast, but the overhead of the transaction exists. In practice that means the pulses are shorter and probably not a nice square wave, but more a "rounded shark fin". So for stability SW SPI should be preferred.

Correct!

RobTillaart commented 2 years ago

I'm going to merge the Pull Request, thank you for the issue and your testing

RobTillaart commented 2 years ago

Issue is automatically closed by PR, so if there are new problems please open a new issue (or reopen if related)

RobTillaart commented 2 years ago

@jpsminix I am thinking about adding a configurable delay in the SW SPI loop so the timing can be adjusted. See #23. Opinion?