PaxInstruments / t400-firmware

Firmware for the Pax Instruments T400 temperature datalogger
22 stars 5 forks source link

Junction temperature compensation #196

Closed charlespax closed 8 years ago

charlespax commented 8 years ago

To properly calculate the thermocouple temperature we need to use the following equation.

V_compensated = V_thermocouple + V_reference

where V_thermocouple is the voltage across the thermocouple at the reference junction, and V_reference is the voltage in the thermocouple lookup table corresponding to the measured junction temperature.

We then use GetTypKTemp(V_compensated) to compute the temperature at the thermocouple tip.

V_reference is implemented in GetJunctionVoltage(double jTemp), which takes an as input a double and returns an int32_t. The array index i should be computed from the junction temperature, but there seems to be a problem. When i=29 is hard coded in the firmware the TC array gives the correct value, which is then passed on and everything works. However, when i is computed from the junction temperature, the array seems to return a crazy value, which is passed on and displayed to the LCD as 12336.

Setting DEBUG_JUNCTION_TEMPERATURE 1 in t400.h will enable code that should return the correct voltage from the lookup table. However, there appears to be an issue with variable types.

Here is the code:

int32_t GetJunctionVoltage(double jTemp) {
  // TODO use lookup table to determine the thermocouple voltage that corresponds
  // to the junction temperature.

  int32_t jVoltage = 0;
  int i = 0;

  i = jTemp/10 + 27; // If ambient temperature is around 25C, this givest i = 29
  i=29; // ****** BUG BUG BUG Manually declare value BUG BUG BUG ******
        // Commenting this out while debugging the LCD will display '12336'
        // If this line is commented out while DEBUG_JUNCTION_TEMPERATURE is enabled and
        // the LCD displays 7256, this function should work properly.

#if DEBUG_JUNCTION_TEMPERATURE
  jVoltage = tempTypK[i]; // Should always give jVoltage as 7256 on the LCD
#else
  jVoltage = tempTypK[i] - TK_OFFSET + (jTemp - (i*10-270)) * (tempTypK[i+1] - tempTypK[i])/10;
#endif

//  return i; // Displays '29.0' on the LCD as expected
//  return tempTypK[29]; // Displays '7256.0'on LCD
  return jVoltage;
}
TomKeddie commented 8 years ago

Charles, passing a double like that will use a lot of stack space. Try passing a pointer. When weird things happen, stack overrun is a good place to start.

int32_t GetJunctionVoltage(double* jTemp) {

. . i = *jTemp/10 + 27;

TomKeddie commented 8 years ago

Also watch out for precision issues in this equation. I don't have anything concrete but there is a lot of math here that might saturate in inexpected places. You seem like a careful guy so perhaps you've prototyped this with a spreadsheet or something.

jVoltage = tempTypK[i] - TK_OFFSET + (jTemp - (i10-270)) \ (tempTypK[i+1] - tempTypK[i])/10;

You might want to precalculate some of it with higher precision.

isonno commented 8 years ago

You use both int32_t and int. Is int 16 or 32 bits in your compiler? You might try defining i as int32_t.

charlespax commented 8 years ago

Firstly, my ssh keys were messed up and my last four commits were not in the repo. They are now. Sorry about that. My mistake.

Pointers @TomKeddie I tried using the pointer method as you describe.

Using Arduino 1.6.7. Code from commit c48e8038c0e5ed4646fc5e52f3d556dfd505f482 on Sun Dec 27 01:47:33 2015 +0800.

In t400.h I change line 8 to enable debugging #define DEBUG_JUNCTION_TEMPERATURE 1.

Code compiles to 28,430 bytes (99%) of program storage space. Maximum is 28,672 bytes. Global variables use 1,886 bytes (73%) of dynamic memory, leaving 674 bytes for local variables. Maximum is 2,560 bytes.

After changing to the pointer method code compiles to 28,422 bytes (99%) of program storage space. Global variables use 1,886 bytes (73%) of dynamic memory, leaving 674 bytes for local variables. Maximum is 2,560 bytes.

There is, unfortunately, no change in behavior. Commenting out line 331 ("i=29;") in functions.cpp still gives 12336 while leaving it commented out gives the correct value of 7256. We do save a few flash bytes, so I'll keep these changes. This change is in commit fb9dc2a1c2006b7b3b8905744faf52428d59e29d.

I should try having GetJunctionVoltage() return a pointer to the data rather than the data itself.

Stack overflow When i is calculated on functions.cpp line 330 i = *jTemp/10 + 27; I get the value i is 29 and I get the wrong value from the lookup table. But uncommenting line 331 i=29; gives the correct lookup table value. Would making that one change affect the stack size?

Precision I'll take a closer look at that equation and double check everything. For debugging purposes in GetJunctionVoltage() I just return the data from the lookup table, so the equation doesn't affect this bug.

#if DEBUG_JUNCTION_TEMPERATURE
  jVoltage = tempTypK[i]; // Should always give jVoltage as 7256 on the LCD
#else
  jVoltage = tempTypK[i] - TK_OFFSET + (jTemp - (i*10-270)) * (tempTypK[i+1] - tempTypK[i])/10;
#endif
isonno commented 8 years ago

Would making that one change affect the stack size?

It easily could on a small micro; it'll call subroutines to do the floating point and numeric conversion work if it doesn't have onboard FP hardware. It may pay to peek at the assembly code for that function.

charlespax commented 8 years ago

From further poking around I get the sense that this is more of a memory management issue than a variable type issue.

I wrote a little more and committed in commit 3e99822bf405a43505a9dd5ec8c7864eef140a8a. This works, but only for a limited junction temperature range. Here is the function as it appears int he repo:

int32_t GetJunctionVoltage(double* jTemp) {
  // TODO use lookup table to determine the thermocouple voltage that corresponds
  // to the junction temperature.

  int32_t jVoltage = 0;
  uint8_t i = 0;

  i = *jTemp/10 + 27; // If ambient temperature is around 25C, this givest i = 29
 // i=29; // ****** BUG BUG BUG Manually declare value BUG BUG BUG ******
        // Commenting this out while debugging the LCD will display '12336'
        // If this line is commented out while DEBUG_JUNCTION_TEMPERATURE is enabled and
        // the LCD displays 7256, this function should work properly.

#if DEBUG_JUNCTION_TEMPERATURE
  for(uint8_t j=28;  j <= i && j <=31; j++){
  jVoltage = tempTypK[j]; // Should always give jVoltage as 6855, 7256, 7661, or 8070 on the LCD
  }
#else
  jVoltage = tempTypK[i] - TK_OFFSET + (*jTemp - (i*10-270)) * (tempTypK[i+1] - tempTypK[i])/10;
#endif

//  return i; // Displays '29.0' on the LCD as expected
//  return tempTypK[29]; // Displays '7256.0'on LCD
  return jVoltage;
}

We can clean this up by removing some comments and the parts that don't happen while debugging.

int32_t GetJunctionVoltage(double* jTemp) {
  int32_t jVoltage = 0;
  uint8_t i = 0;

  i = *jTemp/10 + 27;

  for(uint8_t j=28;  j <= i && j <=31; j++){
    jVoltage = tempTypK[j]; // Should always give jVoltage as 6855, 7256, 7661, or 8070 on the LCD
  }
  return jVoltage;
}

Changing the contents of the for loop conditions will give either good or bad results. I will create a list below. The room temperature was between 28 and 29 C, which out of pure coincidence puts at i=29, so the correct returned value is 7256. for(uint8_t j=26; j <= i && j <=29; j++) good 7256 for(uint8_t j=26; j <= i && j <=30; j++) bad 12336 for(uint8_t j=27; j <= i && j <=29; j++) good 7256 for(uint8_t j=27; j <= i && j <=30; j++) good 7256 for(uint8_t j=27; j <= i && j <=31; j++) bad 12336 for(uint8_t j=28; j <= i && j <=29; j++) good 7256 for(uint8_t j=28; j <= i && j <=30; j++) good 7256 for(uint8_t j=28; j <= i && j <=31; j++) good 7256 for(uint8_t j=28; j <= i && j <=32; j++) bad 12336 for(uint8_t j=29; j <= i && j <=29; j++) good 7256 for(uint8_t j=29; j <= i && j <=30; j++) good 7256 for(uint8_t j=29; j <= i && j <=31; j++) good 7256 for(uint8_t j=29; j <= i && j <=32; j++) good 7256 for(uint8_t j=29; j <= i && j <=33; j++) bad 12336

I can do this again at an ambient temperature of just above 30 C. This would give us i=30 and 7661 would be the correct returned value.

add data here

From the above data I believe the for loop breaks when j == i rather than i being a large number and the for loop breaking when it reaches the condition j <=31 (of 30 etc.). I don't know what's going on, but maybe something is overflowing this loops through more than three times. But that's weird because the for loop seems to be breaking on the j==i condition.

If we remove the j<=31 condition the wrong value is always returned.

for(uint8_t j=27; j <= i; j++) bad 12336 for(uint8_t j=28; j <= i; j++) bad 12336 for(uint8_t j=29; j <= i; j++) bad 12336

This is some weird shit and certainly the last time I make anything without proper hardware debugging support.

charlespax commented 8 years ago

@isonno Looking at the assembly is a bit beyond my abilities for now. I'll assume it's a heap issue and try to free up some ram.

I put some things in ROGMEMusingbthe F() macro, but that didn't make a difference. It may not have been enough data to make up for including the progmem code.

I'll also disable some big data like the SD writing, serial output, or LCD features. Then I'll have some headroom.

charlespax commented 8 years ago

I removed a bunch of MicroSD writing operations and serial print operations to reduce the flash from 28,490 to 27,810; 96% of the maximum 28,672 bytes. The RAM reduced from 1,886 to 1,834; 71% of the maximum 2,560 bytes. I still get identical behavior in the for the for loop as described above.

I removed two of the temperature channels and reduced flash to 28,016 bytes and RAM to 1,664 bytes. Again, the same behavior. Truncated the lookup table: 27,764 flash, 1,664 RAM, same behavior.

Then to took the lookup table out of PROGMEM. Changed typek_constant.h from const uint16_t tempTypK[TEMP_TYPE_K_LENGTH] PROGMEM =... to const uint16_t tempTypK[TEMP_TYPE_K_LENGTH] =... and got different behavior. Wow! I was able to get the correct returned value of 7256 from the for loop for(uint8_t j=29; j <= i && j<= 33; j++), which previously gave 12336. Even for(uint8_t j=0; j <= i; j++) works correctly without PROGMEM. I can now dispense with the for loop all together and use what I wanted in the beginning:

int32_t GetJunctionVoltage(double* jTemp) {
  int32_t jVoltage = 0;
  uint8_t i = 0;

  i = *jTemp/10 + 27;
  jVoltage = tempTypK[i] - TK_OFFSET + (*jTemp - (i*10-270)) * (tempTypK[i+1] - tempTypK[i])/10;

  return jVoltage;
}

Using PROGMEM for the lookup table saves 330 bytes of RAM. I'm not sure if I can find the extra space somewhere else, but this is progress. At least we have a hint on where this bug might be.

Someone else had the same problem. It turned out to be that "to access memory that has been defined in flash by PROGMEM, you have to use special methods. The reference page gives us this example:myChar = pgm_read_byte_near(signMessage + k);" (link). It looks like the solution lies down this path. I'll check it out further tomorrow. Feeling pretty good about it.

charlespax commented 8 years ago

Code in commit 70673bbc60831bc70b7c36566ca4cfa251d45df0 appears to work correctly. The firmware now reads from PROGMEM correctly. We still have 40 bytes available in flash in 674 in RAM.

Thanks for all your help @TomKeddie and @isonno :-)