PubInv / moonrat

Moonrat: A second-generation portable incubator
GNU Affero General Public License v3.0

Severe Electrical Flakiness, TMP36 changes reading when I2C connected to display. #303

Open RobertLRead opened 3 months ago

RobertLRead commented 3 months ago

I am still investigating, but it is clear we have a severe electrical problem with our temperature read.

In the first place, the code seems to presuppose we are using a TMP37, but we are really using a TMP36. When used in fully assembled mode, the Moonrat temperature readings vary by as much as 20 degrees C. This is directly caused by the voltage reads on pin A0 fluctuating.
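
To make the TMP36/TMP37 difference concrete, here is a minimal sketch of the two transfer functions, using the datasheet scale factors (TMP36: 10 mV per degree C with a 500 mV offset; TMP37: 20 mV per degree C with no offset). The function names are illustrative assumptions and are not taken from the Moonrat firmware.

```cpp
// Hypothetical helper names; not taken from the Moonrat firmware.

// TMP36: 10 mV per degree C with a 500 mV offset (0.75 V at 25 C).
constexpr double tmp36DegC(double volts) {
    return (volts - 0.5) * 100.0;
}

// TMP37: 20 mV per degree C with no offset (0.50 V at 25 C).
constexpr double tmp37DegC(double volts) {
    return volts * 50.0;
}
```

For a sensor output of 0.75 V, the TMP36 math gives 25 degrees C while the TMP37 math gives 37.5 degrees C, so code written for one part reads well off when the other part is fitted.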

However, after disassembling the board and connecting only the GND, 5V, 3.3V, and A5 lines, the temperature sensor does NOT fluctuate, but is relatively steady. It reacts to an increase in temperature (generated by me placing my thumb on the sensor) with a direct increase in voltage representing about 5 degrees, as it should.

However, the simple act of unplugging and plugging the I2C connections into the disassembled board (that is, our PCB not mounted onto an R4 Minima, but placed beside it and individually wired) causes a change in the reading from approximately 163 to 173. Each digital tick is about 5 millivolts, so this represents about 50 millivolts, which on a TMP36 would correspond to a 5 degree C change in reported temperature. In the image below you can see the change in the "reading", which is a direct reading of the analog input pin (no math or conversion applied).

Screenshot 2024-08-07 at 10 02 05 AM
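
The arithmetic behind the 163 to 173 step can be sketched as follows. This assumes a 10-bit ADC with a 5.0 V reference, which is not stated in the thread but is consistent with the serial log later in this issue (reading 146 reported as 0.71 volts). The names are illustrative only.

```cpp
// Assumptions (not confirmed by the thread): 10-bit ADC, 5.0 V reference.
constexpr double kVref = 5.0;
constexpr double kAdcSteps = 1024.0;

// Convert a raw analogRead-style count to volts.
constexpr double countsToVolts(int counts) {
    return counts * kVref / kAdcSteps;
}

// TMP36 slope is 10 mV per degree C, i.e. 100 degrees C per volt,
// so a change in counts maps directly to a change in degrees C.
constexpr double deltaCountsToDegC(int fromCounts, int toCounts) {
    return (countsToVolts(toCounts) - countsToVolts(fromCounts)) * 100.0;
}
```

Under these assumptions one count is about 4.88 mV, so a 10-count jump is roughly 49 mV, or a little under 5 degrees C on a TMP36.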

I do not yet understand this, but it seems clear that the display (which is our I2C element) is drawing too much current, or producing an unexpectedly high capacitance on the line, or doing something else naughty at an electrical level. I will conti

RobertLRead commented 3 months ago

My Fluke 87 V does not observe a difference in the voltage level of the output of the TMP36 sensor as the SCL and SDA wires are plugged in and unplugged. This suggests to me that the voltage of the sensor is not changing, but somehow the analog-to-digital converter is failing, possibly due to a lowered voltage as the I2C instrument draws power.

RobertLRead commented 3 months ago

Note: This is just a silly note: when debugging with the PCB shield unstacked from the Arduino R4, A4 and A5 need to be connected, apparently because these are overloaded to be SCL/SDA.

RobertLRead commented 3 months ago

After connecting the A4 and A5 pins, D5, D6, and D7, and the SDA and SCL pins, as well as 5V and GND (with the shield not plugged into the system), incredibly the system now seems stable. When using a cell phone next to it I perceived some minor interference in the form of a temporarily lowered voltage value. I cannot explain why it is working better now than previously.

RobertLRead commented 3 months ago

After putting the shield back on, the erratic nature of the temperature readings has come back. In the output below, the raw readings oscillate between 140 and 181!!!

I have no explanation for this. Possibly the proximity of the shield to the Minima itself causes a problem. Possibly capacitance added by the wires (which I remove when I stack the shield) makes it behave better.

I am very confused by this.

Frequency Output: 165.00

reading: 146 volts: 0.71 21.29 degrees C filter: 21.60

Frequency Output: 170.00

reading: 146 volts: 0.71 21.29 degrees C filter: 21.59

Frequency Output: 170.00

reading: 142 volts: 0.69 19.34 degrees C filter: 21.55

Frequency Output: 169.00

reading: 140 volts: 0.68 18.36 degrees C filter: 21.50

Frequency Output: 169.00

ForrestErickson commented 3 months ago

Capture of test conditions

Please capture the exact serial number of the Moonrat II hardware under test. Please capture the exact firmware with which this test was made, for each test. Please make a photograph of the setup(s) so that you can reconstruct it (them). A photograph to capture what you mean by shield on and shield off would help.

Which Analog Input

Regarding, "However, after disassembling the board and connecting only the GND, 5V, 3.3V, and A5 lines, the temperature sensor does NOT fluctuate, " The A5 line is not the ADC for the TMP36 and so cannot work to measure temperature. Why is A5 involved?

RobertLRead commented 3 months ago

I should have said A0. The temperature is read from A0.

There is no serial number on my board. There is a black smudge of ink that appears to have once been a serial number; it comes off when I rub it with my thumb.

I am now adding a set of facts which may be unrelated but make no sense.

  1. When I plugged the shield in, the A0 voltage values were wildly erratic, differing by 40%.
  2. When I wire all of the wires (except AREF, 3.3V, IOREF and BOOT), the system appears to be stable and gives approximately the correct values.
  3. There is occasionally high-frequency noise on the A0 line, as shown by the image below. It occurs about once per second, which is the frequency with which we go through the loop and is likely the A0 read frequency. This is the cyan-colored trace below. (Note that the voltage scales of the traces differ; this is about 0.77 volts, which is approximately 27 degrees C, which is approximately the temperature of my office.)
     DS1Z_QuickPrint2

  4. Plugging capacitors of various sizes between ground and A0 does decrease the amplitude of this noise. However, I have no reason to believe this noise is affecting the temperature read.

The biggest unexplained problem right now is that when the shield is plugged in it is wildly erratic and yet when it is wired by hand it is stable. I have no explanation for this. I have moved the shield on top of the R4 (to the extent the wires allow me to) to see if physical proximity makes a difference and it does not appear to make a discernible difference.

RobertLRead commented 3 months ago

Here are two photos showing the shield off the R4, completely wired with jumper wires, which seems to have a stable read of the TMP36 voltage on A0.

IMG_5140 IMG_5139

RobertLRead commented 3 months ago

Note: When on a call with Lee Erickson using my iPhone while it was using Bluetooth hearing aids, it caused the A0 inputs to drop to nearly zero, as shown by the blue line here: DS1Z_QuickPrint3

When NOT using Bluetooth, but on the phone, it produced a different but similarly interfered trace:

DS1Z_QuickPrint4

Here is a baseline with no use of a cellphone, as you suggested, @ForrestErickson: DS1Z_QuickPrint6

ForrestErickson commented 3 months ago

@RobertLRead Regarding,

Note: When on a call with Lee Erickson using my iPhone while it was using Bluetooth hearing aids, it caused the A0 inputs to drop to nearly zero, as shown by the blue line here:

and

When NOT using Bluetooth, but on the phone, it produced a different but similarly interfered trace:

Please add a base line oscilloscope graph to the above, with NO cellphone and NO bluetooth connection for the same setup. THIS WAS DONE, see above.

ForrestErickson commented 3 months ago

Firmware During Robert's Test of 20240808

https://github.com/PubInv/moonrat/blob/main/moonratII/firmware_moonratII/production_R4/MoonTestV5/MoonTestV5.ino

ForrestErickson commented 3 months ago

@RobertLRead

If time is critical for your trip, why are you using the R4? Use an R3, for which, IIRC, @HJGV05 reported different and better behavior.

RobertLRead commented 3 months ago

@ForrestErickson I think it is worth trying quickly to see if it solves some of these problems. When I began I was using an R3 (though I think it was a knock-off imitation that was sent to me) and had the same problems. I am reluctant to leave this unsolved now that you have given me such good advice, but it may indeed be worth trying.

RobertLRead commented 3 months ago

Note: When I added a debugging statement which STOPPED writing to the OLED display during the loop, the obvious oscillation in voltage in both the 5V line and A0 line disappeared (see image below). The temperature became much steadier. I hypothesize that this could be either A) a sag in the voltage due to drawing too much current or B) some sort of EMI or crosstalk. I will now attempt to test those hypotheses.

DS1Z_QuickPrint7

RobertLRead commented 3 months ago

By calling "display.display" multiple times, we can definitely create significant noise on the A0 line:

} else {
  display.clearDisplay();
  display.display();
  display.clearDisplay();
  display.display();
  display.clearDisplay();
  display.display();
}

DS1Z_QuickPrint8

RobertLRead commented 3 months ago

I sent this email to the team:

Dear Team, I have been working through a number of problems with the help of Lee. I have created a new version, which addresses significant problems, and makes some decisions which are merely matters of taste. https://github.com/PubInv/moonrat/blob/main/moonratII/firmware_moonratII/production_R4/MoonRatRob6/MoonRatRob6.ino

I will describe this thoroughly in issues, the README, and internal documentation within the code.  But I wanted to explain this all in one place, although there are multiple overlapping problems.

  1. The code I got assumes we have a TMP37; we have a TMP36. I don't know how this could ever have worked, as the math for the computation of temperature is quite different.
  2. It is DEFINITELY the case that using the I2C bus to communicate to the OLED causes the A0 voltage to have noise. I believe (with high probability) that this can be eliminated by just not updating the OLED near the time of reading. Whether this is because of a voltage sag or electromagnetic interference, I don't know. I have changed the code to make it more stable; it would have looked to the team in Mexico like random noise. I have not completely solved this problem (I have currently "unstacked" our PCB assembly shield from the R4), but in this configuration it is better.
  3. It is DEFINITELY the case that a cell phone can cause DRASTIC changes to the A0 voltage when near the board. This is not necessarily a problem that prevents it being used, but is something we need to be aware of.
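
The idea in item 2 of keeping OLED updates away from the A0 read could be sketched as a simple guard window. This is a hypothetical helper, not code from MoonRatRob6: the function name, the parameters, and the assumption that a fixed guard time is long enough to cover one display transfer are all illustrative, and the actual transfer time has not been measured in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not from the Moonrat firmware.
// Returns true when a display update is far enough (guardMs) from both the
// previous A0 sample and the next scheduled one. Times are millis()-style.
bool displayUpdateAllowed(std::uint32_t nowMs,
                          std::uint32_t lastSampleMs,
                          std::uint32_t samplePeriodMs,
                          std::uint32_t guardMs) {
    // The two guard windows must leave some room inside one sample period.
    assert(2 * guardMs < samplePeriodMs);

    // Unsigned subtraction stays correct across millis() wraparound.
    std::uint32_t sinceSample = nowMs - lastSampleMs;

    // A sample is already due: read A0 first, draw later.
    if (sinceSample >= samplePeriodMs) {
        return false;
    }
    return sinceSample >= guardMs &&
           (samplePeriodMs - sinceSample) >= guardMs;
}
```

With a 1000 ms sample period and a 100 ms guard, updates would be allowed only between 100 ms and 900 ms after each read. Whether that separation is enough depends on how long a full display write takes on the bus, which would need to be measured.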

I have reorganized Horacio's code a bit, in ways that I think are simpler. I now must:

  1. Test with the system "stacked".
  2. Test with a 12V input
  3. Test actual heat production

My main goal now is to have a system to take to Tanzania. However, we have much more work to do before we can "release" this design. I don't believe we can invite people to use it until we have solved all of these EMI problems. That may require switching to a digital thermometer, which is what I used in the Moonrat which is in Ecuador. I did not know it would be such a problem here.

RobertLRead commented 3 months ago

When I stack the board on top of the R4, the A0 signal becomes ridiculously populated with high frequency noise. Although this could in theory be smoothed with capacitors, I do not find it hopeful; I would rather eliminate the noise than smooth it. I will compare this with an R3 after this. DS1Z_QuickPrint10 DS1Z_QuickPrint11

DS1Z_QuickPrint9

RobertLRead commented 3 months ago

Initial evaluation suggests the Arduino R3 does not suffer from the same instability in the voltage on A0.

ForrestErickson commented 3 months ago

@RobertLRead regarding, " Test actual heat production"

You will need to remove the 100 Ohm resistor in series with the heating pad in the unit. That was put there so that I could, in principle, test everything off of the USB supply at low current (low heating).
image

ForrestErickson commented 3 months ago

@RobertLRead Regarding, "When I stack the board on top of the R4, the A0 signal becomes ridiculously populated with high frequency noise." As time permits,

  1. Please repeat this test with only a BLINK sketch. NO setup of any pin except the built-in LED. Characterize the interference on A0.
  2. Then make a sketch which blinks the LED and reads A0 once a second, as the LED blinks. Characterize the interference on A0.
  3. Then make a sketch which blinks the LED and reads A0 and reports the value of A0 out the serial port once a second, as the LED blinks. Characterize the interference on A0.

Characterize with oscilloscope waveforms as above.
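
Alongside the oscilloscope captures, the serial-logged A0 counts from each test sketch could be reduced to a few comparable numbers. The following is a minimal sketch of such a summary. The struct and function names are hypothetical, and it is not a substitute for the scope: readings taken once a second cannot show high-frequency noise.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical summary of a batch of raw A0 readings (ADC counts).
struct NoiseSummary {
    int minCounts;
    int maxCounts;
    int peakToPeak;
    double mean;
    double stddev;  // population standard deviation, in counts
};

NoiseSummary summarize(const std::vector<int>& readings) {
    assert(!readings.empty());

    auto mm = std::minmax_element(readings.begin(), readings.end());
    int lo = *mm.first;
    int hi = *mm.second;

    double sum = 0.0;
    for (int r : readings) {
        sum += r;
    }
    double mean = sum / static_cast<double>(readings.size());

    double squares = 0.0;
    for (int r : readings) {
        squares += (r - mean) * (r - mean);
    }
    double stddev = std::sqrt(squares / static_cast<double>(readings.size()));

    return NoiseSummary{lo, hi, hi - lo, mean, stddev};
}
```

As an example, the four raw readings in the log earlier in this issue (146, 146, 142, 140) give a peak-to-peak spread of 6 counts around a mean of 143.5. Running the same summary for each of the three sketches would make the stacked and unstacked cases directly comparable.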