MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0
16.28k stars 19.24k forks source link

Pid stability ReArn #7585

Closed kfazz closed 7 years ago

kfazz commented 7 years ago

I'm seeing periodic oscillations and occasional low readings on Rearm, but only once I start a print. running the heaters and jogging the y axis also reproduces the issue, but not as bad. i've recalibrated my pid coefficients several times, and i get several sets of numbers that are fairly consistent. I'm having this issue both on my personal fork and latest bugfix-2.0.x commit 8419b0f22.. First pic shows extruder heating up, second shows probe and beginning of a 20mm cube print, third shows a 10 minute segment of the middle of the print. Heater is a 40W heater, thermistor is a E3d semitec cartridge thermistor. Machine is a MK2 clone.

Config.zip

heater_1 heaters_2 heaters_3

p3p commented 7 years ago

Wonder if its as simple as its trying to compensate for those bad temperature readings, seems to settle down after a while without one. As to why there is erroneous temperature readings .. we may need multi-sampling or sample rejection to compensate for the noise. Although over those timescales I'm not sure how likely that is.

kfazz commented 7 years ago

i believe there already is some sample averageing going on. the analog2temp* functions get called with 16 summed raw samples i think, and temp lookup tables are multiplied by 16. i'm currently playing around with a hacked up temperature.cpp using smoothieware's steinhart-hart calculation and all 12bits of the ADC resolution. https://pastebin.com/ciGAkWfS

kfazz commented 7 years ago

It appears that the bad values are almost always 4095, occasionally 4076 or similar.

alex-gh commented 7 years ago

Are you connecting the thermistors on the RAMPS, or directly to the Re-ARM?

alex-gh commented 7 years ago

It appears that the bad values are almost always 4095, occasionally 4076 or similar.

This is interesting. I just stumbled upon this errata sheet: http://www.nxp.com/docs/en/errata/ES_LPC176X.pdf

Quote from section 3.3:

Problem: Noise caused by I/O switching activity on pins close to the ADC input channels or caused by the board design/layout can couple into the ADC input channels. This causes the ADC conversion results to be corrupted up to 0xFFF. The issue occurs more frequently at -45 C and when toggling the I/O pins adjacent to the ADC input channels.

kfazz commented 7 years ago

My thermistors are connected through the RAMPS board. Do i have to mess with the solder jumpers on the ReArm to use the external headers? i'm rejecting all samples over 3950. getting better results, but still not perfect. https://github.com/kfazz/Marlin/tree/therm_hacks reject_pidtune reject_in_print

alex-gh commented 7 years ago

Yes, you would need to mess with the solder jumpers on re-arm to use it's thermistor headers.

Here is something else you could try if you have the time and the means to:

Try setting all unused ADC pins to digital outputs or ground them using a jumper. The chip has 8 ADC pins in total: P0.2 P0.3 P0.23 P0.24 P0.25 P0.26 P1.30 P1.31

The P0.23 P0.24 P0.25 are the three thermistor pins. Oh and if you decide to ground them make sure none of them are set as output.

Bob-the-Kuhn commented 7 years ago

I did some playing with the Re-ARM analog inputs a while back. The T0, T1 & T2 connectors are definitely the best behaved. Switching to the T0, T1 & T2 pins on the Re-ARM board helped but I don't remember how much.

It appeared to me that we were getting about 8 bits of useful resolution on those three pins on the RAMPS board. The other analog inputs were much worse (to be expected with no hardware filtering and no analog design considerations on the RAMPS board).

Yes, each analog reading consists of 16 samples which are added together and then compared to the thermistor table.

Try attaching a BIG cap to your analog input as close to the board as possible. The Re-ARM has 10uF caps on it. Try 100uF or larger caps.

Unused analog inputs are setup as general purpose I/O pins and, within the chip, are disconnected from the I/O pad.

JerryDurand commented 7 years ago

You might also want to add some ceramic capacitors across those 10uF caps. Somewhere in the 0.01 to 0.1uF range. The problem with large capacitors is internal resistance (ceramic is very low) and the dissipation factor (it's hard to get them to stay at a voltage they are charged to). I generally like to drive analog inputs directly from an op-amp, AD inputs generally have a lot of charge injection that can cause all sorts of grief.

Also, instead of averaging 16 samples, you might also look into delta-sigma filtering. It does require a divide but with careful selection of the fraction needed you can just use a shift.

On 08/30/2017 10:32 PM, Bob-the-Kuhn wrote:

I did some playing with the Re-ARM analog inputs a while back. The T0, T1 & T2 connectors are definitely the best behaved. Switching to the T0, T1 & T2 pins on the Re-ARM board helped but I don't remember how much.

It appeared to me that we were getting about 8 bits of useful resolution on those three pins on the RAMPS board. The other analog inputs were much worse (to be expected with no hardware filtering and no analog design considerations on the RAMPS board).

Yes, each analog reading consists of 16 samples which are added together and then compared to the thermistor table.

Try attaching a BIG cap to your analog input as close to the board as possible. The Re-ARM has 10uF caps on it. Try 100uF or larger caps.

Unused analog inputs are setup as general purpose I/O pins and, within the chip, are disconnected from the I/O pad.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/MarlinFirmware/Marlin/issues/7585#issuecomment-326194078, or mute the thread https://github.com/notifications/unsubscribe-auth/AHfZY2gCqktm4jmpfAiFXe3lc2PB1e-Jks5sdkVcgaJpZM4PHUic.

-- Jerry Durand, Durand Interstellar, Inc. www.interstellar.com tel: +1 408 356-3886 @DurandInterstel

alex-gh commented 7 years ago

I can't do my own tests right now, but this issue goes way beyond the usual noise problem. You don't just lose a few bits of resolution to noise, it causes complete corruption of the conversion result.

NXP wouldn't bother putting it in a errata sheet otherwise.

It is a well known issue with the LPC176X ADC and is mentioned a lot online.

Tying the unused pins to a known potential either physically, or by setting them as outputs might reduce the amount of interference that can couple into the ADC and cause issues.

But like I said, I have no means of testing it right now.

EDIT: Here is an example that shows how grounding the other ADC pins can improve performance.

Bob-the-Kuhn commented 7 years ago

@JerryDurand - if you can point me to an algorithm I'll see what I can do.

p3p commented 7 years ago

Cleaned up my readings considerably by adding a lowpass filter into the adc read code, it is a workaround, though I'm not sure we can clean up the readings without something similar.

HAL.cpp

uint8_t active_adc = 0;
void HAL_adc_start_conversion(uint8_t adc_pin) {
  if(  (adc_pin >= NUM_ANALOG_INPUTS) || (adc_pin_map[adc_pin].port == 0xFF) ) {
    usb_serial.printf("HAL: HAL_adc_start_conversion: no pinmap for %d\n",adc_pin);
    return;
  }
  active_adc = adc_pin;
  LPC_ADC->ADCR &= ~0xFF;                                // Reset
  LPC_ADC->ADCR |= ( 0x01 << adc_pin_map[adc_pin].adc ); // Select Channel
  LPC_ADC->ADCR |= ( 0x01 << 24 );                       // start conversion
}

bool HAL_adc_finished(void) {
  return LPC_ADC->ADGDR & ADC_DONE;
}

uint16_t HAL_adc_lowpass_filter(uint16_t value) {
  const uint8_t k_data_shift = 6;
  static uint32_t data_delay[NUM_ANALOG_INPUTS] = {0};
  uint32_t &active_filter = data_delay[active_adc];
  active_filter = active_filter - (active_filter >> k_data_shift) + value;
  return active_filter >> k_data_shift;
}

uint16_t HAL_adc_get_result(void) {
  uint32_t data = LPC_ADC->ADGDR;
  LPC_ADC->ADCR &= ~(1 << 24); //stop conversion
  if ( data & ADC_OVERRUN ) return 0;
  return (((HAL_adc_lowpass_filter(data) >> 6) & 0x3ff)); //10bit
}
kfazz commented 7 years ago

P0.2 RXD0 Debug Uart Input P0.3 TXD0 Debug Uart P0.23 TEMP0 P0.24 TEMP_BED P0.25 TEMP1 P0.26 pin63 unused by default input P1.30 Beeper pin P1.31 SD_Detect *input i'll try setting these inputs as low outputs and see if it has any effect. so far i tried twisting my theristor wires and routing them away from my PSU and it made no difference. if i remember to pick up a fine point soldering iron later i'll try using t0 and t1 instead of the ramps plugs. What about a software algorithm that takes the derivative of the temperature into account? a sample could be rejected if it represents an impossibly fast cooldown, with a counter to preserve mintemp protection.

p3p commented 7 years ago

The lowpass filter will work well for that, or any high frequency change. I'm testing with a median filter as well to throw away obviously bad values.

p3p commented 7 years ago

Well this is my results through software filtering, not sure if we want to use them but it's definitely an improvement. First image is lowpass k=4 , then the lowpass filter at k=8 no_filter_lowpass

This image uses a lowpass k = 4, and a 3 value median filter. (about 1 degree deviation in the hotend) lowpass_median

changes to example code above:

struct MedianFilter {
  uint16_t values[3];
  uint8_t next_val;
  MedianFilter() {
    next_val = 0;
    values[0] = values[1] = values[2] = 0;
  }
  uint16_t update(uint16_t value) {
    values[next_val++] = value;
    next_val = next_val % 3;
    return max(min(values[0], values[1]), min(max(values[0], values[1]), values[2]));
  }
};

uint16_t HAL_adc_lowpass_filter(uint16_t value) {
  const uint8_t k_data_shift = 4;
  static uint32_t data_delay[NUM_ANALOG_INPUTS] = {0};
  static MedianFilter median_filter[NUM_ANALOG_INPUTS];
  value = median_filter[active_adc].update(value);
  uint32_t &active_filter = data_delay[active_adc];
  active_filter = active_filter - (active_filter >> k_data_shift) + value;
  return active_filter >> k_data_shift;
}
kfazz commented 7 years ago

Using sample rejection and setting as output low the ununsed inputs really cleaned up my readings for the bed, they're almost perfect.... until i start printing. i've noticed the periodic oscillations seem to be caused my voltage droop from the bed being turned off and on. I'm taking a shotgun approach and trying everything at once. So far: reject adc readings over 3950 ferrite choke on usb wire. ununsed ADC pins set to output low twisted thermistor wires and routed away from psu wires. Raising PWM freq by messing with SOFT_PWM_SCALE (this seems to help a lot). trying different ADC sampling frequencies.

Just going visually here, it looks like the PWM output isn't recalculated often enough once the print begins, and it develops a mostly on - off pattern, which causes voltage swings on my marginal converted ATX power supply, that screw up the temp readings. shotgun_approach more_failure

still to try: disable PROBING_HEATERS_OFF (seems to destabilize the PID loop) try Rearm T* pins see if t2 is less noisy than t0 try a different power supply change the cartridge thermistor in my e3dv6 (last resort) averaging filters

p3p commented 7 years ago

@kfazz have you tried adding my software filter to see if it helps?

kfazz commented 7 years ago

@p3p not yet. i'll give it a try tomorrow. Do you get similar results once you begin a print? i don't seem to get any erroneous ADC readings until the steppers are enabled.

p3p commented 7 years ago

The graphs I posted where during a test sequence which maximises stepper usage (pretty much just max moves on all axes over varying distances as fast as possible)

kfazz commented 7 years ago

Looks like the main cause of the issue was my ATX PSU. I didn't have a 5v load. In this pic i've switched it out for an Ablecom PWS-1k01-1R Server supply. Much better. better

p3p commented 7 years ago

That explains the voltage issue, you still have some ADC issues though badly effecting the PID loop, @Bob-the-Kuhn do you think adding the software filter would a good plan? may even make the other ADCs usable to some extent.

kfazz commented 7 years ago

This is using T0 and T1 without median filter, and all my other hacks. t0_before_median

+uint16_t HAL_adc_lowpass_filter(uint16_t value) {
+  const uint8_t k_data_shift = 4;
+  static uint32_t data_delay[NUM_ANALOG_INPUTS] = {0};
+  static MedianFilter median_filter[NUM_ANALOG_INPUTS];
+  value = median_filter[active_adc].update(value);
+  uint32_t &active_filter = data_delay[active_adc];
+  active_filter = active_filter - (active_filter >> k_data_shift) + value;
+  return active_filter >> k_data_shift;
+}
+
+uint16_t HAL_adc_get_result(void) {
+  uint32_t data = LPC_ADC->ADGDR;
+  LPC_ADC->ADCR &= ~(1 << 24); //stop conversion
+  if ( data & ADC_OVERRUN ) return 0;
+  if (data==4095) return 4095;
+  return (((HAL_adc_lowpass_filter(data) >> 4) & 0xfff)); //10bit
+}

problemsolved Looks like the median filter works very well.

Bergerac56 commented 7 years ago

@p3p I am also struggeling with the noise on T0 T1. I tried a lot of things to solve it.

Among others, what I tried:

Then I combined the "therm_hack" of kfazz (If you could point me to a more evolved version of the hack I could insert in the last bugfix, it would be greath) and the inductors. This improves really the things to a near perfect temp curve.

Bergerac56 commented 7 years ago

Here a picture showing the difference. Only a self-induction of 1.8mH. Left part is without the self and all steppers active. Right side is the same configuration but with the self. No software hack. Last bugfix firmware. temp curve

This is a test with the self and the software hack: temp curve 2

And the self. (I had only large-one as spare) self

p3p commented 7 years ago

What is the result like with just the software filter?, I don't need any extra hardware on mine unless I use the unfiltered ADC ports.

Bergerac56 commented 7 years ago

This was done with only the hack mentionned in the kfazz post above. temp curve 3

Do you have a more elaborated version of the hack I could try?

p3p commented 7 years ago

I just use what I wrote above (the code kfazz posted is incomplete but to run it you must have used my code from higher up the thread), I intend to add (a version of) it in as default. You shouldn't be able to get those spikes if the filter is in use, the median filter should remove any single read error, whilst the lowpass will smooth out any high frequency change. The k_data_shift variable controls how smooth, although increasing it causes a delay in response (~ 2^k ADC readings to converge).

HAL.cpp

uint8_t active_adc = 0;
void HAL_adc_start_conversion(uint8_t adc_pin) {
  if(  (adc_pin >= NUM_ANALOG_INPUTS) || (adc_pin_map[adc_pin].port == 0xFF) ) {
    usb_serial.printf("HAL: HAL_adc_start_conversion: no pinmap for %d\n",adc_pin);
    return;
  }
  active_adc = adc_pin;
  LPC_ADC->ADCR &= ~0xFF;                                // Reset
  LPC_ADC->ADCR |= ( 0x01 << adc_pin_map[adc_pin].adc ); // Select Channel
  LPC_ADC->ADCR |= ( 0x01 << 24 );                       // start conversion
}

bool HAL_adc_finished(void) {
  return LPC_ADC->ADGDR & ADC_DONE;
}

struct MedianFilter {
  uint16_t values[3];
  uint8_t next_val;
  MedianFilter() {
    next_val = 0;
    values[0] = values[1] = values[2] = 0;
  }
  uint16_t update(uint16_t value) {
    values[next_val++] = value;
    next_val = next_val % 3;
    return max(min(values[0], values[1]), min(max(values[0], values[1]), values[2]));
  }
};

uint16_t HAL_adc_lowpass_filter(uint16_t value) {
  const uint8_t k_data_shift = 4;
  static uint32_t data_delay[NUM_ANALOG_INPUTS] = {0};
  static MedianFilter median_filter[NUM_ANALOG_INPUTS];
  value = median_filter[active_adc].update(value);
  uint32_t &active_filter = data_delay[active_adc];
  active_filter = active_filter - (active_filter >> k_data_shift) + value;
  return active_filter >> k_data_shift;
}

uint16_t HAL_adc_get_result(void) {
  uint32_t data = LPC_ADC->ADGDR;
  LPC_ADC->ADCR &= ~(1 << 24); //stop conversion
  if ( data & ADC_OVERRUN ) return 0;
  return (((HAL_adc_lowpass_filter(data) >> 6) & 0x3ff)); //10bit
}
p3p commented 7 years ago

I missed a bit from the code listing when copying it down the thread, updated now.

alex-gh commented 7 years ago

@Bergerac56 Just to clarify, is the inductor 1.8 millihenry or 1.8 microhenry?

Bergerac56 commented 7 years ago

I just rechecked with my LCR meter (spare parts, no indications). It is well 1.8 millihenry.

alex-gh commented 7 years ago

Got it. I was thinking about putting a tiny SMD inductor onto the solder jumper on the Re-ARM, but the largest value I was able to find in a 0603 package was 100 microhenry.

Might still be enough.

p3p commented 7 years ago

@alex-gh @Bergerac56 Can you confirm that the code above isn't enough to clear up the adc readings completely for you?

alex-gh commented 7 years ago

@p3p Unfotunately I can't test it yet. I have the board here but nothing to connect to it. I was going to try and get a debug connector soldered onto it first.

Bergerac56 commented 7 years ago

@p3p I redid some tests for you. To be honnest, I made a very dirty hardware configuration with all wires mixed up to maximize the noise. Don't laugh ;) test hardware

Here are 3 runs: 1> Last bugfix no software changes and no inductor. 2> Modified HAL.cpp with your code and 3> Your code + inductor

temp curve 4

So, I believe that the good solution is your code + inductors (each thermistor has its inductor. Do not forget to have the 2 wires in the inductor-> see picture above.)

I will do a last test with basic firmware and inductor.

p3p commented 7 years ago

ahh ok you're testing extremes, you might want to put a large value for k (8?) to simulate a larger capacitor, although the median filter can't compensate if 2/3 of data is invalid. I wonder how well the atmega ADCs would work with that setup.

Bergerac56 commented 7 years ago

Here it is: plain vanilla last bugfix 2.0.x with and without inductor. Self explanatory, at leat for me... .Could be cool if somebody could reproduce this to be sure. (test conditions: RE-ARM => T0-T1-T2 on RE-ARM / messy hardware setup / all stepper motors actives (G1 X200 Y200 Z100 F30)) temp curve 5

Bergerac56 commented 7 years ago

@p3p It is late today. But, just for the fun, I will try this for you tomorrow (with an atmega) :)

Bergerac56 commented 7 years ago

@p3p I could not resist: Here the test with an atmega. Plain vanillia last bugfix compiled for arduino. No inductor... Perfect curve. Or the design of the re-arm is really bad or the processor has a problem with analog inputs or both...

temp curve 6

p3p commented 7 years ago

I expected it to be much better but wasn't expecting perfect.. they do have more robust ADCs and it runs at 5V which would help with noise a little .. ah well, in normal use I'm getting no noise at all on my Re-ARM as long as I use the software filter, I may have another look at how we initialise and use the ADCs but I don't think there is anything more we can do in software ..

alex-gh commented 7 years ago

@Bergerac56 Pretty sure the processor is to blame. Take a look at this errata sheet, section 3.3.

It appears to be highly sensitive to high frequency noise, with noise not just causing imprecision, but complete corruption of the measurement. This behavior is well documented on the forums that deal with LPC17XX/MBED.

p3p commented 7 years ago

@alex-gh yea that's what the median filter is cleaning up, but that falls apart if it has consecutive errors.

alex-gh commented 7 years ago

I'm a big proponent of fixing hardware bugs in hardware, thus my idea with soldering tiny inductors onto the Re-ARM jumpers, but I realize this is not a good solution for the majority of the userbase.

So I guess software filtering is the next best option.

Grogyan commented 7 years ago

I believe that we will be finding more "bugs" hat are related to hardware noise. due to the lower DC voltage from 5 to 3v3 which don't have shift level converters. Am also wondering if the issues we had/have with missing last line +1, and LCD issues, can be mitigated if boards come with chokes.

Such on-board chokes are quite normal practice in the electronics industry.

Bergerac56 commented 7 years ago

@p3p I agree that solving hardware with hardware is the best way to go. The inductors would have ideally to be foreseen on a new version of RAMPS card. In the meantime, your piece of code is very valuable and would have to be quickly added to one of the next bugfixes. Inductors + this code give a perfect curve (I made a test for more than 2 hours without any spike)

I would make the filtering code "optional" in configuration.h. (activation and k-factor). So people can choose. This is the way I implemented your solution in my version of bugfix (mainly to facilitate the tests and multiple compilations of yesterday ;) ).

I added in configuration.h: // @section extras // // Filtering of T0-T1-T2 to improve temp readings when using a Re-ARM // Not needed for atmega //

define TEMP_FILTERING_ACTIVE // Uncomment to activate the software filtering

if ENABLED(TEMP_FILTERING_ACTIVE)

define k_Factor 4 // k_Factor simulates the size of the filtering capacity.

endif

//

// Increase the FAN PWM frequency. Removes the PWM noise but increases heating in the FET/Arduino //#define FAST_PWM_FAN

And here is the file hal.cpp HAL.zip

I hope I did not make mistakes ;)

github-actions[bot] commented 3 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.