MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0

Trouble report from Antclabs (The BL-Touch people) #6569

Closed Roxy-3D closed 6 years ago

Roxy-3D commented 7 years ago

We should probably do a quick check of M48, M401 and M402. Antclabs downloaded the current software because they wanted to check out the heaters being turned off during probing. And they ran into trouble before they even got that far.

Grogyan commented 7 years ago

Can you please elaborate? What trouble are they having?

Roxy-3D commented 7 years ago

I got an email from them saying things were not working correctly for them. They may have caught a point in time when things were broken. I'm not sure. I guess we will know more very soon.

bgort commented 7 years ago

FYI, the new BLTouch you sent (thank you!) is giving me 'worse' repeatability than the 'damaged' one. I have another one coming this morning (3-4 hours, probably), so will know if there's something wrong in Marlin, or if it's just a fluke 'bad' one.

Will check my Z with a micron dial indicator shortly, too, but don't believe there's an issue there.

Roxy-3D commented 7 years ago

Wow! OK! That is good to know! It will be interesting to see what the next one says!

bgort commented 7 years ago

No meaningful error in the Z axis.. ~1um difference from starting zero after 30 vertical, back-and-forth travels.

This is strange: I'm not having any trouble this morning with the BLTouch you sent when the bed is stone cold - mostly 2-5um stddev with <10um range. As the bed heats up, however, M48 stddev increases up to what I was seeing yesterday -- 20-30um; this is with BLTOUCH_HEATERS_OFF enabled. I can't imagine the bed is changing shape that rapidly, so not sure what's going on yet. Watching the dial indicator I can see the significantly increased error where it's starting/stopping the M48 up-down probing motion, so assuming the axis is actually stopping when it sees the endstop, this is confirming what Marlin is reporting.

Will try disabling endstop interrupts to see if that yields different results.

Roxy-3D commented 7 years ago

> I can see the significantly increased error where it's starting/stopping the M48 up-down probing motion, so assuming the axis is actually stopping when it sees the endstop, this is confirming what Marlin is reporting.

In other words... Are you thinking Marlin is stopping the stepper motors too fast? I'm not understanding what you are trying to communicate.

bgort commented 7 years ago

With that, I was just saying that I'm seeing the same increased error on the dial indicator, assuming Marlin is stopping the steppers immediately when it senses the endstop trigger.

In other words, when the bed is cold and everything is working as expected (2-5um repeatability), the axis stops at roughly the same top and bottom points according to the dial indicator. When I'm seeing increased error, Marlin is stopping the axis higher or lower according to the dial indicator.

Roxy-3D commented 7 years ago

The Z-Axis travels further when the machine warms up? But even if that was true... shouldn't the sampled point already be locked in and its position known?

bgort commented 7 years ago

Yes. There's more variability on both the upward and downward moves when it's warm (warming?).

> But even if that was true... shouldn't the sampled point already be locked in and its position known?

Yes, unless it's detecting arrival at the endstop late, so it travels further before stopping?

Right now, I'm wondering if it's just that extra cycles are being taken up by something heater-related and that's interfering with the endstop interrupt?

Trying to sort out the problem in #6571, and will then come back to this.

Roxy-3D commented 7 years ago

@bgort Can you turn EndStop Interrupts on for your hardware? That might provide a valuable clue.

bgort commented 7 years ago

It's already on. I'm going to turn it -off- and try that once I've resolved whatever is going on in #6571.

bgort commented 7 years ago

Earlier I got this and numerous results just like it, multiple times, with endstop interrupts disabled and the bed off and cold:

Range: 0.006 Standard Deviation: 0.001972

On that run, I was watching the dial indicator and the axis stopped at nearly the exact same position up and down, every time (+/- ~1-2um, as implied by the stddev).

And now, after letting the bed preheat for ~30 minutes, endstop interrupts disabled, BLTOUCH_HEATERS_OFF enabled, I'm getting this (and slightly better and slightly worse):

Range: 0.056 Standard Deviation: 0.018168

Here's a video showing this 18um stddev M48: https://youtu.be/0JVLKOxblUo
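For readers following along, the Range and Standard Deviation lines that M48 prints can be reproduced from the individual probe heights. The sketch below is a minimal illustration in Python, not Marlin's code: `m48_stats` is a hypothetical helper, the sample heights are made-up values, and it assumes the population formula (divide by n) for the standard deviation.

```python
from statistics import fmean, pstdev


def m48_stats(samples_mm):
    """Return (mean, range, population std dev) for probe heights in mm."""
    if not samples_mm:
        raise ValueError("need at least one probe sample")
    mean = fmean(samples_mm)
    spread = max(samples_mm) - min(samples_mm)
    # Assumption: population standard deviation (divide by n, not n - 1).
    sigma = pstdev(samples_mm)
    return mean, spread, sigma


# Hypothetical probe heights in mm (illustrative only, not from this thread).
samples = [1.000, 1.002, 1.004, 1.006]
mean, spread, sigma = m48_stats(samples)
print(f"Mean: {mean:.6f} Range: {spread:.3f} Standard Deviation: {sigma:.6f}")
```

Feeding in the per-probe heights from a cold run and a warm run makes it easy to see whether a bad result comes from one or two outliers or from a steady drift.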

Positions of each successive probing motion:

So you can see that the first 5 moves are basically perfect, and then some progressive-error-inducing issue begins? Given that, and the numerous nearly-perfect-when-ice-cold M48s, the probe itself is clearly good, and capable of good accuracy. So does this mean the problem is with Marlin? Or is it stray EMF? Obviously if the ending position is w/in 1-2um of starting, there's no major mechanical issue in the axis itself.

Another question: Why is a 10 probe M48 doing 11 probes? I haven't looked at the M48 code recently, but...

Roxy-3D commented 7 years ago

There is another possibility... Is it possible the BL-Touch is heat sensitive? What happens if you start with a cold machine (where you get the best repeatability).... But you take a hair dryer to the BL-Touch? I know it is a stretch... But something simple is causing it to not repeat.

> Another question: Why is a 10 probe M48 doing 11 probes? I haven't looked at the M48 code recently, but...

M48 needs to find the bed. Then it does the 10 probes... Yes... that should be changed!

bgort commented 7 years ago

If it were a temperature issue I'd think it would have affected the first 5 probes, too, but I'll happily try the hair dryer idea. Letting it cool off now. (Fun: blowing on the surface of the hot bed changes the shape by 4-5um, then it returns to where it was when the temp stabilizes after a few minutes.)

Maybe I'll forward this issue to Paris and see what she thinks; or is she already aware of it? Has she mentioned anything about the problem they had yet?

I'll take a look at M48 here shortly.

Roxy-3D commented 7 years ago

> Maybe I'll forward this issue to Paris and see what she thinks; or is she already aware of it? Has she mentioned anything about the problem they had yet?

I have to run out the door.... But when I get back I'll try to remember to send you an update via email.

bgort commented 7 years ago

Great!

lrpirlet commented 7 years ago

@bgort

> So you can see that the first 5 moves are basically perfect, and then some progressive-error-inducing issue begins?

FYI, I have a mechanical switch (just a SMD switch) mounted on a servo as a probe... This probe is inserted between the bed and the tip... Because the heat would destroy the switch, I always measured with a cold tip, with a cold bed...

I have seen that progressive error from day one when using M48... (note that I used it with 12 iterations so that the sample size meets the stats requirement)... I attributed this behaviour to mechanical imperfection in the Z axis... In other words, the end position of a particular sample is NOT independent of the previous one (with my setup, a Prusa i3)... BUT using a matrix of 12 sets of measurements, each made of 12 probes, I could observe the behaviour to be consistent... a bit of oil on the Z screw seems to make a difference, but I got annoyed at fixing the printer rather than printing with it and I stopped investigating... (the next step would have been to introduce a random Z move between each probe)

I concluded that to take several measures and use the average would NOT bring any improvement over the measure... I also concluded that m48 result should be taken with quite a bit of salt... (if it prints ok, do not try to repair it)

Roxy-3D commented 7 years ago

> I also concluded that m48 result should be taken with quite a bit of salt... (if it prints ok, do not try to repair it)

It kind of feels to me like the probe is reporting the wrong numbers. Unless the Z axis is ending up in the wrong location. Right now... Any speculation about possible causes is helpful.

If I had to bet... M48 is doing the right thing with the data it gets. I would say the problem is at a lower level.

I just had an idea. What if we tape a piece of tape around the Z-Axis threads so we can see exactly where it is positioned. I wonder if we see the first 5 points of the M48 always return to exactly the same place. But then when the bad numbers start happening, the tape is pointing off in another direction????

I have a hunch putting a piece of tape on those threads would tell us a lot.....

bgort commented 7 years ago

> I concluded that to take several measures and use the average would NOT bring any improvement over the measure... I also concluded that m48 result should be taken with quite a bit of salt... (if it prints ok, do not try to repair it)

I appreciate the information, and generally agree with your approach (if it isn't broken, no need to fix it), but in this case I'm occasionally seeing quite significant (sometimes 30+um in either direction) differences in nozzle height after G28 that shouldn't be present (due to whatever this error is, I suspect), so the problem isn't just with M48. I'm trying to get to the root of whatever is going on.

bgort commented 7 years ago

> It kind of feels to me like the probe is reporting the wrong numbers. Unless the Z axis is ending up in the wrong location. Right now... Any speculation about possible causes is helpful.

Which probe do you mean here? The BLTouch or the micron dial indicator?

> I just had an idea. What if we tape a piece of tape around the Z-Axis threads so we can see exactly where it is positioned. I wonder if we see the first 5 points of the M48 always return to exactly the same place. But then when the bad numbers start happening, the tape is pointing off in another direction???? I have a hunch putting a piece of tape on those threads would tell us a lot.....

The dial indicator is giving us this information with accuracy and precision, no? Or are you saying it isn't accurate? If it is accurate (and I believe it is), tape isn't going to give us anything we don't already have: the axis is stopping at different z points during the probe moves.

bgort commented 7 years ago

Just now - Heaters off, bed ice cold, UBL deactivated: M48 - Range: 0.007 Standard Deviation: 0.002850 (this is confirmed by my eyeballing)

Then zeroing my dial indicator at Z6 and sequentially moving to the below Z points (@ F100):

And then back to Z6 takes me to -0.002, so ~2um deviation from the original zero (which is fine as far as repeatability goes). Note that the ~30um error starts between Z6 and Z5, and then propagates through the travels to the other points, which all seem to stay pretty close to 1mm apart, as they should. Z7 gives me -1.010, so there's a little error in the other direction.
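The comparison described above (commanded Z versus dial-indicator reading) can be written down as a small calculation. This is only a sketch: `indicator_errors` is a hypothetical helper, and it assumes the indicator was zeroed at Z6 and that its reading decreases as Z increases, which is inferred from the Z7 reading of -1.010. Only the two readings quoted in the text are used; the other points were not preserved here.

```python
def indicator_errors(zero_z_mm, readings):
    """Return (z, error_mm) pairs for dial-indicator readings.

    Assumes the indicator was zeroed at zero_z_mm and that its reading
    decreases as Z increases (inferred from the Z7 reading of -1.010).
    """
    errors = []
    for z_mm, reading_mm in readings:
        # Reading we would expect if the axis moved exactly as commanded.
        expected_mm = -(z_mm - zero_z_mm)
        errors.append((z_mm, reading_mm - expected_mm))
    return errors


# Only the two readings quoted in the comment above; add your own points.
readings = [(7.0, -1.010), (6.0, -0.002)]
for z_mm, err_mm in indicator_errors(6.0, readings):
    print(f"Z{z_mm:g}: error {err_mm * 1000:+.0f} um")
```

Listing the error per commanded position this way shows whether a deviation appears at one spot on the leadscrew and carries through, or whether it is scattered randomly.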

So.. I think I may have a problem in my leadscrews or nuts (around the Z6 area), because I can't imagine Marlin would step the same distances differently; UBL is deactivated, so there should be no fading...?

It doesn't seem likely I'm missing steps, because that would be random-ish and I more-or-less return to 0 at Z6.

Ideas / thoughts?

(EDIT: You know, it's entirely possible it is Marlin, actually, as I did this test a while back [~6 months ago?] and I don't recall seeing errors like this. Suppose I can flash to an older version to see what I see. Or maybe the screws have worn? Can anyone else try this and see what happens for them?)

Roxy-3D commented 7 years ago

> (EDIT: You know, it's entirely possible it is Marlin, actually, as I did this test a while back [~6 months ago?] and I don't recall seeing errors like this. Suppose I can flash to an older version to see what I see. Or maybe the screws have worn? Can anyone else try this and see what happens for them?)

Is it only with UBL turned on we see this? If no bed leveling is turned on, does it still happen? M48 and G28 use the standard probing calls. So I kind of doubt it has anything to do with UBL.

> The dial indicator is giving us this information with accuracy and precision, no? Or are you saying it isn't accurate? If it is accurate (and I believe it is), tape isn't going to give us anything we don't already have: the axis is stopping at different z points during the probe moves.

Yes. I agree. I don't have a dial indicator or any way to get one on the machine. For me... Maybe the piece of tape idea could work well enough that I can help.

bgort commented 7 years ago

> Is it only with UBL turned on we see this? If no bed leveling is turned on, does it still happen? M48 and G28 use the standard probing calls. So I kind of doubt it has anything to do with UBL.

UBL is deactivated for all of this, and yeah it doesn't seem likely, but was a consideration.

> Yes. I agree. I don't have a dial indicator or any way to get one on the machine. For me... Maybe the piece of tape idea could work well enough that I can help.

Yeah, tape could give a good idea where the axis is stopping, though we're talking about a few microns difference so you'd likely need a way to reliably measure a pretty small angle (unless your lead&starts -> steps-per-mm is relatively high).
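To put a number on how small that angle is: one full turn of the leadscrew moves the axis by one lead, so the rotation for a given Z travel is 360 × travel / lead. The sketch below is illustrative only; `rotation_for_travel_deg` is a hypothetical helper and the 2 mm and 8 mm leads are assumed example values, not the leads of either printer in this thread.

```python
def rotation_for_travel_deg(travel_mm, lead_mm):
    """Degrees of leadscrew rotation needed for a given Z travel."""
    if lead_mm <= 0:
        raise ValueError("lead must be positive")
    # One full turn (360 degrees) advances the nut by exactly one lead.
    return 360.0 * travel_mm / lead_mm


# Assumed example leads in mm per turn; substitute your own screw's lead.
for lead_mm in (2.0, 8.0):
    angle = rotation_for_travel_deg(0.030, lead_mm)
    print(f"lead {lead_mm:g} mm: 30 um of travel = {angle:.2f} degrees")
```

A coarser lead gives a smaller angle for the same error, so a tape flag shows micron-level differences poorly but would still show a larger jump, such as lost steps, clearly.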

Roxy-3D commented 7 years ago

The Z-Threads are very 'not' fine pitch on the machine with a BL-Touch probe. So it is the opposite of what you say I need. BUT... Maybe that is actually a good thing? If Marlin is somehow losing steps or losing track of where it saw an endstop... It would be easier to see the tape change where it is pointing.

I'll get setup and see if I can see the failure...

bgort commented 7 years ago

Ah, thanks -- let me know what you see. Will be trying an older version of Marlin here shortly.

bgort commented 7 years ago

I just tried another M48 after letting the printer sit for a while, and got this:

Send: M48
Recv: M48 Z-Probe Repeatability Test
Recv: Error:STOP called because of BLTouch error - restart with M999

then M48 continued on, and then octoprint disconnected...

Will go looking for this in a bit, after I get my semi-permanent ICSP and JTAG ribbon cables installed.

bgort commented 7 years ago

And hmm, I just tried M999, but it didn't reset anything as far as I can tell (even my stored feedrate remained):

Send: M999
Recv: Resend: 1
Recv: ok
Recv: ok
Send: G1 Z10 <<moved at F100, which is not the default feedrate>>

bgort commented 7 years ago

I found one of the sources of my problem. Watch this:

https://www.youtube.com/watch?v=ncPK_ar_gnU (be sure CC is on -- I added subtitles/captions) (.. and ignore the cracks in the face of my dial indicator..I dropped something on it..annoying. Still works fine, but can't be repaired.)

So at the start, the bed is stable at ~45C (my PLA print temperature) and has been for ~20 minutes. What I found is that in between the heater on/off cycles, the bed shape is changing relatively dramatically - up to ~35um.... certainly enough to account for the issue I'm seeing with repeatability.

I'm going to try bed PWM and see if that resolves part of the problem.

bgort commented 7 years ago

So far, using PID/PWM mode for the bed heater is yielding a relatively solid (+/- ~2um) bed shape, generally. Obviously I should have been using that since the beginning, but didn't think it was necessary so never bothered...

Latest bugfix-1.1.x

PROBING_HEATERS_OFF disabled:
Recv: Standard Deviation: 0.003931
Recv: Standard Deviation: 0.004116
Recv: Standard Deviation: 0.010869
Recv: Standard Deviation: 0.022630
Recv: Standard Deviation: 0.009321
Recv: Standard Deviation: 0.010533
Recv: Standard Deviation: 0.013588
Recv: Standard Deviation: 0.010840

PROBING_HEATERS_OFF enabled:
Recv: Standard Deviation: 0.010701
Recv: Standard Deviation: 0.012945
Recv: Standard Deviation: 0.016597
Recv: Standard Deviation: 0.019673
Recv: Standard Deviation: 0.012543
Recv: Standard Deviation: 0.010251
Recv: Standard Deviation: 0.011454
Recv: Standard Deviation: 0.007762

Slightly better with PROBING_HEATERS_OFF disabled, though I think most of the difference is just due to (now minor) heating-cooling-related fluctuations in bed shape.
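The two sets of M48 results above can be compared by pulling the numbers out of the terminal log and averaging them. This is a minimal sketch, assuming the log lines look exactly like the `Recv: Standard Deviation:` lines quoted here; `stddevs_from_log` is a hypothetical helper name.

```python
import re
from statistics import fmean

HEATERS_OFF_DISABLED = """
Recv: Standard Deviation: 0.003931
Recv: Standard Deviation: 0.004116
Recv: Standard Deviation: 0.010869
Recv: Standard Deviation: 0.022630
Recv: Standard Deviation: 0.009321
Recv: Standard Deviation: 0.010533
Recv: Standard Deviation: 0.013588
Recv: Standard Deviation: 0.010840
"""

HEATERS_OFF_ENABLED = """
Recv: Standard Deviation: 0.010701
Recv: Standard Deviation: 0.012945
Recv: Standard Deviation: 0.016597
Recv: Standard Deviation: 0.019673
Recv: Standard Deviation: 0.012543
Recv: Standard Deviation: 0.010251
Recv: Standard Deviation: 0.011454
Recv: Standard Deviation: 0.007762
"""


def stddevs_from_log(log_text):
    """Extract every 'Standard Deviation' value (mm) from terminal output."""
    return [float(v) for v in re.findall(r"Standard Deviation:\s*([0-9.]+)", log_text)]


for label, log in (("disabled", HEATERS_OFF_DISABLED), ("enabled", HEATERS_OFF_ENABLED)):
    values = stddevs_from_log(log)
    print(f"PROBING_HEATERS_OFF {label}: mean stddev {fmean(values) * 1000:.1f} um over {len(values)} runs")
```

On these numbers the averages come out at roughly 10.7 um (disabled) versus 12.7 um (enabled), a small gap given the run-to-run scatter in each set.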

I still think there's another issue somewhere, but this is close enough now for my purposes - mostly.

dasflux commented 6 years ago

Wow, this sounds like what was going on with my Classic, using that firmware. I'm using a Tarantula with an MKS Base 1.4 (I'm supplying 5.05 V via a regulator, so you know it's not undervoltage). Did you get it figured out? G29 would throw mine off sometimes, on a cold machine. I ordered a Smart. It is doing it too, but on a much smaller scale. I would say my offset is changing by .02-.03. I can't pin down when it will happen, but it seems to be after the machine has been off a while (and yes, I'm doing M500 of course). I know this is an old thread. Did you guys figure it out?

Roxy-3D commented 6 years ago

One thing that changed is we added the option to turn off the heaters and fans during the actual probe event. Holding the voltage to the BL-Touch more stable and eliminating the high switching current on the wires near the BL-Touch wires helps some people. For other people, there is no noticeable improvement.

You should try turning off the electrically noisy stuff during the probe. It might clean up your problem:

/**
 * Enable one or more of the following if probing seems unreliable.
 * Heaters and/or fans can be disabled during probing to minimize electrical
 * noise. A delay can also be added to allow noise and vibration to settle.
 * These options are most useful for the BLTouch probe, but may also improve
 * readings with inductive probes and piezo sensors.
 */
//#define PROBING_HEATERS_OFF       // Turn heaters off when probing
//#define PROBING_FANS_OFF          // Turn fans off when probing
//#define DELAY_BEFORE_PROBING 200  // (ms) To prevent vibrations from triggering piezo sensors

Grogyan commented 6 years ago

FWIW I have a 1.5A LM317 providing 5V power to my board and probe.

With the Smart probe version, it is a lot more power stable than old versions

Roxy-3D commented 6 years ago

Yeah... But if the BL-Touch wires run close to the heater or fan wires... They can pick up electrical noise from the high current switching on and off.... I don't know if it will help you or not. But it is easy to turn on and see.

Grogyan commented 6 years ago

I have a friend who had electrical noise interference due to proximity to the motor or fan. It was strange, as I didn't have the problem. /shrugs

dasflux commented 6 years ago

Thanks! I will try that. The Smart has been great. I will definitely try what you recommended.

These two are my first BLTouch probes. Do you think the Classic is fine? It's just weird, I randomly have to recalibrate it. I can't pinpoint when it was happening. But you are confirming that this has been an issue, right? Nobody could help me. I asked everywhere, then just gave up and ordered another. Nonetheless, thanks!

Grogyan commented 6 years ago

The magnetic flux in the pin isn't quite right; it is a known issue. Please ensure that you have the Smart version; old versions of the probe should be discarded. Also ensure that you get the probe from an authorized seller, as counterfeit probes are also known to have issues.

Roxy-3D commented 6 years ago

Also... some people are side stepping the issue. I have UBL running on all my printers (big surprise!). I have the mesh perfectly dialed in with G26. The only thing my printer needs to do with the BL-Touch is home the Z-Axis and it is ready to print.

But here is the thing: I deliberately have it home .15mm above the bed. I have the DOUBLECLICK_FOR_Z_BABYSTEPPING turned on. If it turns out the probe returns a bad value, it is never off more than .1mm. And I just use the Z-BabyStepping to get the nozzle height perfect while the printer is drawing the skirt around the object.

To build a mesh, the BL-Touch is plenty reliable. Even with a 15 x 15 mesh (225 mesh points), you will only have a few bad values generated by the probe. And the G26 Mesh Validation Pattern will show you exactly where those locations are. You just edit those mesh points to correct them. And then it doesn't matter. At that point, the problem shifts to getting the printer to home correctly. And I side step that by doing the DOUBLECLICK_FOR_Z_BABYSTEPPING when I print.

dasflux commented 6 years ago

You guys are freaking awesome. I realllllly appreciate it. I was pulling my hair out over that one. I couldn't find "some people" to help me. Nobody on Google Plus or Reddit knew anything about it. Thanks! It's nice to know it's still usable for experimental machines. I would rather do that than bed level any day of the week.

dasflux commented 6 years ago

Well, this is probably not related to the problem in this thread, because it happened even on an alternate power source and with a BLTouch Classic.

I had a wire going to the BLTouch catch on fire. This happened after a cold boot, while the bed was heating up. The irony is I was JUST about to change all this, but because of its size I couldn't print my custom MKS Base case on my MP Mini. The weird thing... it wasn't a short and it didn't melt the connector. It was the ground wire splitting off of the main power cable. SO, lol, I got a good one for you guys to scratch your heads at.

Would you guys take a look? Y'all are a bit above most of the community, as nobody else could answer my BLTouch questions. I took a video and made a quick illustrated diagram.

Diagram of wiring. Where it burned, I marked it with 'burn' in small letters using epic MS Paint skills (lol). https://postimg.org/image/5fgnv1rziz/

Video (audio isn't so great) https://www.youtube.com/watch?v=wzqJP1HoJKM

Buck converter (maybe it fed it too many amps?) https://www.amazon.com/gp/product/B014Y3OT6Y/ref=oh_aui_detailpage_o08_s00?ie=UTF8&psc=1

Roxy-3D commented 6 years ago

I wonder if that voltage regulator circuit board touched some metal? The BL-Touch doesn't take enough current to melt that wire. It does take 350 mA when it kicks the pin. But that pulse only lasts 1/10 of a second (if that).
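As a rough sanity check on that claim, the heat deposited in a wire by one pin-kick pulse is I² × R × t. The sketch below uses the 350 mA and 0.1 s figures quoted above; the 0.07 ohm wire resistance is purely an assumed placeholder for a short run of thin hookup wire, and `pulse_energy_joules` is a hypothetical helper.

```python
def pulse_energy_joules(current_a, resistance_ohm, duration_s):
    """Resistive heating energy (joules) for a constant-current pulse."""
    return current_a ** 2 * resistance_ohm * duration_s


# 350 mA for 0.1 s comes from the comment above.
PULSE_CURRENT_A = 0.350
PULSE_DURATION_S = 0.1
# Assumption: wire resistance is a placeholder value; measure your own wiring.
ASSUMED_WIRE_RESISTANCE_OHM = 0.07

energy_j = pulse_energy_joules(PULSE_CURRENT_A, ASSUMED_WIRE_RESISTANCE_OHM, PULSE_DURATION_S)
print(f"Energy per pin-kick pulse: {energy_j * 1000:.2f} mJ")
```

Under these assumptions each pulse deposits less than a millijoule in the wire, which fits the point that the probe's own draw is unlikely to be what burned it.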

dasflux commented 6 years ago

Nope. It was in a printed case, so the trimmer didn't get bumped. I just did not show it in the video.

On Wed, Oct 11, 2017 at 9:42 PM, Roxy-3D notifications@github.com wrote:

> I wonder if that voltage regulator circuit board touched some metal? The BL-Touch doesn't take enough current to melt that wire. It does take 350 mA when it kicks the pin. But that pulse only lasts 1/10 of a second (if that).

github-actions[bot] commented 2 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.