Klipper3d / klipper

Klipper is a 3d-printer firmware
GNU General Public License v3.0

BLTouch 3.1 #1938

Closed: CappyT closed this issue 4 years ago

CappyT commented 5 years ago

I'm having issues with the BLTouch "Smart" v3.1. After removing the capacitor on my Z_Endstop (I have a Melzi board, and AntClabs suggests removing it), the BLTouch started going bad far more often, so much that it is now impossible to complete a 7x7 bed_mesh.

Before the capacitor removal, the BLTouch went bad about 1 time in 1000 probes.
After the capacitor removal, the BLTouch goes bad 1 time in 25 probes.
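As a rough illustration of what those two rates mean for a 7x7 mesh, here is a small sketch. It assumes that every probe fails independently with a fixed probability and that the mesh takes exactly one probe per grid point; neither assumption comes from the report, and multiple samples per point would make the numbers worse.

```python
# Assumptions (not from the thread): independent failures at a fixed rate,
# and exactly one probe per mesh point.

def completion_probability(failure_rate: float, probes: int) -> float:
    """Chance that `probes` consecutive probes all succeed."""
    return (1.0 - failure_rate) ** probes

PROBES_7X7 = 7 * 7  # 49 mesh points

before = completion_probability(1 / 1000, PROBES_7X7)
after = completion_probability(1 / 25, PROBES_7X7)

print(f"before capacitor removal: {before:.1%}")  # about 95%
print(f"after capacitor removal:  {after:.1%}")   # about 14%
```

Under these assumptions, most 7x7 attempts would fail at the 1-in-25 rate, which fits the description above.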

I think this issue is linked to this one: https://github.com/KevinOConnor/klipper/issues/1882

So I hope this will be fixed soon, as right now there's no way I can use my BLTouch.

Thanks for the help. Attached are my printer.cfg (in a zip file) and my klippy.log.

klippy.log printer.zip

jlirochon commented 5 years ago

Hi,

Is your printer an Ender 3 or something else?

For the Ender3 I suggest this post. I tested on mine and my BLTouch 3.1 works so far, although I didn't test with 1000 probes.

CappyT commented 5 years ago

@jlirochon Yes, my printer is an Ender 3. I will try it today and see if it works.

Coffee0297 commented 5 years ago

I might add that Marlin holds PWM on the probe at all times, and Klipper does not. I think that's part of the problem. I had the same issue with my 3.1, but with Marlin it works with no problem. I too want this fixed so I can use Klipper again.

CappyT commented 5 years ago

> I might add that Marlin holds PWM on the probe at all times, and Klipper does not. I think that's part of the problem. I had the same issue with my 3.1, but with Marlin it works with no problem. I too want this fixed so I can use Klipper again.

Have you also tried Klipper with the BLTouch on ICSP?

mental405 commented 5 years ago

> holds PWM on the probe at all times.

I am not sure what this means.

tiaanv commented 5 years ago

I have the same issue. Same setup. Marlin works 100%, but on Klipper probing fails at a high rate. One thing I clearly noticed is that with Marlin, when the BLTouch does a probe, the moment it triggers it jumps to the up position. With Klipper this does not happen. Not sure if it's just a symptom or an actual underlying issue.

CappyT commented 5 years ago

> I have the same issue. Same setup. Marlin works 100%, but on Klipper probing fails at a high rate. One thing I clearly noticed is that with Marlin, when the BLTouch does a probe, the moment it triggers it jumps to the up position. With Klipper this does not happen. Not sure if it's just a symptom or an actual underlying issue.

I noticed that too. With Klipper, the BLTouch touches the plate and Z stops, but the pin only retracts after a second to a second and a half. I think that could be related. Can you attach your Klipper logs? That could help.

Let's see what @KevinOConnor thinks about this. Also, if the dev doesn't have a BLTouch, I can easily lend him mine, or gift him one, so he can fix the problem more easily.

@jlirochon Unfortunately even wiring the BLTouch that way doesn't fix the issue. Thanks anyway for the help.

tiaanv commented 5 years ago

@CappyT. I had to temporarily go back to Marlin to use the printer. I will play around again this weekend and attach some logs

@jlirochon , I tend to not think this is pin related, as the EXACT same hardware (SKR PRO 1.1) works consistently in Marlin for me.

jlirochon commented 5 years ago

@tiaanv

> with Marlin when the BLTOUCH does a probe, the moment it triggers it jumps to the up position... With klipper this does not happen

Not sure I understand what you mean here.

If you have a reproducible case this is great. I think the following could help :

tiaanv commented 5 years ago

@jlirochon

> Not sure I understand what you mean here.

It's simply my observation that the BLTouch behaves differently (when doing a probe action) in Klipper than in Marlin. That, and the fact that it doesn't work reliably in Klipper (yet).

As mentioned I will play around again with Klipper, and post as much detail as possible (including your request above). Busy doing some prints at the moment, so no playtime.

Coffee0297 commented 5 years ago

I have a new BLTouch on the way. I'll run some tests tomorrow and see if I can get some debug output. Here's a video of the problem: https://vimeo.com/353339123

If you look, the blue light that indicates the probe is receiving PWM is only lit when the probe is being told to do something. In Marlin it always gets a signal, i.e. the blue light is always on. I think that's the main problem. I can't really tell from the Marlin code how they do it, or how I can test it on Klipper. If anyone can make an example of that, I'll be happy to test it.

KevinOConnor commented 5 years ago

In order for me to assist, I'll need the Klipper log file from the incident, and it will be necessary to issue an M112 within a few seconds of the error arriving. (@CappyT - your log is helpful, but unfortunately the M112 came a bit too late, if you can get a better log that would be great.)

-Kevin

tiaanv commented 5 years ago

OK... Here goes

First, a disclaimer. This is NOT an effort to prove anything; it is simply an effort to provide as much information as possible. It is quite possible that these tests do not hold up to the scientific method. PLEASE don't rip my findings apart, I am making no conclusions here!!!

INTRODUCTION

So I did a number of tests. Attached is a zip file with:

MARLIN

This was the first test I did. Procedure followed:

The first video, 1. Marlin.mp4, depicts the probe action. Both the homing and the probing were successful. I did this test multiple times, and also at higher probing speeds. I could not get it to fail.

Right at the end of my tests I reverted back to Marlin and did an M48 P20, just to get Marlin to do many probes, and it still had no issue. There is no video for that.

KLIPPER

These tests were, unfortunately, a bit more messy, as I had to RESET the BLTouch, restart the Klipper service, and try to get useful information. I did multiple tests: some were successful, some failed on homing the Z-axis, some failed during the probing action. They failed inconsistently. Some at the start, some during, and some at the end. Procedure followed:

In some cases I needed

The order of these should be visible in the logs.

Videos:

The first video test was done after homing successfully. I do a PROBE (failed), then RESET, and try the PROBE again.
Video: 2. Klipper A - FAIL and SUCCESS.mp4

The next test I did was a good one. The probe completed successfully.
Video: 3. Klipper B - SUCCESS.mp4

The final video test fails on the second probe action.
Video: 4. Klipper - SUCCESS and FAIL.mp4

Logs

There are a few logs attached. The first log is a bunch of tests but has no M112; I only later did tests with M112. They are also attached.

The two main logs of interest:

  • 2. klippy-Failed G28.log depicts a failed homing sequence.
  • 3. klippy-Probe Failed after 14 good.log. This is a good one, as I managed to get 14 good probes, then a failure.

I hope all of this makes sense, and can somehow help. Let me know if I need to explain/clarify something.

PROBE TEST.zip

KevinOConnor commented 5 years ago

@tiaanv - Okay, your log indicates the root of the problem is message retransmits - almost certainly an error in the new STM32F407 USB support. I'll take a look at it. It seems unrelated to the original problem report in this issue.

-Kevin

Coffee0297 commented 5 years ago

I ran some tests. The log shows:

  • G28
  • BED_MESH_CALIBRATE, and the probe failed
  • M112

I hope this will help. klippy.log

I'll switch to the BLTouch 3.0 and try the exact same thing to see if I can get it to fail with that. (SKR v1.3, BLTouch 3.1)

Here's a log of the same setup but with the BLTouch 3.0, and it failed a couple of times: klippy.log

tiaanv commented 5 years ago

> @tiaanv - Okay, your log indicates the root of the problem is message retransmits - almost certainly an error in the new STM32F407 USB support. I'll take a look at it. It seems unrelated to the original problem report in this issue.

Interesting. Was not anticipating that. Thx Kevin.

Coffee0297 commented 5 years ago

I've also talked to one of the Marlin devs. He says Antclabs recommends holding PWM on the probe at all times. Quote: "We have talked about whether or not the PWM state needed to be held and Paris was quite insistent that it did. I also recently had some discussion with someone who is doing some comparisons between Marlin and Klipper who noticed that they shut off the PWM, and he found there is no difference in repeatability with it off."

KevinOConnor commented 5 years ago

@tonn0297 - both of your logs indicate the Klippy host software had some kind of internal race condition - after probing, the host software successfully queried the stepper position from the micro-controller, but for some reason didn't initially recognize the result. It re-queried a couple of hundred milliseconds later, but by that time the bltouch went into an error state.

This error is not similar to the original problem report in this issue.

Unfortunately, I've been unable to reproduce this timing race locally. If you can easily reproduce the problem, try pulling the test code below, run the probe until failure, run M112 within a few seconds, then attach the log here.

It's important that the test be run with the pristine code - this command will revert your local changes - so if they are important, back them up first.

cd ~/klipper ; git fetch ; git checkout origin/work-debug-20190908 ; git reset --hard origin/work-debug-20190908 ; sudo service klipper restart

-Kevin

CappyT commented 5 years ago

> @tonn0297 - both of your logs indicate the Klippy host software had some kind of internal race condition - after probing, the host software successfully queried the stepper position from the micro-controller, but for some reason didn't initially recognize the result. It re-queried a couple of hundred milliseconds later, but by that time the bltouch went into an error state.
>
> This error is not similar to the original problem report in this issue.
>
> Unfortunately, I've been unable to reproduce this timing race locally. If you can easily reproduce the problem, try pulling the test code below, run the probe until failure, run M112 within a few seconds, then attach the log here.
>
> It's important that the test be run with the pristine code - this command will revert your local changes - so if they are important, back them up first.
>
> cd ~/klipper ; git fetch ; git checkout origin/work-debug-20190908 ; git reset --hard origin/work-debug-20190908 ; sudo service klipper restart
>
> -Kevin

You are a legend. I'm currently returning my Ender 3 Pro because of some other problems with it, and I'm getting an i3 Mega-S instead. I will try Klipper and the BLTouch on that, to see if the issue also exists on different printers (the Mega-S is based on the Trigorilla mainboard, which is RAMPS 1.4 + Mega 2560 on a single PCB). Then I will also run your test and attach both here ASAP.

Thank you very much for your help. Also, is there a donate button somewhere? =D

Coffee0297 commented 5 years ago

Thank you Kevin. I'm just about to finish a print, and then I'll switch to Klipper and get some logs. Thank you for your help. Should I open a new issue so this thread doesn't fill up?

Coffee0297 commented 5 years ago

@KevinOConnor here's a new log with the git reset and all that:

klippy.log

Just to be sure, here's my hardware setup:

  • SKR V1.3
  • Watterott TMC 5160, SPI, 32mS
  • Raspi 4 4GB
  • OctoPrint v1.3.11
  • OctoPi v0.17.0
  • BLTouch v3.1 and 3.0, genuine
  • 24V PSU

KevinOConnor commented 5 years ago

@tonn0297 - looks like a Klipper bug. Can you retest with the latest code on that branch to see if the problem goes away?

cd ~/klipper ; git fetch ; git checkout origin/work-debug-20190908 ; sudo service klipper restart

-Kevin

Coffee0297 commented 5 years ago

@KevinOConnor you are the man... I'll run some more tests now just to make sure. What was the change?

Coffee0297 commented 5 years ago

So I just did 2400 probes with no problem, so I think we are good. Again, thank you Kevin for all your help and your input to the community. -Tonny

KevinOConnor commented 5 years ago

@tonn0297 - Okay, thanks. If you still have the log from the 2400 probe attempt, could you attach it here? (If not, then don't worry about it.) Feel free to use the workaround on the work-debug-20190908 branch. Unfortunately, it's not in a shape that can be committed to the master branch - I'll have to think about how to handle the underlying issue.

In summary, I'm aware of the following issues:

1 - The stm32f407 had a usb stability issue (should be fixed as of 4fa41d9c).
2 - There is a race condition on host to mcu query requests that could cause query responses to get delayed ( https://github.com/KevinOConnor/klipper/issues/1938#issuecomment-529255057 ). That delay could cause the bltouch to go into an error state. Work around on work-debug-20190908 branch.
3 - A low level retransmit could cause a delay in processing a query request. (Part of https://github.com/KevinOConnor/klipper/issues/1938#issuecomment-529101191 .) This delay also results in the bltouch going into an error state.
4 - The code attempts to clear the error state from the bltouch, however it can't clear the error state when the bltouch is close to the bed. Ideally, the code would raise the probe before trying to clear the error state.

-Kevin

Coffee0297 commented 5 years ago

@KevinOConnor sadly I don't have the log, but I can do a 20x20x3 probe when I get home and send you the log. I can also see I have some problems with the TMC5160 extruder resetting to the default 256 microsteps and so on, but I'll open a new issue when I know more. Kind regards, Tonny.

jlirochon commented 5 years ago

Hi @KevinOConnor, I did a 40 x 45 x 3 BED_MESH_CALIBRATE, and at some point I got the error !! BLTouch failed to raise probe. I'm on the work-debug-20190908 branch. My board is a creality3d 1.1.4 (Ender 3) and I have a genuine BLTouch 3.1. I issued an M112 a few seconds after the failed probe.

klippy.log

Coffee0297 commented 5 years ago

I think they say it has a failure rate of 5%. I just did a 20x20x3 again and had no problem. Here's a log: klippy.log

KevinOConnor commented 5 years ago

@jlirochon - Unfortunately the M112 wasn't soon enough to get all the details. From what is there, nothing looks amiss in the logs (Klipper seems to respond with the pin_up command when it gets the notification). Otherwise, item 4 at https://github.com/KevinOConnor/klipper/issues/1938#issuecomment-529736513 still applies.

Separately, I wouldn't run the work-debug-20190908 branch unless you were specifically running into that problem.

-Kevin

jlirochon commented 5 years ago

@KevinOConnor thanks

The bed on the Ender 3 is cheap and warped. I was trying to probe it using a 5mm grid to see how bad it looks, but it always fails at some point. I tried work-debug-20190908 for this reason, after reviewing the changes you made on it. I must admit I don't know if it falls specifically under the race condition problem. I just know I can't complete a BED_MESH_CALIBRATE because it always fails at some point.

I have no prior experience with bltouch sensors, and don't know if it would work better using another firmware or if bltouch are just crap. I can try to reinstall marlin and see if it does better or not. I can try with another version of the bltouch too (I pinched an older one from an older printer).

> Otherwise, item 4 still applies

I have some python skills and I'm glad to help. As your code is unfamiliar to me, any advice would be of great help.

nophead commented 5 years ago

I see the previous comments about Marlin doing constant PWM and klipper not. When klipper stops PWM does it ensure the last pulse is clean? If not I can imagine the BL touch acting on the runt pulse.

KevinOConnor commented 5 years ago

@jlirochon - Ah, I thought your bltouch was working (as in https://github.com/KevinOConnor/klipper/issues/1938#issuecomment-527993276 )? FWIW, your log appears similar to the log of the original report in this issue. Alas, both logs didn't get the M112 fast enough, but when I back tracked through the timing I didn't see any issue. It appeared Klipper responded as designed.

One thing you could test would be to reduce MIN_CMD_TIME in bltouch.py to 2 * SIGNAL_PERIOD. (Be sure to run sudo service klipper restart after any code changes.) It may be that the bltouch isn't happy with a 100ms turnaround time in some rare cases.

-Kevin
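For readers following along, the arithmetic behind that suggestion can be sketched as below. The 20 ms value for SIGNAL_PERIOD is inferred from the "5 pin_down commands over 100ms" remark elsewhere in this thread; it is an assumption and is not copied from bltouch.py.

```python
# Values inferred from the thread, not copied from bltouch.py.
SIGNAL_PERIOD = 0.020  # seconds; assumed (100 ms / 5 commands)

current_min_cmd_time = 5 * SIGNAL_PERIOD    # the 100 ms turnaround mentioned above
proposed_min_cmd_time = 2 * SIGNAL_PERIOD   # the value suggested for testing

print(f"current:  {current_min_cmd_time * 1000:.0f} ms")
print(f"proposed: {proposed_min_cmd_time * 1000:.0f} ms")
```

If the inferred period is right, the test would shorten the minimum command time from about 100 ms to about 40 ms.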

KevinOConnor commented 5 years ago

Hi @nophead,

I haven't seen an indication that there is an issue with the PWM signals. If the bltouch didn't get the signal then I'd expect to get reports of head crashes. All the logs and reports that I've seen indicate the bed stopped moving and therefore the bltouch successfully signalled the Klipper mcu code and the mcu code successfully stopped issuing step commands.

To answer your question though, the Klipper timing should be precise - exactly 5 pin_down commands over 100ms.

FWIW, I don't think we can leave the PWM signal enabled during probing. There were indications that leaving the PWM signal on was causing head crashes (and other errors) on some of the clone devices. Specifically, there was a concern that some clones would not detect a bed touch if the touch raced with a repeat reception of the "pin_down" command. Similarly there was a concern that some clones would become non-operable if they received a repeat reception of a "pin_down" command immediately after a bed touch event. Finally, in order to issue the pin_up command after a bed touch (on clones or original bltouchs), we'd need to ensure the pin_up command does not overlap with a repeat "pin_down" command - getting that timing correct is complex and would ultimately lead to further delays of the "pin_up" command.

-Kevin

EDIT: Actually the number of pin_down commands sent is dependent on the pin_move_time configuration, but the code should prevent any incomplete signals being sent.
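One way to picture "no incomplete signals" is to count only whole signal periods for a given pin_move_time. The sketch below is an illustration under stated assumptions and is not the logic in bltouch.py: the 20 ms period and the 5-period minimum are inferred from this thread, and `pulse_count` is a hypothetical helper.

```python
SIGNAL_PERIOD_MS = 20  # assumed: 100 ms / 5 commands
MIN_PERIODS = 5        # assumed minimum number of repeats

def pulse_count(pin_move_time: float) -> int:
    """Whole signal periods that fit in pin_move_time (seconds).

    Works in integer milliseconds so that a partial period is always
    dropped rather than sent, and never goes below MIN_PERIODS.
    """
    duration_ms = round(pin_move_time * 1000)
    return max(MIN_PERIODS, duration_ms // SIGNAL_PERIOD_MS)

for move_time in (0.05, 0.1, 0.4):
    print(f"pin_move_time={move_time}: {pulse_count(move_time)} periods")
```

With these assumptions, a pin_move_time of 0.4 would correspond to 20 complete periods.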

KevinOConnor commented 5 years ago

@jlirochon

> Otherwise, item 4 still applies
>
> I have some python skills and I'm glad to help. As your code is unfamiliar to me, any advice would be of great help.

For item 4, my thinking is that it would require some changes to the high-level probing procedure. Right now, a probe event (as implemented in probe.py) involves moving the head down until receiving an "endstop" signal. In order to better handle the bltouch, this may need to change to something like: move down until trigger and then move up. In this way, probe.py could signal bltouch.py after the move up event, and bltouch.py could perform the reset request (if needed) after the head has moved up. (Right now, the bltouch.py code attempts to do the reset immediately after the bed touch event - while the head is still near the bed.)

Also, be aware of the general developer documentation at: https://www.klipper3d.org/Overview.html#developer-documentation

-Kevin
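The ordering described for item 4 can be sketched with a toy model. Everything here is hypothetical: `FakeBLTouch`, `MIN_RESET_CLEARANCE`, and both functions are stand-ins for illustration and do not come from probe.py or bltouch.py. The only behaviour taken from the thread is that the alarm cannot be cleared while the probe is close to the bed.

```python
# Hypothetical stand-ins: none of these names come from probe.py or bltouch.py.

class FakeBLTouch:
    """Toy sensor that can only clear its alarm when clear of the bed."""

    MIN_RESET_CLEARANCE = 2.0  # mm above the trigger point; assumed value

    def __init__(self, alarm_after_touch: bool):
        self.alarm_after_touch = alarm_after_touch
        self.in_alarm = False
        self.log = []

    def touch(self) -> None:
        # The sensor may drop into its error state right after the touch.
        self.in_alarm = self.alarm_after_touch
        self.log.append("triggered")

    def reset_if_needed(self, clearance: float) -> None:
        if not self.in_alarm:
            return
        if clearance < self.MIN_RESET_CLEARANCE:
            self.log.append("reset refused (too close to bed)")
            return
        self.in_alarm = False
        self.log.append("reset")


def probe_then_raise(sensor: FakeBLTouch, retract_dist: float) -> None:
    """Proposed order: move down until trigger, move up, then reset."""
    sensor.touch()                        # 1. move down until trigger
    sensor.log.append("move up")          # 2. raise the head first
    sensor.reset_if_needed(retract_dist)  # 3. reset only after the move up


def probe_then_reset_in_place(sensor: FakeBLTouch) -> None:
    """Order described as current: reset while the head is still near the bed."""
    sensor.touch()
    sensor.reset_if_needed(0.0)


proposed = FakeBLTouch(alarm_after_touch=True)
probe_then_raise(proposed, retract_dist=5.0)
print(proposed.log)

current = FakeBLTouch(alarm_after_touch=True)
probe_then_reset_in_place(current)
print(current.log)
```

In this model the reset only succeeds in the first ordering, which is the point of moving the reset request after the move up event.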

epandi commented 5 years ago

I hope it's the same problem the others describe, but I'm facing a problem where the probe is sometimes not pulled up after touching the bed, which results in the BLTouch (v3.1) going into an alarm state. I'm running the work-debug-20190908 branch of Klipper. klippy.log

Running an Ender 5 with a pin 27 breakout board; resistor C7 is desoldered from the board.

jlirochon commented 5 years ago

Hi @epandi,

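I think I have found a workaround to this problem. At least I fixed the problem for my printer, which is an Ender 3 with the original creality3d 1.1.4 board.

My branch needs some cleanup because I don't want to introduce bugs for other versions of BLTouch. If you are interested in testing this branch let me know.

What I did:

  • keeping PWM signal during the probe, blue led stays on until the pin touches the bed (the way it should be, as stated in antclabs documentation) => no effect
  • slightly adjusting frequencies, because they changed on v3.1 => no effect (as I expected because they have a tolerance margin of ±20 anyway)
  • probing with Touch Switch Mode instead of Push-Pin Down Mode => works for me !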

Not sure what's the difference exactly. According to antclabs documentation the probing sequence is similar, except for Push-Pin Down Mode they say "Caution: An alarm may occur". Klipper fails to clear the alarm but as it appears randomly it's difficult to debug. I'm waiting for more details from antclabs to know if Touch Switch Mode has some downsides. This mode is available from BLTouch Smart 2.1 up to current versions.

Honeybaadger commented 5 years ago

> Hi @epandi,
>
> I think I have found a workaround to this problem. At least I fixed the problem for my printer, which is an Ender 3 with the original creality3d 1.1.4 board.
>
> My branch needs some cleanup because I don't want to introduce bugs for other versions of BLTouch. If you are interested in testing this branch let me know.
>
> What I did:
>
>   • keeping PWM signal during the probe, blue led stays on until the pin touches the bed (the way it should be, as stated in antclabs documentation) => no effect
>   • slightly adjusting frequencies, because they changed on v3.1 => no effect (as I expected because they have a tolerance margin of ±20 anyway)
>   • probing with Touch Switch Mode instead of Push-Pin Down Mode => works for me !
>
> Not sure what's the difference exactly. According to antclabs documentation the probing sequence is similar, except for Push-Pin Down Mode they say "Caution: An alarm may occur". Klipper fails to clear the alarm but as it appears randomly it's difficult to debug. I'm waiting for more details from antclabs to know if Touch Switch Mode has some downsides. This mode is available from BLTouch Smart 2.1 up to current versions.

I'm interested in getting my Ender 3 with BLTouch running :) How can I test it?

jlirochon commented 5 years ago

@epandi @Honeybaadger @nophead @tonn0297 @CappyT

I have pushed my work in progress. Feedback is welcome, I would like to know if it works for you ! Feedback on the code itself may wait until I submit a pull request.

Quick instructions for those not familiar with git:

git remote add jlirochon https://github.com/jlirochon/klipper.git
git fetch jlirochon
git checkout -t jlirochon/feature/bltouch-improvements

Here is an example from my printer.cfg:

[bltouch]
flavor: genuine_smart_3.1
sensor_pin: ...
control_pin: ...
x_offset: ...
y_offset: ...
speed: 2
samples: 3
samples_result: median
sample_retract_dist: 5

flavor=xxxxxxxx tries to introduce sensible defaults for each flavor. You have to set a flavor in order to enable my patch (no flavor = backward compatible defaults = current behavior). You can still override as usual, for example you can set pin_up_touch_mode_reports_triggered: False in your config, but in my experience there is no need to change this for the genuine bltouch 3.1. If you have desperately tried settings you don't understand and observed no benefit, please remove them prior to testing, start with a minimalist [bltouch] config and see how it goes.
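The "defaults per flavor, user settings win" idea can be sketched as follows. The table contents are hypothetical: the real defaults on the branch are not shown in this thread, and `probe_with_touch_mode` is an illustrative key rather than a documented option. Only the flavor name and `pin_up_touch_mode_reports_triggered` appear above.

```python
# Hypothetical defaults table; the branch's real values are not shown in this thread.
FLAVOR_DEFAULTS = {
    None: {  # no flavor set: backward compatible behaviour
        "probe_with_touch_mode": False,
        "pin_up_touch_mode_reports_triggered": True,
    },
    "genuine_smart_3.1": {
        "probe_with_touch_mode": True,
        "pin_up_touch_mode_reports_triggered": True,
    },
}

def resolve_settings(user_config: dict) -> dict:
    """Start from the flavor's defaults, then apply explicit user overrides."""
    flavor = user_config.get("flavor")
    if flavor not in FLAVOR_DEFAULTS:
        raise ValueError(f"unknown bltouch flavor: {flavor!r}")
    settings = dict(FLAVOR_DEFAULTS[flavor])
    for key, value in user_config.items():
        if key != "flavor":
            settings[key] = value
    return settings

# A user who sets the flavor and overrides a single option.
resolved = resolve_settings({
    "flavor": "genuine_smart_3.1",
    "pin_up_touch_mode_reports_triggered": False,
})
print(resolved)
```

Keeping the no-flavor entry identical to the existing behaviour is what makes the change opt-in in this sketch.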

epandi commented 5 years ago

@jlirochon For me this change was a game changer. I don't have any problems using the BLTouch 3.1 anymore. All bed levelling attempts (around 20 3x3 bed meshes) succeeded without any problems.

HaythamB commented 5 years ago

@jlirochon Your branch solved the issues for me. Anycubic Mega, running TMC2208 (no UART), Trigorilla board. Sensor pin connected to ar19. The master klipper branch just wouldn't take the 3.1 Smart no matter what. I think your changes are close to what is the best solution at the moment. It still sometimes fails, and I still must switch both trigger options to False for it to work properly. But it works.

jlirochon commented 5 years ago

@HaythamB I appreciate your feedback, thank you !

> It still sometimes fails, and I still must switch both trigger options to False for it to work properly. But it works.

Interesting... I would like to understand why 2 instances of the same BLTouch would require different parameters. This challenges the concept of flavor I want to introduce.

I hope we can diagnose the problem and make it work consistently with the default parameters.

HaythamB commented 5 years ago

@jlirochon it is puzzling indeed. I ran 3x3, 5x5, and 7x7 validations, and they tended to fail on the 5th or 6th probe. On some tries I'd get a full 7x7 sweep with no issues; other times I'd get a 3x3 fail on the second probe. It's definitely not consistent.

I have a feeling it has to do with the timings of the PWM in klipper, since the raspberry pi is not a real-time OS and may be causing some sort of mcu timeout?

Here's what you asked for:

Here's the BLTouch section from my config file:

[bltouch]
flavor: genuine_smart_3.1
sensor_pin: ar19
control_pin: PB5
#pin_move_time: 0.4
pin_up_reports_not_triggered: False
pin_up_touch_mode_reports_triggered: False
x_offset: 44
y_offset: -8
z_offset: -0.260000
speed: 2
samples: 3

BlackStump commented 5 years ago

@HaythamB

z_offset: -0.260000, is that correct? My offset is 1.1; -0.26 puts the nozzle below bed height.

> It's a good idea to verify that the Z offset is close to 1mm. If not, then you probably want to move the probe up or down to fix this. You want it to trigger well before the nozzle hits the bed,
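A quick sanity check along those lines might look like the sketch below. `check_z_offset` is a hypothetical helper and the 0.5 mm tolerance is an arbitrary assumption; only the "close to 1mm" target and the two example values come from this thread.

```python
def check_z_offset(z_offset: float, target: float = 1.0, tolerance: float = 0.5) -> str:
    """Classify a configured z_offset (mm). The tolerance is an assumed value."""
    if z_offset <= 0:
        # The nozzle would reach the bed before the probe triggers.
        return "nozzle-first"
    if abs(z_offset - target) > tolerance:
        return "far-from-target"
    return "ok"

print(check_z_offset(-0.26))  # the value from the config above
print(check_z_offset(1.1))    # the value BlackStump reports using
```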

HaythamB commented 5 years ago

@BlackStump yeah, I've been having issues with my mount; it's not the best. I had to add a 3rd M3 nut to get it close, and now the probe (when retracted) is about 0.4mm higher than the nozzle tip. I need some washers instead to increase this gap. Thank you for pointing this out.

@jlirochon This is the reason I was still getting timeouts: the probe needed more room to be triggered. Seems like it's working perfectly now. Thanks for the great work!

jlirochon commented 5 years ago

@HaythamB

> Seems like it's working perfectly now

Great! Were you able to remove pin_up_reports_not_triggered and pin_up_touch_mode_reports_triggered?

HaythamB commented 5 years ago

@jlirochon so haven't been able to get it to work properly. Modified the offsets correctly, but the timing is still off.

Flashed in Marlin 2. Works perfectly out of the box with the same offsets. Standard 3.1 settings there (such as 5V logic level) on the exact same pins I had in klipper. Interruptable endstops option is not enabled either, just in case.

From what I can tell, when the probe starts to hit the bed, it lights up red (detecting the bed correctly), but there's approx a 2 sec delay until the sensor pin is detected in klipper, during which time the pin travels quite a lot into the probe (based on our Z speed, ofc). In Marlin, the pin triggers the sensor almost instantly (like any YouTube video of a BLTouch running normally would show).

Seems to me that there's still a timing detection issue happening in klipper. Leaving Marlin flashed for the time being until we can figure this out, cause I really want to try out the BLTouch! :)

Guess there's still more work ahead. Your solution is about 90% of the way there. There must be some timings to tighten up to have it nailed.

epandi commented 4 years ago

Just to give an update: I still don't have any issues after running @jlirochon's proposed fix. So for my problem (sometimes the pin is not pulled up after the probe touches the bed) it's rock solid.

KevinOConnor commented 4 years ago

@jlirochon - FYI, Klipper used to use "touch_mode" during probing, but there were several reports that it reduced the probing accuracy (on clones and on the v2.0 and earlier bltouch).

-Kevin

jlirochon commented 4 years ago

> @jlirochon - FYI, Klipper used to use "touch_mode" during probing, but there were several reports that it reduced the probing accuracy (on clones and on the v2.0 and earlier bltouch).

Thanks @KevinOConnor, there are several things involved:

I introduced a concept of flavors on my branch. For now I have flavors for genuine bltouch because documentation is available, but there is plenty of room for clones as well.

I have a BLTouch Smart v3.1 and a BLTouch Classic (need to check which version). If someone wants to send me some genuine models or some clones, they will be used to do more extensive tests. I will buy 1 or 2 cheap clones anyway.

KevinOConnor commented 4 years ago

FYI, as far as I know, touch_mode has been present since the very first bltouch.

-Kevin