Closed samcoinc closed 10 months ago
ok - if you look at the 1st and 3rd images - the actual spindle rpm is the same. But the encoder is reporting it incorrectly..
5.2rps (ish) vs 3rps(ish)
something is odd..
Ok - yes - you can see the rps is different when it isn't seeing the index correctly (like it isn't taking into account the missing tooth..) The encoder velocity is much higher. (the pwm to the spindle drive is the same - after increasing the spindle rps to make the encoder to see the index correctly)
here are the encoder settings 37 bit I/O TRUE encoder.2.counter-mode 37 s32 OUT 1840 encoder.2.counts 37 s32 OUT -95731 encoder.2.counts-latched 37 bit I/O FALSE encoder.2.index-enable <=> indexenable 37 bit IN TRUE encoder.2.latch-falling 37 bit IN FALSE encoder.2.latch-input 37 bit IN TRUE encoder.2.latch-rising 37 float IN 1 encoder.2.min-speed-estimate 37 s32 IN 1 encoder.2.missing-teeth 37 bit IN FALSE encoder.2.phase-A <== phase-a 37 bit IN FALSE encoder.2.phase-B 37 bit IN FALSE encoder.2.phase-Z 37 float OUT 18.4 encoder.2.position ==> spindpos 37 float OUT 18.40904 encoder.2.position-interpolated 37 float OUT -957.31 encoder.2.position-latched 37 float I/O 100 encoder.2.position-scale 37 s32 OUT 97571 encoder.2.rawcounts 37 bit IN FALSE encoder.2.reset 37 float OUT 2.738226 encoder.2.velocity 37 float OUT 164.2935 encoder.2.velocity-rpm 37 bit I/O TRUE encoder.2.x4-mode
I don't know if I should start a different issue - or add this here.. I think it also loses 'position..' over time..
I am running a program that is a bunch of chained G33 moves.. (kinda making a oil grove..)
It seems that the position slowly moves in relation to the actual index. Now - I know this works OK on mesa hardware as I have done some non-circular turning which can have many many rotations after an index is seen. (thousands) this starts losing after a few hundred. Could it be in the calculation? Or something I am doing - I am running the spindle slow <1rpm.. I don't think it is a counting problem - although I don't know how I would compare the 'counts' to the position - as there is a missing tooth :)
If you look at the screen shot - the mouse pointer is showing the position (look at the lower encoder 'position' value. You can see the mouse compared to the index is moving to the left. (slowly losing position..)
well shit - you can't see the mouse pointer. Let me try this again.
ok - I have the 'dot' for the encoder position trace showing where the position is compared to the gap. You can see it moving to the left. I really don't think I am losing counts.. But I don't know of a way of testing for that. Let me think..
Ok - I setup 2 encoders (one with scale of 99 - no missing tooth) - both encoder counters lose position the same.. I am pretty sure it is a hardware issue - losing a count every so often. Have not caught it yet... Disregard the above ramblings about losing position...
Actually - could the position calculation not be high enough precision? I am really having a hard time proving that this is a missing count.. If I run a threading operation - .1 pitch over 1 inch. I have never seen a loss of spindle position.. Not one tooth. But if I run a bunch of chained together G33's - the position slowly moves (as you see above in the screen shots) the higher the rotations goes..
Look at the rawcounts at the index over several rotations?
I think I found the problem. Finally broke the scope out and set it to pulse triggering..
This could also be the issue with the index at slow speeds.. I will test. I may have been the one that caused this problem..
doesn't help low rpm gap detection though.. So the original issue is still an issue.
so far sam
So, you have a spurious pulse on the line? I don't fully understand the logic for the missing tooth gap limit, though I imagine that I once did.
/* decide how many ns to detect missing-pulse index */
cntr->limit_dt *= 0.9;
cntr->limit_dt += 0.1 * ((*(cntr->missing_teeth) + 0.5) * (delta_time / delta_counts));
Are you able to convert cntr->limit_dt to a hal parameter and plot what it is doing?
I filtered out the pulses - the encoder position now doesn't seem to drift. but at low rpm - it doesn't find the gap still..
I should be able to handle that... Lol.. Let me try.
well - that is an s32 - do we need to do something to it? The images (first is it not seeing the gap correctly - second it is)
It also doesn't seem to calculate the rpm/rps correctly too - you can see from the first batch of screen shots..
limit_dt is the length of gap required to trigger the missing tooth. It's in nanoseconds. I was worrying that S32 might not go high enough, but it should be OK up to a pulse time of 2 seconds. It looks like max_dt is 18ms in the bottom image, what is the actual pulse gap?
Here is another screen shot - the index gap is about 21.5ms or there about...
when you first launch linuxcnc - and run the spindle slow - (where it doesn't see the index properly.. You get a maxdt of 10.3ms.. but you can see the index gap is the same 21.5ms..
I honestly can't believe that I got the limit_dt as a pin.. It took 2 trires - looking at how '*' is used by other variables. (I don't even know if variables is the correct term..)
Also - when it isn't working - Remember that the velocity (rpm and rps) isn't correct...
sam
here is the rpm.. It is actually turning 43ish rpm.. You can see when it isn't seeing the index properly - the rpm is almost 2X
It's a bit hard to pick numbers off of the screen, but it rather looks like limit_dt is being set to less than the normal pulse gap, so every gap looks like a missing pulse. Is that what the numbers show?
That is how I would interpret it. And some how related to rpm/rps..
Andy! I am not in front of the lathe - but could it be as simple as my logic on the 'missing tooth' is wrong? You can see my 'missing tooth' is a logic 1 gap.. Is the code expecting a logic 0 gap? 'MISSING' tooth?
sam
alas - no. Wouldn't be that easy..
I have an omron 100ppr encoder hooked on the A, B and Z, shielded pure copper wiring, terminating resistors and schmitt triggers for the 3x signals. I not getting a single missed pulse (2.9-latest). I'd suggest to rule out the hardware you have on the benchtop first, then try to reproduce the fault in the system/hardware you are running.
What happen when you inject a pulse from a MCU or Function Generator to LinuxCNC?
I am 99.9% sure this isn't a noise issue.. I have dbounced the crap out of the signal - still acts the same.
here is something that may or may not be helpful.. If I do a step change from 43rpm to 430 rpm - the missing tooth index starts working...
If I slowly raise the rpm to 430 rpm - it keeps missing the gap (and reports the wrong rpm)
sam
If you are confident you are noise/glitch/spike/joint/pad/vibration free on the hardware side, time to get the Function Gen out and try injecting a pulse to the input to rule out your software-side and installation. If you don't have a Function Generator you can rig something quick using a STM32, Arduino or a ESP32 for the same purpose.
Really wish I'd be of any better help. Dislike seeing others having issues while it should be running fine for them too.
Good luck and all the best. ps. Can we see the hall sensor (datasheet/timing/specs) and the slot width?
Weeeee... I created a hal component that takes the sim encoder and creates a missing tooth phase.. (it isn't pretty...)
So - if this behaves the same as my physical encoder - I should be able to create a sim config.. If...
ok - I can get it to do it with a strictly hal implementation...
I will try to get to this on Sunday.
Ok - here is the Hal file, Hal component and some directions..
Install encodergap.comp
put missingtooth.hal and autosave.halscope in the same directory..
Then run.. halrun -I missingtooth.hal setp sim-encoder.0.speed .6 setp encoder.0.index-enable true
You will see like the image below that the index isn't found at the gap..
If you then
setp sim-encoder.0.speed 2 setp encoder.0.index-enable true
It will start seeing the index as expected..
here is the encoder.c with limit_dt as a pin (if I did it right - seemed to work)
Sorry, I should have updated you on this. I am pretty close to a fix. (Well, I have a fix but want to understand why it works)
Sorry, I should have updated you on this. I am pretty close to a fix. (Well, I have a fix but want to understand why it works)
Cool! I can certainly test it when you have a handle on it.
thanks!
I am starting to doubt my sanity on this one.
The issue is the quick-and-dirty lowpass filter: val = 0.9 val += 0.1 new_val
https://github.com/LinuxCNC/linuxcnc/blob/master/src/hal/components/encoder.c#L485
If I delete the lowpass and just use the previous gap then it works, but I worry that is possibly unstable with glitchy real-world data.
At 100 teeth and 100 rpm I get a limit_dt of 9,000,000 without the filter With the filter it is about 4,500,000
So I tried only dividing by 10 (and by 10.0, and multiplying by 0.1) and get a value of 450,000
It's like /10 is being interpreted as /20...
And this is with (as far as I know) all the other interactions with the value if limit_dt disabled.
I could just remove the lowpass, but I want to understand.
The penny has dropped..
If limit_dt is too small, then the counter sees a gap every time. And adds the missing teeth in. Which doubles the tooth count, halving the calculated value of limit_dt, keeping limit_dt too low... This also explains the hysteresis. I need to "prime" the limit_dt lowpass.
[andypugh]
I am starting to doubt my sanity on this one.
The issue is the quick-and-dirty lowpass filter: val = 0.9 val += 0.1 new_val
I suspect you want to work with floating point numbers while doing this calculation. The code in question will in effect do this:
hal_s32_t missing_teeth; hal_s32_t limit_dt; rtapi_s32 delta_counts; rtapi_u32 delta_time; limit_dt = 0.9; limit_dt += 0.1 (((missing_teeth) + 0.5) * (delta_time / delta_counts));
Several of the calculations are done using whole number types, and I suspect the result is not what you expect.
Perhaps something along these lines solve the problem:
hal_s32_t missing_teeth; hal_s32_t limit_dt; rtapi_s32 delta_counts; rtapi_u32 delta_time; / Use floating point calculations when filtering / double newval = limit_dt; newval = 0.9; newval += 0.1 (((missing_teeth) + 0.5) (1.0delta_time / delta_counts)); limit_dt = newval;
I believe this ensure floating point calculations are used when multiplying, adding and dividing.
-- Happy hacking Petter Reinholdtsen
[andypugh]
The penny has dropped..
If limit_dt is too small, then the counter sees a gap every time. And adds the missing teeth in. Which doubles the tooth count, halving the calculated value of limit_dt, keeping limit_dt too low... This also explains the hysteresis. I need to "prime" the limit_dt lowpass.
I suspect this is the integer math messing things up. I am just pulling numbers out of thin air here, but lets say that *missing_teeth = 1, limit_dt = 2, delta_counts = 3 and delta_time = 4. This calculation:
limit_dt = 0.9; limit_dt += 0.1 (((missing_teeth) + 0.5) (delta_time / delta_counts));
will then look like this:
limit_dt = 2 0 / (int)0.9 = 0 / limit dt = limit_dt + 0.1 (1 + 0.5) 1 / (int) 4/3 = 1 */
The value assigned is (int)0.15 which is zero, so limit_dt=0. While if double is used during calculation, the result will be limit_dt=2.
-- Happy hacking Petter Reinholdtsen
wgat
[andypugh] I am starting to doubt my sanity on this one. The issue is the quick-and-dirty lowpass filter: val = 0.9 val += 0.1 new_val I suspect you want to work with floating point numbers while doing this calculation. The code in question will in effect do this: hal_s32_t missing_teeth; hal_s32_t limit_dt; rtapi_s32 delta_counts; rtapi_u32 delta_time; limit_dt = 0.9; limit_dt += 0.1 (((missing_teeth) + 0.5) (delta_time / delta_counts)); Several of the calculations are done using whole number types, and I suspect the result is not what you expect. Perhaps something along these lines solve the problem: hal_s32_t missing_teeth; hal_s32_t limit_dt; rtapi_s32 delta_counts; rtapi_u32 delta_time; / Use floating point calculations when filtering / double newval = limit_dt; newval = 0.9; newval += 0.1 (((missing_teeth) + 0.5) (1.0*delta_time / delta_counts)); limit_dt = newval; I believe this ensure floating point calculations are used when multiplying, adding and dividing. … -- Happy hacking Petter Reinholdtsen
I did this..
if (delta_counts) {
double newval = *cntr->limit_dt;
newval *= 0.9;
newval += 0.1 * ((*(cntr->missing_teeth) + 0.5) * (1.0*delta_time / delta_counts));
*(cntr->limit_dt) = newval;
}
but it didn't seem to help...
The numbers are so big that integer / float isn't super-important. limit_dt and delta_time are in nanoseconds.
I think I have fixed it, can you test on hardware? I tested in sim between 1 rpm and 10,000 rpm.
I think I have fixed it, can you test on hardware? I tested in sim between 1 rpm and 10,000 rpm.
I will tonight!
Testing - so far so good!
Great work! sam
Here are the steps I follow to reproduce the issue:
100 line single channel encoder with 1 missing tooth.
The first screenshot shows the index enable not going false in the correct location.
the rest of the screen shots show the index enable going false correctly after increasing the rpm to 320 - then lowering it - show it still works correctly - even at 120 rpm (the scope is in rps)
It worked properly before this:
I have tried master and 2.9 - both exhibit this issue..