pybricks / support

Pybricks support and general discussion
MIT License
109 stars 6 forks source link

[Bug]ev3 micro-python Release R3 Motor problem #7

Closed johnscary-ev3 closed 4 years ago

johnscary-ev3 commented 4 years ago

Looks like when I do a sound file playback and a motor fixed angle move at the same time, the motor can go crazy at high speed and not stop at requested angle.

My application is multi-threaded and so sound playbacks and motor moves can happen at same time. This worked fine in earlier LEGO release of micro-python.

Is this a bug?

laurensvalk commented 4 years ago

Can you share a short example that reproduces this?

If anything, sound and threads should work a lot better in 2.0, although officially we still do not support threads.

All 1.0 methods for sound should still work today, but there is a much improved sound interface in the EV3Brick() class, see https://docs.pybricks.com/en/latest/hubs.html

johnscary-ev3 commented 4 years ago

Hi guys, I think threads are very useful for this real time robot work. Anyway I have uploaded my entire project to my GitHub pages under EV3RSTORM. Hopefully I did it correctly so you can see my files. I am still new to GitHub. I use three threads for doing a scanning Sonic sensor head to detect objects. One thread moves the head (actually I use two heads). One thread looks for objects. And the third one plays a sound file if an object is detected. That allows the heads to keep moving while the sound plays. Also I can turn off the scanning but still do object detection.

The threads are in object_tools.py object_detect(): object_sound_thread() and in motion_tools.py scan_head_thread():

This works ok on the LEGO python release I was using. With R3 system this works ok until an object is detected and the sound file plays. Then the scanning motor in motion at the time takes off at high speed. Sort of exciting, but not healthy for the robot ;0)

Please let me know if you need more details.

johnscary-ev3 commented 4 years ago

Update on this issue: I found while working on upgrading my program to Rev3 for new IR button operation, that even motor moves in the main thread of the program go crazy when a sound file is played in my "object_sound_thread" routine. Not just motor moves in "scan_head_thread" routine.

By the way, the IR buttons are working fine now ;0)

laurensvalk commented 4 years ago

I think threads are very useful for this real time robot work.

Agreed -- We've made major under the hood improvements for threading for 2.0. I see you are using the ev3brick module. Can you try using the new ev3 = EV3Brick() class for sounds and display?

As mentioned earlier, the old ev3brick should still work, but only for the documented API, which did not have threading yet.

Here is another thing that would be useful to know. The motors now have a (hidden) builtin data log feature. It's used like this.

mtr = Motor(Port.B)

mtr.log.start(10*1000) # Allocate log buffer for 10 seconds

# Make the motor move, for example:
mtr.run_target(400, 1500)  

# feel free to do other things here

mtr.log.save('log.txt')  # Save the log file

You can now upload the file to your computer. Can you share the log file that captures where the motor goes crazy?

johnscary-ev3 commented 4 years ago

Thanks for the help.

Good news. Going to new "EV3Brick" class solved the problem of motor moves and playing sound files in threads conflicting and causing crazy motor motions ;0)

Bad news. Now I can only play sound files in one place at a time. If I play a sound file in a thread at same time as main, I will get an error ;0( Previously in LEGO release version this was ok, and the system just waited for one sound file to finish before playing the next one.

Error is: RuntimeError: Playing file failed: aplay: main:788: audio open error: Device or resource busy

laurensvalk commented 4 years ago

Going to new "EV3Brick" class solved the problem of motor moves and playing sound files in threads

Excellent!

Now I can only play sound files in one place at a time.

But that probably makes sense :)

I will get an error ;0(

Ah, that's not so nice. Thanks for reporting it!

I think what we could try to do, is fix this in the same way as we deal with parallel access to other devices like motors and sensors. If we could make that work, it would not give errors, but behave like this:

As soon as you give a new command, the older command is stopped. So in this case, the ongoing sound would stop and the new one would play.

laurensvalk commented 4 years ago

So, we might not be able to change this for the 2.0 release, but here is an alternative you can try:

You can use the try / except to test if you can start a new sound. That way you will know that a sound is playing.

dlech commented 4 years ago

Previously in LEGO release version this was ok, and the system just waited for one sound file to finish before playing the next one.

This behavior was not intentional but it should be easy to recreate.

# create a global lock object
sound_lock =  _thread.allocate_lock()

# then use this function to play sounds
def play_sound_file(file):
    sound_lock.acquire()
    try:
        ev3.speaker.play_file(file)
    finally:
        sound_lock.release()

EDIT: I forgot, there is a easier way to write this (does exactly the same thing):

def play_sound_file(file):
    with sound_lock:
        ev3.speaker.play_file(file)

And really, this pattern should be used when any resource (screen, motor, sensor, etc.) is used from multiple threads as Pybricks is not thread safe.

johnscary-ev3 commented 4 years ago

Thanks guys. Really appreciate the help and advice. I was also thinking of using a "Lock" flag or using "Try" to catch the exception. Your code above seems to use both methods at same time. Is that to be double sure just in case something still goes wrong in "play_file" method?

I will try this code and let you know how it goes.

By the way, it might be good to add something to the docs on doing multi threaded programs eventhough it may not be officially supported. People, like me, are going to try it anyway ;0)

Thanks again.

dlech commented 4 years ago

Is that to be double sure just in case something still goes wrong in "play_file" method?

Sort of. try/finally doesn't actually catch any exceptions. And we don't expect an exception to happen, but just in case it does, we can be sure the that lock is released no matter what this way.

johnscary-ev3 commented 4 years ago

Hi again, The lock code for sound, totally worked and I have no more problems with that.

However I have noticed that I still have some smaller target angle motor move issues. Going to the new library definitely reduced the severity of that but it looks like occasionally I still see motor moves that don't complete correctly. I think there are other operations besides the sound plays that are interfering with motors such as sensor reads that are also going on. I temporarily stopped sound plays and still see motor problems. I suppose I could try to use more of those thread locks for motor moves and the other items going on. Or maybe just drop the multi threading and use a single thread approach.

Do you have any future plans to support this multi threaded approach? Thanks for all the help.

laurensvalk commented 4 years ago

Can you describe those motor issues? What do you mean when you say it does not complete correctly?

If you give the motor a new command from a different thread, it should start doing the new command right away.

It would be great to see the log file of any potential motor problems.

Thanks again for your feedback.

On Sat, 4 Apr 2020 at 06:31, johnscary-ev3 notifications@github.com wrote:

Hi again, The lock code for sound, totally worked and I have no more problems with that.

However I have noticed that I still have some smaller target angle motor move issues. Going to the new library definitely reduced the severity of that but it looks like occasionally I still see motor moves that don't complete correctly. I think there are other operations besides the sound plays that are interfering with motors such as sensor reads that are also going on. I temporarily stopped sound plays and still see motor problems. I suppose I could try to use more of those thread locks for motor moves and the other items going on. Or maybe just drop the multi threading and use a single thread approach.

Do you have any future plans to support this multi threaded approach? Thanks for all the help.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/pybricks/support/issues/7#issuecomment-608971885, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC6BKYMVOM6IGNKYSSNKMTTRK2Z3RANCNFSM4LWIRNBQ .

johnscary-ev3 commented 4 years ago

Hi, I tried to make a video of what I am seeing. It is sort of intermittent now. Link below. There is one thread doing the scan head motor moves using run_target moves with no wait. So the two scan heads appear to move at same time back and forth stopping at 5 different angles. While that is happening another thread is reading the distances from both sonic sensors. If an object is detected (<300mm) then the Main thread changes the light to Orange from Green. Also the main is reading the Buttons on IR sensor to take commands. A lot going on ;0) In the video you should see mostly good head scans but one glitch when motor wanders in wrong direction and then gets back on track with next move command. It is not too bad, but seems to show one of the moves failed to execute. I still need to try the log file thing. Thanks.

https://github.com/johnscary-ev3/EV3RSTORM/blob/master/IMG_9819%201.mov

johnscary-ev3 commented 4 years ago

Update: I made a few changes and the random motor issues appear now to be gone:

Made the two scan motor moves by doing "No Wait" on first and then a "Wait" on second. So heads move at same time but my program knows exactly when it is done. Changed to COAST stops instead on BRAKE. Not sure if that made any difference. Used the thread locking to synchronize the head moves and the sonic sensor reads. After the heads stop moving, I unlock the Sonic sensor reading which is still in a separate thread. I suspect that maybe there were just too many Sonic sensor reads going on. Now just reads distances between head moves which is fewer reads for the system to execute and less for system to do simultaneously.

Anyway this is working much better now. I really appreciate all the help on this.

Probably we can close this issue unless you have some additional comments for me.

laurensvalk commented 4 years ago

Probably we can close this issue unless you have some additional comments for me.

While I'm glad you solved it for your project, I would still really like to know the underlying cause - if you want to help a bit with the following, that would be awesome.

Mainly, I would be looking to reproduce the problem on my end. Ideally without building a copy of EV3RSTORM. Perhaps I could just plug in the same motors and sensors as you.

So it would be great if you could you tell me:

This can perhaps all be found in your code, but you know your code best. I appreciate the feedback :)

Thanks!

johnscary-ev3 commented 4 years ago

Ok great. I think what I will do is take my code, which includes many options for things like following lines, finding IR Beacons, Gyro based turns, etc., and make a reduced version that just does the sonic head scans and plays sounds using the threads. Then it will be easier to work with. I am using the medium motors for these sanning heads and they have a hard stop used when I initialize the program to set a zero degree reference point. After that init, I use run to target angle to move the motors. I will take some photos of that. The reduced version will allow one or two heads like I have, and will list all the ports for you. Also I will keep the IR button reading in the main since that is also running while I do the sonic head scans. Will update you when I have this new version ready. Thanks for your interest in digging deeper into this.

laurensvalk commented 4 years ago

Thank you, a reduced version sounds like a good plan!

johnscary-ev3 commented 4 years ago

Hi again, I created the "reduced" version of my EV3RSTORM program called EV3R_SCAN. Link below. It just does the scanning heads and object detection using threads. I created a readme file with some details and uploaded a number of photos of both the right and left heads I built. They are not quite identical since I was running low on LEGO parts ;0) The program runs fairly well now with all the changes we made but you may still see occasional "jerky" motions or the head miss a target angle and "wander" between moves. Let me know if you need more info. Thanks for all the support.

https://github.com/johnscary-ev3/EV3R_SCAN

laurensvalk commented 4 years ago

Thank you! I'll have to look at that. If you want to try and produce that log file, that would still be quite helpful too.

johnscary-ev3 commented 4 years ago

I will work on that motor log file also. By the way I noticed that my full function software tends to "glitch" more when I play sound files in the main loop compared to the reduced version. So I am going to add a play sound files option to reduced version also to see if that makes a difference.

johnscary-ev3 commented 4 years ago

Hi again, I modified my EV3R_SCAN program to include playing sound files in main loop and also doing the motor data logging as you mentioned. Not both at same time, but maybe I sould do that next. Anyway, I recorded 10 seconds on both A and D motors while scanning. The log files are in: https://github.com/johnscary-ev3/Data_logs

Do you have any documentation on how to interpret the contents of these log files? Thanks.

laurensvalk commented 4 years ago

Thank you for sharing! When you created these files, did you see the issue with the motors? The logs look quite good to me. I don't see the issue in this experiment.


Some info about the data

The first plot shows the speed (rate of change) of the motors. The second shows the angle counter.

The EV3 gets two "counts" per degree, so that is why everything is twice as big.

Blue lines are the measured values. The orange lines are the references.

While you are braking/coasting (Stop.BRAKE, Stop.COAST), there is no active control going on, so that is why the orange line jumps to 0. If you omit the argument (it defaults to `Stop.HOLD), the lines would continue. (Brake/coast is fine though, I just wanted to explain why the lines are 0 there.)

The very last plot shows how long the control loop (also on a thread) takes. It in the order of 10 ms on average, so that looks all fine too.


But in other words, both the blue and the orange line seem to do what you want in this test.

If you want to try to reproduce this file in your original, bigger program, that would be interesting.

Basically, I'm wondering if there is a math problem in the Pybricks code (then, the orange lines would be wrong), or if the motors have problems following the reference (then, the blue lines would not follow the orange targets.

logA April 9 2020

image

logD April 9 2020

image


And these are the columns. The plots are just columns from the log file:

count = [
    ('count', data[:, 2]),
    ('count_ref', data[:, 6]),
]

err = [
    ('err (either count/rate)', data[:, 8]),
]

err_integral = [
    ('err integral (either count/rate)', data[:, 9]),
]

rate = [
    ('rate', data[:, 3]),
    ('rate_ref', data[:, 7]),
]

control = [
    ('duty', data[:, 5]),
]

clock = [
    ('maneuver clock', data[:, 1]),
]
johnscary-ev3 commented 4 years ago

Wow! Thanks for all the data analysis and explanations. This was mainly a test run of the motor data collection. I don't think I saw any glitches during this 10 seconds which is basically one full cycle of the scan heads. I think I will need add some more activity back into the main loop to see if I can cause conditions that may cause a glitch and then data log it. What is a reasonable period of time to data log without running out of memory space?

laurensvalk commented 4 years ago

We've designed brand new motor drivers for Pybricks, so we've been looking at these kind of graphs a long time to ensure everything is working - so plugging in different data is no problem :)

You can set it to a minute before getting an error that says you can't make it bigger.

If I have to find one rare phenomenon like yours, I usually make logs of 60 seconds in a loop. And when the phenomenon occurs (and you wait for the minute to ends so the log is saved), I stop the program, and observe the logs.

I think I will need add some more activity back into the main loop to see if I can cause conditions that may cause a glitch and then data log it.

Yes, sounds good. Maybe you can just add the log to your original program with all the threading - that should be fine. It would be best to make the log in the same thread in which you already use the motors.

johnscary-ev3 commented 4 years ago

I modified my EV3R_SCAN program to keep doing 60sec logs and saving it on top of old file. Then if I see something I wait for log to complete and then stop the program, as you mentioned. I captured one set of log files after I saw some " jerky" motions. I was playing sound files from main loop while scan was in progress. The files are in the same GitHub spot. Link below. I also put the data into a spreadsheet and did some plots. On motor A, there seem to be some "events" at 30K and 50K on x-axis. At 30K it looks to me that there is a big overshoot and motor is driven back to correct position fast. At 50K the motor seems to be at correct position but then is driven out of position for some reason. Also there are similar but smaller events at same x-axis points on D motor. I am sure you will be able to see more in this data. I will run this some more and see if I find any more significant events to study. I aslo updated my EV3R_SCAN program on GitHub to latest one I ran. Thanks, https://github.com/johnscary-ev3/Data_logs

laurensvalk commented 4 years ago

Thanks for sharing, that's very helpful. Indeed, most of the time it is going well. I see an issue where the orange reference line is not correct. In turn, this causes the motor to "overcompensate" and you see a jerky movement.

image

The orange reference lines are calculated as a (smooth) function of time. However, there seems to be an error in the time. They should all be triangles but the middle one is not:

image

Normally, I would have argued that the problem is in the algorithm, but it's suspicious that it happens to motor A and D at the very same time.

So, digging further, here is the plot of the time between two samples. It should be around ~10 ms. But in your sample, it has big spikes. And occasionally this happens during the motor command, and that throws it off:

So in other words, the triangle in the plot is not wrong - it just doesn't get any samples in the mean time.

image

The time between the spikes, does that match anything on the threads? For example, do you play a sound at those intervals?

johnscary-ev3 commented 4 years ago

Thanks the detailed analysis of this data. There were sound files playing in an object detect sound thread and also in main loop. One is from object detects and other was just busy work in the main. The sounds are coordinated so they don't interfere with each other using those thread locks you mentioned. However the sounds and motor moves are not coordinated. Does playing sound files move a bunch of data that could take CPU time away from the motor sampling and cause this? It is useful to play sound files and do motor moves at same time so the robot can "walk and talk" at same time. Do you have any suggestions to prevent these motor sampling drop outs? Suppose I could also do a thread lock between these target angle motor moves and playing sound files. But that might look funny since the sounds would delay the moves I think. Thanks.

johnscary-ev3 commented 4 years ago

Hi, I found a more interesting motor "event" where the D motor "wandered" away from target between the moves. At about this time two sound files played about detecting an object. Other sound files were not playing. From the delta_t plot those spikes are likely the sound files playing but they seem to be after the "wander" event. Any ideas on this one? The data is in my same GitHub location and are denoted as the A2 and D2 files. Thanks,

image

laurensvalk commented 4 years ago

Huh, that’s interesting! Thanks for the update and for making the visualization already.

I suspect that the brake function may have failed. 🤔 I will have to look at the data to be sure.

Are the motors under mechanical load or can they spin freely?

johnscary-ev3 commented 4 years ago

Motors have those ultra sonic sensors mounted on them but are free to move otherwise. There is a mechanical stop so I can set the zero point, but not sure if the head hit that in this event.

laurensvalk commented 4 years ago

Good to know, thanks. It probably didn’t hit it initially as the count value keeps increasing.

Would you mind trying the latest beta build to see if things have improved? We did change a few things to improve error handling of the break function, which should mean that this no longer should occur.

johnscary-ev3 commented 4 years ago

Ok. I have the 3/23 version image but looks like you have a 4/10 version now ;0) Should I try that?

========== ev3dev-sysinfo ========== Image file: ev3dev-stretch-ev3-generic-2020-03-23 Kernel version: 4.14.117-ev3dev-2.3.5-ev3 Brickman: 0.10.3 BogoMIPS: 148.88 Bluetooth: 4.0 Board: board0 BOARD_INFO_HW_REV=8 BOARD_INFO_MODEL=LEGO MINDSTORMS EV3 BOARD_INFO_ROM_REV=6 BOARD_INFO_SERIAL_NUM=001653655C7F BOARD_INFO_TYPE=main

dlech commented 4 years ago

Should I try that?

yes

johnscary-ev3 commented 4 years ago

Looks like you have a "buster" image as well as the "stretch" version. Could not find anything that tells me about the difference? Thanks.

dlech commented 4 years ago

use stretch

johnscary-ev3 commented 4 years ago

Hi, I updated to ev3dev rev 4/10/20 and ran my software again. Unfortunately I was able to capture another of these "wander" events with new rev. See below. This was running some sound files while moving motors. Also noticed that sometimes the motors would go wild (high speed moves) when saving the log and starting a new log. Did not see that before. Data is in my GitHub location as A3 and D3 file names.

https://github.com/johnscary-ev3/Data_logs

image

laurensvalk commented 4 years ago

Thanks for sharing. Below is the full version of A3 with a close up around 9-11 sec.

A few thoughts:

image

johnscary-ev3 commented 4 years ago

Actually I was using Stop.COAST. I changed to Stop.HOLD and then ran numerous times with sound files playing and did not see any "wandering" events anymore ;0) The positions between moves also look more accurate. I guess this is because the motors are constantly being controlled even between moves now. Is that correct? The plot below, and others I made on new data logs (A4/D4 files), seem to indicate that is the case. So even if there is a sound file interruption it is quickly corrected it seems. I did need to have the threads stop any moves while saving log files and restartng logging or I would see sudden motor moves at that time. This makes sense. Is there any down side by using HOLD mode? Thanks for the help and education on LEGO motor operations ;0) Stay safe!

image

laurensvalk commented 4 years ago

The positions between moves also look more accurate. I guess this is because the motors are constantly being controlled even between moves now. Is that correct?

Correct!

The plot below, and others I made on new data logs (A4/D4 files), seem to indicate that is the case. So even if there is a sound file interruption it is quickly corrected it seems.

Thanks a lot for testing. Indeed, in those cases, a 60 ms time spike / hickup is just fine if there is feedback control after that.

And it was even fine in most cases in your previous tests also. But it went wrong if the spike occured just when the motion was beginning or ending.

Still, you helped us find an issue about brake not working as expected, so we will work to improve that. Thanks!

I did need to have the threads stop any moves while saving log files and restartng logging or I would see sudden motor moves at that time. This makes sense.

Ah, yes. Saving the log could probably be improved too - it is very much an internal feature though, to help us investigate this kind of thing. But improving this makes sense, thanks for pointing it out.

Is there any down side by using HOLD mode?

Not really. It is also the default in LEGO's graphical languages. It does use a bit more battery, since the motor is being actively controlled.

In low-load applications like yours, this should not make a major difference, but it can make a bigger difference in applications like robot arms carrying a heavy weight.

laurensvalk commented 4 years ago

Now for extra fun/challenge, you can look at track_target and make it follow a sin pattern as a function of time :smile:

johnscary-ev3 commented 4 years ago

Great idea! I wrote a function using track_target and sine function. The plots below show the results I got. I did 200 steps for each cycle and took Sonic distance readings every 20 steps. In the Count_ref plot for Motor A it basically shows what I programmed with the small delays every 20 steps. The Count plot shows what I got. Seems like more jumps in the Count plot than what I expected. Am I going too fast or any ideas to smooth it out some? Thanks,

image

image

laurensvalk commented 4 years ago

I did 200 steps for each cycle and took Sonic distance readings every 20 steps. In the Count_ref plot for Motor A it basically shows what I programmed with the small delays every 20 steps.

How did you compute the sine value? Similar to this:

sin(time_now*scale_factor)

or more like this?

sin(loop_counter*scale_factor)

The count_ref should be smoother with the first option, as the EV3 is not a real-time system. I'm asking because your description sounds more like the second option.

One way yo get the time is by doing watch = StopWatch() at the beginning of your program and then reading watch.time().

Unless we are still seeing threading effects, of course. Does it also look like this without the sounds in parallel?

laurensvalk commented 4 years ago

Unfortunately I was able to capture another of these "wander" events with new rev. See below. This was running some sound files while moving motors.

Is this also reproducible with the "reduced" version? I'm hoping to fix it, but I'm not sure how to reproduce the problem.

johnscary-ev3 commented 4 years ago

Yes all the data log files came from the reduced EV3R_SCAN program. The current version on GitHub has Stop.HOLD for the target angle moves but I had COAST when I saw the "wandering". That is in the motion_tool.py file. Also you need to enable the sound files playing using Chan4 Beacon button. Chan4 Left_up button will stop the program.

I am still working on the "time based" Sine scan motion. I was using an index counter as you mentioned above. To make the motion smoother I can't really take a "time out" to do the sonic sensor reads anymore, so I and going to just do them while head is moving and see how that goes.

johnscary-ev3 commented 4 years ago

Hi, I updated my object scan thread to use time base and got data below. The count_ref looks like smooth sine function now but count curve is still a bit bumpy. Note I am doing two motors at same time. Maybe that has an effect? The cycle time is 3000msec and every 300msec I allow a sonic sensor read to happen. Should I expect any better tracking by the motor? No sound files happening on this run, but the main loop is still running and reading the IR sensor, checking for a button press.

image

laurensvalk commented 4 years ago

I tried a basic test as well (without sensors or threads), and I'm getting a smoother graph.

In your code, is something happening approximately 6-7 times per second? the interruptions look very periodic.

Can you share the log file that you used to produce that graph?

johnscary-ev3 commented 4 years ago

Hi, This is the best I could get by basically eliminating everything except the thread doing the sine function moves on one motor. The main loop is still running but not doing anything except 3000msec waits. The main loop IR sensor reads may have been those 6-7 times per second activities but I turned those off. Also no sonic sensor read or sound files running. Not too bad, but a little bumpy at the extreme angles. How does this compare to your test?

This is from logM1_2 sheet in this workbook with my Sine fuction test results: https://github.com/johnscary-ev3/Data_logs/blob/master/log_motor_sine_tests.xlsx

image

laurensvalk commented 4 years ago

Is this a medium motor? Then it looks about right.

Near the extreme angles, the speed is nearly 0, and friction plays a big role.

johnscary-ev3 commented 4 years ago

Yes it is medium motor.

Thanks for all the help and training about the motors. I am really amazed by what we can do with this LEGO hardware and your Python software. Keep up the great work!

laurensvalk commented 4 years ago

Thanks for all the help in investigating this! It seems then that most of the questions here are resolved.

As a final question, did you use the Ultrasonic Sensor in silent=True mode or just regular mode?

The "wandering" phenomenon is still an issue, which I've moved to a dedicated issue in #24 .