joan2937 / pigpio

pigpio is a C library for the Raspberry which allows control of the General Purpose Input Outputs (GPIO).
The Unlicense
Script status not correct #392

Open marko-pi opened 4 years ago

marko-pi commented 4 years ago

Consider this program

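import pigpio

gpio = pigpio.pi()  # connect to the local pigpiod daemon

# Pins 18 and 24 are wired together, so pin 24 reads back pin 18.
gpio.set_mode(24, pigpio.INPUT)
gpio.set_mode(18, pigpio.OUTPUT)

# Script A: set pin 18 high, wait 100 ms, set it low, wait 1 ms.
scripta = gpio.store_script(b'w 18 1 mils 100 w 18 0 mils 1')
# Script B: wait 10 ms, read pin 24, store the result in p0, wait 1 ms.
scriptb = gpio.store_script(b'mils 10 r 24 sta p0 mils 1')

tot = []
for i in range(200):
    gpio.run_script(scripta, [])
    # Busy-wait until script A reports that it has halted.
    while True:
        a, _ = gpio.script_status(scripta)
        if a == pigpio.PI_SCRIPT_HALTED:
            break
    gpio.run_script(scriptb, [255])  # p0 starts at 255 as a sentinel
    # Busy-wait until script B reports that it has halted.
    while True:
        b, c = gpio.script_status(scriptb)
        if b == pigpio.PI_SCRIPT_HALTED:
            break
    tot.append(c[0])
print(tot)

gpio.delete_script(scripta)
gpio.delete_script(scriptb)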

Note that the mils 1 at the end of each script is only there to make sure that the pin change or the write to the parameters has taken place.

The idea is that script A sets pin 18 to High, waits 100 ms and sets pin 18 back to Low. I am spying on pin 18 with script B (pins are connected).

Also note that I check the status of each script and do not proceed to the next script until the current one finishes.

Obviously, the result of spying should be 100% Low (0). Well, in a few percent of cases I get High (1)!

I first noticed this problem using much more complex scripts. With an oscilloscope I found that script A sometimes takes several ms to actually start after run_script, and during that time its status is still PI_SCRIPT_HALTED. So in this time script B starts and (with a little "luck") checks pin 18 right after the first script sets pin 18 to High.

While doing the tests I realised that I sometimes get 255. This is a completely unintended additional proof of my narrative: script B takes several ms to start, in the meantime its status is still PI_SCRIPT_HALTED, the program proceeds, and it sees that the parameter has not (yet) been changed.

I used Raspberry Pi 3.

guymcswain commented 4 years ago

To run multiple threads beating on the same hardware it is safer to use the Python interface and share the pigpio instance between the threads.

marko-pi commented 4 years ago

@guymcswain Honestly, I don't understand what you suggested.

My workaround was to check parameters instead of status. Since I used all parameters to the full 32 bits, the Python program remembered the value of the first parameter before starting the script and then waited until the script increased that value by 1 at its very end. Messy. But at least I have had no problems since.
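The parameter-based workaround described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the code actually used: the helper name run_and_wait, its timeout handling, and the example script text are all hypothetical, and pi is expected to be a connected pigpio.pi() instance. The script command inr p0 is assumed to increment parameter 0 as the script's last step.

```python
import time

# Hypothetical script text: do the work, then bump p0 as the very last step.
# 'inr p0' is assumed to increment parameter 0 in the pigpio script language.
EXAMPLE_SCRIPT = b'mils 10 r 24 sta p1 inr p0'


def run_and_wait(pi, script_id, params, timeout=2.0, poll_interval=0.001):
    """Run a stored script and wait until it has incremented parameter 0.

    Completion is detected from the parameters, not from the status word,
    so a stale PI_SCRIPT_HALTED reading cannot end the wait early.
    Returns the final parameter tuple or raises TimeoutError.
    """
    params = list(params)
    if not params:
        raise ValueError("params must contain at least the counter in p0")
    # Parameters are 32-bit values, so the counter wraps around at 2**32.
    expected = (params[0] + 1) & 0xFFFFFFFF
    pi.run_script(script_id, params)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        _status, current = pi.script_status(script_id)
        # script_status returns an empty parameter tuple on error.
        if current and (current[0] & 0xFFFFFFFF) == expected:
            return current
        time.sleep(poll_interval)
    raise TimeoutError("script did not signal completion in time")
```

With a script stored from EXAMPLE_SCRIPT, a call such as run_and_wait(pi, script_id, [counter]) would return the parameters once p0 equals counter + 1, and the pin reading would then be in element 1 of the result.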

My point is: either script status is repaired, or it is removed. It is simply misleading and unreliable. I wondered for months why my program sometimes did not work properly. Only after I used a signal analyzer/oscilloscope did I finally find the problem.

guymcswain commented 4 years ago

My point is to not use multithreading in this manner. If you can demonstrate the issue on a single script thread then it is likely a problem with the API.

marko-pi commented 4 years ago

I have demonstrated this on a single script. The number 255 comes from a single script alone.

guymcswain commented 4 years ago

Your example given above has two scripts: a and b. What am I missing?

marko-pi commented 4 years ago

First I wanted to show it with two scripts. Then you get those 1s instead of 0s. But then, completely unintentionally, 255s also appeared. The 255s come from the second script only. It is all explained in my first post.

guymcswain commented 4 years ago

"Also note that I check the status of each script and do not proceed to the next script until the current one finishes."

Ok, I see you are running them serially.

marko-pi commented 4 years ago

Essentially, you could delete the first script and run only the second one serially and you would still get spurious 255s. I left it as it is because it is a double proof of my bug report, and also arguably easier to understand.

guymcswain commented 4 years ago

I ran your python script on a RPI0W and all output is 0 (low). Is this issue only on RPI3?

guymcswain commented 4 years ago

I'm running V78 but it shouldn't matter - the script APIs haven't changed in a long time.

marko-pi commented 4 years ago

Actually, now that you ask, I did detect this problem on the RPi Zero W, but much, much more rarely. My instant speculation is that the Zero is slower, so the Python program gives the pigpio script enough time to actually start and change its PI_SCRIPT_HALTED status. Perhaps I could tune the times in the scripts (mils 1, mils 10, mils 100) to get more failures on the Zero too, just as I tuned them for the RPi 3. But at the moment I have no Zero at hand.

guymcswain commented 4 years ago

Ok, that may hint at some kind of interaction between the two scripts. Can you insert gpio.stop_script(scriptx) statements after you detect that the script has halted, and run that on your setup?

marko-pi commented 4 years ago

I don't quite understand what you suggest.

But to prove my point that this is not due to the interaction between two scripts, I simply commented out scripta completely, and I am still getting 255s, just as I expected. It seems that these spurious things usually happen when RPi is busy doing other things, possibly delaying starting of scripts.

guymcswain commented 4 years ago

I just ran your original program, unaltered, on a RPI3B+ with no failure.

guymcswain commented 4 years ago

What version of pigpio are you using and how did you install it?

marko-pi commented 4 years ago

Not sure. I think I installed that years ago. But I regularly do updates.

How can I see the version of pigpio?

I started the program from the command prompt. I get one or two 1s or 255s, so it is better than starting it from Geany.

If I close Geany completely, then I get one "1" every five tries.

guymcswain commented 4 years ago

From a terminal, running pigs pigpv should return the version.
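The same check can also be made from Python over the connection the test program itself uses, which shows which daemon the program is really talking to. A minimal sketch, assuming pi is a connected pigpio.pi() instance; get_pigpio_version() is the Python counterpart of pigs pigpv, while the helper name and the threshold of 78 are only illustrative:

```python
def daemon_is_at_least(pi, minimum=78):
    """Return True if the connected pigpio daemon reports version >= minimum.

    The helper name and the default threshold are illustrative assumptions.
    """
    version = pi.get_pigpio_version()
    print("pigpio daemon version:", version)
    return version >= minimum
```

Calling daemon_is_at_least(pi) at the start of a test run would make a mismatch between an installed and a running daemon visible immediately.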

guymcswain commented 4 years ago

You should download a recent version, recompile and install. Then run the suite of tests that come in the zip file. If these don't pass there is something wrong/incompatible with your environment.

marko-pi commented 4 years ago

I started the program and a moment later I started Chromium. I got about ten 255s and one 1.

So if you overload the Raspberry with other programs, the errors multiply considerably.

Try starting my program and overloading the Raspberry with something, such as launching a big program.

marko-pi commented 4 years ago

pigs pigpv returns 71

guymcswain commented 4 years ago

I'm not looking into this any further until you can run the pigpio test successfully.

marko-pi commented 4 years ago

OK, how do I test pigpio?

guymcswain commented 4 years ago

There are several. Start with this one:

wget https://github.com/joan2937/pigpio/blob/master/x_pigpio.py
./x_pigpio.py

marko-pi commented 4 years ago

I don't know if this is OK with you, but I ran apt-get purge pigpio followed by apt-get install pigpio

and I again have version 1.71

I killed pigpiod and restarted pigpiod

I ran the program again and, just a second after starting it, I also started Chromium to overload the Raspberry, and I got 5-6 1s.

guymcswain commented 4 years ago

Download and install latest version

wget https://github.com/joan2937/pigpio/archive/master.zip
unzip master.zip
cd pigpio-master
make
sudo make install

If the Python part of the install fails it may be because you need the setup tools.

sudo apt install python-setuptools python3-setuptools

Run a test from the same directory:

sudo pigpiod
./x_pigpio.py

marko-pi commented 4 years ago

I did what you said, it went smoothly, but somehow it is still version 71 (pigpiod -v)

I killed pigpiod and I restarted it from pigpio-master directory and I also moved my bugger.py into pigpio-master directory and started it from there.

guymcswain commented 4 years ago

Your path may be picking up the older version. You'll need to resolve that somehow.

marko-pi commented 4 years ago

Hmmm sudo pigpiod -v returns 78

and I start

sudo ./bugger.py

and I still get 1s and 255s.

I also always started pigpiod with sudo, so perhaps I was using ver. 78 all along.

(you actually cannot start pigpiod without sudo)

guymcswain commented 4 years ago

And the result of pigpio test?

marko-pi commented 4 years ago

You mean the test of running my program?

If I run sudo ./bugger.py and immediately start Chromium to overload the Pi, I get a bunch of 1s and 255s.

Without overloading the Pi I get 0s only.

guymcswain commented 4 years ago

No. Run the x_pigpio.py test from the downloaded files.

marko-pi commented 4 years ago

PASS PASS PASS... Everything passes: five screens of PASSes overall.

guymcswain commented 4 years ago

With Chromium running?

marko-pi commented 4 years ago

I was constantly opening and closing Chromium to keep RPi busy and it was still all PASS.

marko-pi commented 4 years ago

OK I have to go for today.

Maybe it would be helpful to repeat that I get extremely rare 1s and 255s if the Pi is doing nothing else and the program is started from the prompt. If I overload the Pi by opening and closing Chromium, I get bunches of 1s and 255s.

guymcswain commented 4 years ago

Ok, that's good. So can you demonstrate the issue without having to run a desktop environment and a Chromium browser? I have little inclination to investigate otherwise.

marko-pi commented 4 years ago

Could you just confirm that you see the error under the conditions described? So we make sure that it isn't just some astronomical odds that I possess corrupted RPis?

guymcswain commented 4 years ago

I'd rather not try to duplicate your conditions as they don't seem very conventional. I did run your script in a loop overnight on RPI0W along with another program that attempted to thrash the network somewhat in hopes of putting more load on the single core host. The loop is still running without error.

That said, you may have found some corner case causing the pigpio script to fail. Given that, I would recommend finding another way to achieve your goals. As I stated in my first comment, I recommend that you use threads in Python and share a single pigpio instance.
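One possible reading of this recommendation, as a minimal sketch only: the work of the two scripts is done by ordinary Python code on a worker thread that shares one pigpio.pi() instance, and completion is detected with join() instead of a status poll. The function names are hypothetical, pi is assumed to be a connected pigpio.pi() instance, and the pin numbers are taken from the test program above.

```python
import threading
import time


def pulse(pi, pin, seconds):
    """Drive pin high for roughly `seconds`, then low again."""
    pi.write(pin, 1)
    time.sleep(seconds)  # timed by Python, so less precise than 'mils'
    pi.write(pin, 0)


def pulse_then_sample(pi, out_pin=18, in_pin=24, seconds=0.1):
    """Run the pulse on a worker thread, then read the input pin.

    join() returns only after pulse() has finished, so the read cannot
    overlap the pulse the way a stale script status allowed.
    """
    worker = threading.Thread(target=pulse, args=(pi, out_pin, seconds))
    worker.start()
    worker.join()
    return pi.read(in_pin)
```

The trade-off in this sketch is timing accuracy: the pulse width now depends on Python's sleep and on the round trip to the daemon, which may or may not be acceptable for a given device.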

marko-pi commented 4 years ago

A more conventional situation, and my original one, is running Flask as a web server and controlling an LCD display via pigpio scripts (that is, essentially controlling an LCD display over the web). In that case the RPi has a decent workload apart from that of the scripts, and thus the above-mentioned conditions appear naturally.

Of course these conditions would be even more difficult for you to replicate, which is why I wrote this simple test program. I could perhaps mimic the workload by including some heavy calculation in my test program, but I guess this would not satisfy your strict testing conditions.

Whether my original situation is some corner case is of course strongly arguable.

guymcswain commented 4 years ago

My recommendation stands even if you convince me that your use case is legitimate - I didn't mean to imply it wasn't.

marko-pi commented 4 years ago

Well, as I already pointed out, I found a workaround. I am signalling through parameters - messy, but it works, and I haven't had a single instance of the error since.

But someone might end up in my situation in the future - losing months (not continuously, of course) to figure out what the hell is wrong. On the other hand, I must give you some credit about the "corner case", as I don't see a lot of programmers using serious scripting, since it requires Assembly knowledge - which I fortunately learned on the ZX Spectrum in the 1980s.

guymcswain commented 4 years ago

You may want to read through the findings of issue #357. I think your situation* is similar. There is no plan to rewrite the underlying code to address this situation when a single-instance Python interface resolves it. This needs to be documented, however, and that's on me.

* Python is one thread, each pigpio script you invoke is another thread.

guymcswain commented 2 years ago

This issue exposes a race between the OS scheduling a thread ('the script') to run and the network latency in polling for the script's status. Because the script's running state is very short-lived, it is possible never to catch it in the running state by polling for the status.
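The race can be illustrated with a toy model that needs no hardware. This is only a sketch of the idea, not pigpio's internals: the ToyScript class and its delays are invented for illustration, and only the numeric values of the two status constants mirror PI_SCRIPT_HALTED and PI_SCRIPT_RUNNING.

```python
import threading
import time

HALTED, RUNNING = 1, 2  # same values as PI_SCRIPT_HALTED / PI_SCRIPT_RUNNING


class ToyScript:
    """Toy stand-in for a stored script whose thread starts late."""

    def __init__(self, start_delay, run_time):
        self.status = HALTED
        self.result = 255  # sentinel, like p0 in the test program
        self._start_delay = start_delay
        self._run_time = run_time
        self._worker = None

    def run(self):
        """Return at once; the body only begins after the start delay."""
        self._worker = threading.Thread(target=self._body)
        self._worker.start()

    def _body(self):
        time.sleep(self._start_delay)  # stands in for scheduling latency
        self.status = RUNNING
        time.sleep(self._run_time)
        self.result = 0
        self.status = HALTED

    def join(self):
        self._worker.join()


script = ToyScript(start_delay=0.05, run_time=0.01)
script.run()
# Polling immediately after run() sees the stale pre-start state.
early_status, early_result = script.status, script.result
script.join()
late_status, late_result = script.status, script.result
print("early:", early_status == HALTED, early_result)
print("late: ", late_status == HALTED, late_result)
```

Both snapshots report the halted state, but only the second one carries the real result; the first still holds the 255 sentinel, which matches the spurious 255s reported earlier in the thread.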

Furthermore, polling for a script's status in a tight loop is a waste of network bandwidth. Therefore, I believe a better approach is needed in such situations.

As workarounds I would offer several solutions: 1) Use a parameter in the script to flag the fact that the script has run to completion. 2) Trigger an event at the completion of the script.

I prefer the use of an event which can be monitored asynchronously without wasting network bandwidth. I'm considering building a PI_SCRIPT_HALTED_EVENT into a future release of pigpio to make all of this more user friendly.
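The event-based workaround could be sketched as follows. This is a minimal sketch under assumptions, not a finished API: the helper name, the choice of event number 5, and the example script text are hypothetical, and the script command evt 5 is assumed to trigger event 5 as the script's last step. The Python calls event_callback, run_script and script_status belong to the pigpio module; pi is expected to be a connected pigpio.pi() instance.

```python
import threading

SCRIPT_DONE_EVENT = 5  # hypothetical choice; pigpio events are numbered 0-31

# Hypothetical script text: 'evt 5' as the last command is assumed to
# trigger event 5 once the reading has been stored in p0.
EXAMPLE_SCRIPT = b'mils 10 r 24 sta p0 evt 5'


def run_script_and_wait_for_event(pi, script_id, params,
                                  event_id=SCRIPT_DONE_EVENT, timeout=2.0):
    """Run a stored script and block until it triggers `event_id`.

    The callback is registered before the script starts, so the event
    cannot fire unnoticed between run_script and the wait.
    Returns the final parameter tuple or raises TimeoutError.
    """
    done = threading.Event()
    cb = pi.event_callback(event_id, lambda event, tick: done.set())
    try:
        pi.run_script(script_id, params)
        if not done.wait(timeout):
            raise TimeoutError("script did not trigger its event in time")
        _status, final_params = pi.script_status(script_id)
        return final_params
    finally:
        cb.cancel()  # always remove the callback again
```

Compared with polling, this waits on a single notification and sends one status request at the end, which is the bandwidth saving mentioned above.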