Open marko-pi opened 4 years ago
To run multiple threads beating on the same hardware it is safer to use the Python interface and share the pigpio instance between the threads.
@guymcswain Honestly, I don't understand what you suggested.
My workaround was to check the parameters instead of the status. Since I was already using every parameter to its full 32 bits, the Python script remembers the value of the first parameter before starting the pigpio script and then waits until the script increases that value by 1 at its very end. Messy. But at least I have had no problems since.
My point is: either the script status is fixed, or it is removed. It is simply misleading and unreliable. I wondered for months why my program sometimes did not work properly. Only after I used a signal analyzer/oscilloscope did I finally find the problem.
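The parameter-based workaround described above could look roughly like this from the Python side. It is only a sketch: `run_and_wait` is a hypothetical helper, it assumes the stored script increments parameter 0 as its very last step (for example with `inr p0`), the timeout and polling interval are arbitrary, and wrap-around of a parameter that is already at its maximum value is not handled.

```python
import time


def run_and_wait(pi, script_id, params, timeout=1.0, poll=0.001):
    """Start a stored pigpio script and wait until it signals completion.

    Hypothetical helper. Assumes the script increments parameter 0 as its
    very last step, so the (unreliable) status value is never consulted.
    """
    expected = params[0] + 1          # value p0 should hold once the script is done
    pi.run_script(script_id, params)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        _status, current = pi.script_status(script_id)
        # script_status can return an empty parameter tuple on error
        if current and current[0] == expected:
            return list(current)
        time.sleep(poll)
    raise TimeoutError("script %d did not signal completion" % script_id)
```

With a connected `pi = pigpio.pi()` and a script id returned by `pi.store_script(...)`, a call such as `run_and_wait(pi, script_id, [0] * 10)` would block until parameter 0 reads 1, regardless of what the status reports in the meantime.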
My point is to not use multithreading in this manner. If you can demonstrate the issue on a single script thread then it is likely a problem with the API.
I have demonstrated this on single script. The number 255 is because of the single script.
Your example given above has two scripts: a and b. What am I missing?
First I wanted to show on two scripts. And then you get those 1s instead of 0s. But then, completely unintentionally, 255s also appeared. 255s are because of the second script only. It is all explained in my first post.
Also note, that I am checking the status of the scripts and do not allow to proceed with next script until the current script finishes.
Ok, I see you are running them serially.
Essentially, you could delete the first script and run only the second one serially and you would still get spurious 255s. I left it as it is because it is a double proof of my bug report, and also arguably easier to understand.
I ran your python script on a RPI0W and all output is 0 (low). Is this issue only on RPI3?
I'm running V78 but it shouldn't matter - the script APIs haven't changed in a long time.
Actually, now that you ask, I did detect this problem on the RPi Zero W, but much, much more rarely. My instant speculation is that the Zero is slower, so the Python script gives the pigpio script enough time to actually start and change the `PI_SCRIPT_HALTED` status. Perhaps I could tune the times in the script (`mils 1`, `mils 10`, `mils 100`) to get more failures on the Zero too, just as I tuned them for the RPi 3. But at the moment I have no Zero at hand.
Ok, that may hint at some kind of interaction between the two scripts. Can you insert `gpio.stop_script(scriptx)` statements after you detect the script has halted, and run that on your setup?
I don't quite understand what you suggest.
But to prove my point that this is not due to the interaction between two scripts, I simply commented out `scripta` completely, and I am still getting 255s, just as I expected. It seems that these spurious results usually happen when the RPi is busy doing other things, possibly delaying the start of the scripts.
I just ran your original program, unaltered, on a RPI3B+ with no failure.
What version of pigpio are you using and how did you install it?
Not sure. I think I installed that years ago. But I regularly do updates.
How can I see the version of pigpio?
I started the program from the command prompt. I get one or two 1s or 255s, so it is better than starting it from Geany.
If I close Geany completely, then I get one "1" every five tries.
From a terminal you should be able to run `pigs pigpv`, which will return the version.
You should download a recent version, recompile and install. Then run the suite of tests that come in the zip file. If these don't pass there is something wrong/incompatible with your environment.
I started the program and a moment later I started Chromium. I got about ten 255s and one 1.
So if you overload the Raspberry with other programs, the errors multiply considerably.
Try starting my program and overloading the Raspberry with something, such as launching a big program.
pigs pigpv returns 71
I'm not looking into this any further until you can run the pigpio test successfully.
OK, how do I test pigpio?
There are several. Start with this one:
wget https://github.com/joan2937/pigpio/blob/master/x_pigpio.py
./x_pigpio.py
I don't know if this is OK with you, but I ran `apt-get purge pigpio` and `apt-get install pigpio`, and I again have version 1.71.
I killed pigpiod, restarted pigpiod, ran the program again, and just a second after starting it I also started Chromium to overload the Raspberry. I got 5-6 1s.
Download and install latest version
wget https://github.com/joan2937/pigpio/archive/master.zip
unzip master.zip
cd pigpio-master
make
sudo make install
If the Python part of the install fails it may be because you need the setup tools.
sudo apt install python-setuptools python3-setuptools
Run a test from the same directory:
sudo pigpiod
./x_pigpio.py
I did what you said, it went smoothly, but somehow it is still version 71 (pigpiod -v)
I killed pigpiod and I restarted it from pigpio-master directory and I also moved my bugger.py into pigpio-master directory and started it from there.
Your path may be picking up the older version. You'll need to resolve that somehow.
Hmmm, `sudo pigpiod -v` returns 78, and when I start `sudo ./bugger.py` I still get 1s and 255s.
I also always started pigpiod with sudo, so perhaps I was using ver. 78 all along.
(you actually cannot start pigpiod without sudo)
And the result of pigpio test?
You mean the test of running my program?
With `sudo ./bugger.py` and immediately starting Chromium to overload the Pi, I get a bunch of 1s and 255s.
Without overloading the Pi I get 0s only.
No. Run the x_pigpio.py test from the downloaded files.
PASS PASS PASS... everything PASS five screens of PASS-es overall
With chromium running?
I was constantly opening and closing Chromium to keep RPi busy and it was still all PASS.
OK I have to go for today.
Maybe it would be helpful to repeat that I get extremely rare 1s and 255s if Pi is doing nothing else and program is started from prompt. If I overload Pi with opening and closing Chromium I get bunches of 1s and 255s.
Ok, that's good. So can you demonstrate the issue without having to run a desktop environment and a Chromium browser? I have little inclination to investigate otherwise.
Could you just confirm that you see the error under the conditions described? So we make sure that it isn't just some astronomical odds that I possess corrupted RPis?
I'd rather not try to duplicate your conditions as they don't seem very conventional. I did run your script in a loop overnight on RPI0W along with another program that attempted to thrash the network somewhat in hopes of putting more load on the single core host. The loop is still running without error.
That said, you may have found some corner case causing the pigpio script to fail. Given that, I would recommend finding another way to achieve your goals. As I stated in my first comment, I recommend you to use threads on Python and share the single pigpio instance.
A more conventional situation, and my original one, is running Flask as a web server and controlling an LCD display via pigpio scripts -- that is, essentially controlling the LCD display over the web. In that case the RPi has a decent workload apart from that of the scripts, and thus the above-mentioned conditions appear naturally.
Of course these conditions would be even much more difficult for you to replicate, which is why I wrote this simple test program. I could perhaps mimic workload by including some heavy calculation to my test program, but I guess this would not satisfy your strict testing conditions.
Whether my original situation is some corner case is of course strongly arguable.
My recommendation stands even if you convince me that your use case is legitimate - I didn't mean to imply it wasn't.
Well, as I already pointed out, I found a workaround. I am signalling through parameters - messy, but it works and I haven't had a single error since.
But someone might come into my situation in future - losing months (not continuously of course) to figure out what the hell is wrong. On the other hand, I must give you some credit about the "corner case" as I don't see a lot of programmers using serious scripting as it requires Assembly knowledge - which I fortunately learned on ZX Spectrum in 1980s.
You may want to read through the findings of issue #357 . I think your situation* is similar. There is no plan to rewrite the underlying code to address this situation when a single instance Python interface resolves it. This needs to be documented, however, and that's on me.
* Python is one thread, each pigpio script you invoke is another thread.
This issue exposes a race between the OS scheduling a thread ('the script') to run and the network latency in polling for the script's status. Because the script's run state is very short-lived, it is possible to never catch it in the running state by polling for the status.
Furthermore, polling for a script's status in a tight loop is a waste of network bandwidth. Therefore, I believe a better approach is needed in such situations.
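The race can be illustrated with a toy model in plain Python. This is not pigpio's implementation: `ToyScript` and its delays are invented for illustration, and only the two status values mirror pigpio's constants. The sketch shows how a caller that waits for "halted" can return before the script thread has even been scheduled.

```python
import threading
import time

PI_SCRIPT_HALTED = 1   # same numeric values as pigpio's status constants
PI_SCRIPT_RUNNING = 2


class ToyScript:
    """Toy stand-in for a pigpio script; not pigpio's actual implementation."""

    def __init__(self, start_delay, run_time):
        self.status = PI_SCRIPT_HALTED
        self.finished = False
        self._start_delay = start_delay
        self._run_time = run_time
        self._thread = None

    def run(self):
        self.finished = False
        self._thread = threading.Thread(target=self._body)
        self._thread.start()

    def _body(self):
        time.sleep(self._start_delay)   # models latency before the thread runs
        self.status = PI_SCRIPT_RUNNING
        time.sleep(self._run_time)      # models the script's actual work
        self.finished = True
        self.status = PI_SCRIPT_HALTED

    def join(self):
        self._thread.join()


def wait_until_halted(script, poll=0.001):
    """Naive wait: returns as soon as the status reads HALTED."""
    while script.status != PI_SCRIPT_HALTED:
        time.sleep(poll)


def naive_wait_is_fooled(start_delay=0.05, run_time=0.05):
    """Return True if the naive wait returned before the work was done."""
    script = ToyScript(start_delay, run_time)
    script.run()
    wait_until_halted(script)       # status is still HALTED, so this returns at once
    fooled = not script.finished
    script.join()
    return fooled
```

In this model the status is a poor completion signal whenever the start delay exceeds the time the caller needs to issue its first status query, which matches the behaviour reported under load.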
As workarounds I would offer several solutions: 1) Use a parameter in the script to flag the fact that the script has run to completion. 2) Trigger an event at the completion of the script.
I prefer the use of an event which can be monitored asynchronously without wasting network bandwidth. I'm considering building a `PI_SCRIPT_HALTED_EVENT` into a future release of pigpio to make all of this more user friendly.
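The event-based workaround might be sketched as follows with today's Python interface. `run_and_wait_for_event` is a hypothetical helper, and it assumes the stored script triggers event `event_id` as its last step (for instance with an `evt` command, if your pigpio version accepts that command inside scripts). The callback is registered before the script is started so that a fast script cannot fire the event before anyone is listening.

```python
import threading


def run_and_wait_for_event(pi, script_id, params, event_id, timeout=1.0):
    """Run a stored script and wait for the event it triggers when done.

    Hypothetical helper. Returns True if the event arrived within
    `timeout` seconds and False otherwise.
    """
    done = threading.Event()
    # pigpio event callbacks receive the event id and a tick value.
    cb = pi.event_callback(event_id, lambda _event, _tick: done.set())
    try:
        pi.run_script(script_id, params)
        return done.wait(timeout)
    finally:
        cb.cancel()   # always remove the callback again
```

Compared with polling `script_status`, the caller sleeps until the daemon delivers the event, so no status queries cross the socket while the script runs.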
Consider this program
Note that `mils 1` at the end of the scripts is just to make sure that the pin change/writing to parameters has taken place.
The idea is that script A sets pin 18 to High, waits 100 ms and sets pin 18 back to Low. I am spying on pin 18 with script B (pins are connected).
Also note, that I am checking the status of the scripts and do not allow to proceed with next script until the current script finishes.
Obviously, the result of spying should be 100% Low (0). Well, in a few percent of cases I get High (1)!
I first noted this problem using much more complex scripts. I found out with an oscilloscope that sometimes it takes several ms after `run_script` before script A actually starts, and during that time the status is still `PI_SCRIPT_HALTED`. So in this time script B starts and (with a little "luck") checks pin 18 right after the first script sets pin 18 to High.
When doing the tests I realised that sometimes I get 255. This is a completely unintended additional proof of my narrative: script B takes several ms to start, in the meantime its status is still `PI_SCRIPT_HALTED`, the program proceeds, and sees that the parameter has not (yet) been changed.
I used a Raspberry Pi 3.