Closed JDoucette closed 1 year ago
This also occurs in text mode.
5 SCREEN 0,0,0:WIDTH 80:CLS
10 I!=I!+.1:J#=J#+.1#:LOCATE 1,1:PRINT I!;J#;(I!-J#):GOTO 10
Simple program that adds 0.1 repeatedly and draws the results. The screen should be updating blazingly fast. But it stalls, and updates randomly.
p.s. Unrelated to the bug: This reminiscent of a very old program I made as a kid where I was amazed that adding 0.1 over and over again would not result in accurate results, and furthermore, you could watch the velocity of the accumulative error, and the acceleration changes as it would jump into positive flow, then negative flow.
Thanks for reporting, fixed with issue 8d1ae6f0. The fix is to call time.sleep(0)
twice. I have no idea why this works or more precisely, why it is necessary.
The issue is still present when using more than one core due to, I think, python's GIL. If you set the affinity to one or two cpu the bug is gone and the screen updates fast without the A$=INKEY$ workaround. But if you leave default (ALL) CPU affinity, the update is "choppy" and have to use A$.
i76700 - 1 CPU, 4 CORES, 8 THREADS Win 11 - pcbasic 2.0.6
I'm sorry @theruler can you please explain what you're doing? What is "affinity"? I do not see this problem anymore on my machine (which is also multi-core) so would need more information to be able to reproduce.
Yes, sure. Tried on three modern multicore PC with windows (2x win10 and 1x win11): If I paste the code on PCBASIC 2.0.6 without the A$=INKEY$ workaround:
5 SCREEN 0,0,0:WIDTH 80:CLS
10 I!=I!+.1:J#=J#+.1#:LOCATE 1,1:PRINT I!;J#;(I!-J#):GOTO 10
I get the same results of jdoucette, the output is slow and stalls updating randomly.
To solve the issue OS side, you have to open Task Manager, "details" tab, find pcbasicw.exe process, click the right mouse button on it and select "process affinity". There you can allocate HW resurces (logic CPUs) for the process. If you set 1 CPU only, the problem is gone. On some PC you can set 2 logic CPU and still solve the issue and have a minor speed increase. 3 or more CPUs makes pcbasic "lag", I think due to Python's Global Interpreter Lock (GIL) feature.
Hope this was of help to you.
Regards Stefano
I did some timing tests with and without A$=INKEY$ and using single or multi cpu affinity. i5-1135G7@2.4GHz Windows 10 PC-BASIC 2.0.6
10 CLS:T=TIMER
20 FOR X= 1 TO 10000
30 I!=I!+.1:J#=J#+.1#:LOCATE 1,1:PRINT I!;J#;(I!-J#)
40 NEXT
50 ? TIMER-T
multi: 18 sec (choppy output) single: 35 sec (smooth output)
10 CLS:T=TIMER
20 FOR X= 1 TO 10000
30 I!=I!+.1:J#=J#+.1#:LOCATE 1,1:PRINT I!;J#;(I!-J#)
35 A$=INKEY$
40 NEXT
50 ? TIMER-T
multi: 87 sec (smooth output) single: 122 sec (smooth output)
Thanks @theruler for the metrics.
So this is actually a slightly different issue as the original issue occurred on single-core as well, and led to a complete block of screen updates. That has been resolved, however what we're seeing here is choppy updates on multi-core machines, at least under Windows.
You are probably right that this is related to the GIL. It looks similar to this: https://www.rfk.id.au/blog/entry/a-gil-adventure-threading2/ . The analysis in that case is that when the threads execute on different cores, they race to acquire the GIL and this leads to slower performance on multiple cores than on single core. In-depth explanation in David Beazley's slides here http://www.dabeaz.com/python/UnderstandingGIL.pdf
However:
The root cause of the issue I think is that (given the GIL in CPython) the display thread simply has too much CPU work to do to keep up with the engine thread, which unfortunately takes rather big changes to address and is something I'd defer until we have better test coverage.
It may be possible to work around the issue by explicitly setting the affinity from the code - however for some reason os.sched_setaffinity
is only available on Unix so this likely requires linking this thing in kernel32.dll
: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setprocessaffinitymask
Further update
For reference, on Linux, these are "smooth":
python2 ./run-pcbasic.py update.bas
taskset -c 0 python3 ./run-pcbasic.py update.bas
while this is "choppy":
python3 ./run-pcbasic.py update.bas
Here, update.bas
is the two-line program above at https://github.com/robhagemans/pcbasic/issues/166#issuecomment-1225707565
I'm assuming you're running this on Python 3 - if it's the packaged Windows version it is Python 3.
correct.
I am not concerned about speed per se, what I think pc-basic should achieve is the same speed as an old 386, on all system independently. adding A$=INKEY$, surprisingly, make high computational program behave similarly on different machines.
Further tests OS side (windows 10). It's interesting how different behave python 3 and python 2. I've used the same program here Single CPU affinity, with A$=INKEY$ (first half of the chart) and without (second half). the darkest area is Windows KERNEL TIME.
windist - PC-basic 2.0.6 - Python 3: ------- 122 sec ---- 35 sec
pure python PC-basic 2.0.6 - Python 3: (python3 ./run-pcbasic.py update.bas) ------- 167 sec ------ 12 sec Here is an interesting behaviour: the first part shows all kernel time and the execution is very slow, the worst performance actually. In the second part, the one without the A$=INKEY$ workaround, as you noticed the first 7 seconds kernel time is not present, then it kicks in for the rest of the run. The program in the first part is VERY slow, then it speeds up to full speed, exactly when kernel time kicks in.
PC-basic 1.2.15 - Python 2: --------- 55 sec ------- 12 sec
The kernel is that part of the operating system that acts on behalf of all the applications and services running on the system. The kernel is the only part that of the operating system that has access to the physical resources of the computer. Access to the kernel is handled through very specific, highly privileged, API calls.
The most common culprit of run-away kernel time is a poorly written device driver. Another could be a failing hardware device that is causing a flood of hardware interrupts - hardware interrupts are handled only by the kernel.
In a system that is not really running any applications, the kernel continues to perform its functions, which includes all sorts of 'house-keeping' tasks to ensure the system is running correctly. Therefore, if no user processes are running, kernel time will be a higher percentage of overall usage. But as the number of user processes kick in, you will see the percentage of kernel versus user time go down.
Lastly, if you have to ask what kernel time is, you are most likely not the person to 'debug' it. There are very few people who work in kernel development and troubleshooting.
You've fixed it! with your changes single and multi cpu speed are almost identical and the CPU load is back to normal. multi 24.39 sec single 23.75 sec
The output by the way features pseudo-random accelerations. You can tell looking at the counter running.
I think it is worth paste your code comment in eventcycle.py:
# what I think is happening here is that the sdl2 interface thread,
# in its loop to process a single queue item, calls C library functions
# which do not need the GIL. so it releases it. this allows the engine thread to pick up
# and produce more work for the interface thread. sleep(0) does not wait but it does release
# the GIL back, so we need a few sleep(0) calls to allow the interface to get through
# individual items.
# this would not be a problem if the interface thread did not need the GIL at all
# (perhaps with numba, nuitka, cython, pypy or jython)
# but note that the video queue is a Python object so may require the GIL
# or if it held the GIL for a full cycle
# wait to for the queue to drain if it excceds a threshold value
# this allows the interface to catch up with video updates
I've tried to run the code deleting the lib folder where SDL2.DLL and SDL2_gfx.DLL are, and it run anyways. Does this mean that SDL is not loaded or pcbasic is just using SDL in other PATH folder?
here are the warnings:
c:\Intel\pcbasic-master>python run-pcbasic.py
[09:49:05.0646] WARNING: Failed to load library sdl2: could not find any library for SDL2 (looked in paths: ('C:\Intel\pcbasic-master\pcbasic\lib',))
[09:49:05.0646] WARNING: Failed to load library sdl2_gfx: could not find any library for SDL2_gfx (looked in paths: ('C:\Intel\pcbasic-master\pcbasic\lib',))
[09:49:05.0646] INFO: Could not initialise video plugin sdl2
: Module sdl2
not found
[09:49:05.0646] WARNING: The pygame
interface is deprecated, please use the graphical
interface instead.
[09:49:05.0646] INFO: Could not initialise video plugin pygame
: Module pygame
not found
[09:49:05.0662] WARNING: Failed to load library sdl2: could not find any library for SDL2 (looked in paths: ('C:\Intel\pcbasic-master\pcbasic\lib',))
[09:49:05.0662] WARNING: Failed to load library sdl2_gfx: could not find any library for SDL2_gfx (looked in paths: ('C:\Intel\pcbasic-master\pcbasic\lib',))
[09:49:05.0662] INFO: Could not initialise audio plugin ansi
: Module sdl2
not found
By the looks of that log, PC-BASIC would have ran in text mode without using SDL2. You'd notice though, it looks rather different, running in the command prompt window rather than opening a window of its own.
@robhagemans do you have any eta of a new version that would include this fix?
Hi, I'm afraid I can't give an ETA, this is purely a project I work on in spare time.
I've been meaning to spend a bit of time on it for a while but have been busy with other projects and of course the day job
On Mon, 13 Mar 2023, 14:57 Stefano Crespi, @.***> wrote:
@robhagemans https://github.com/robhagemans do you have any eta of a new version that would include this fix?
— Reply to this email directly, view it on GitHub https://github.com/robhagemans/pcbasic/issues/166#issuecomment-1466308587, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB25RESINSAATY2OUBHOOOTW34YVDANCNFSM5H7IUQWA . You are receiving this because you were mentioned.Message ID: @.***>
Bug report
See: https://www.youtube.com/watch?v=9J9DtrcoDXc
Problem
Steps
Program
Notes
PC-BASIC version: 2.0.4 Operating system version: Windows 10