Closed LexiconCode closed 1 year ago
@Danesprite
It works as expected if natspeak.exe
is killed first. The error happens if there two "natspeak.exe" running!
PS C:\Windows\system32> taskkill /IM natspeak.exe /f /s localhost
SUCCESS: The process "natspeak.exe" with PID 16108 has been terminated.
SUCCESS: The process "natspeak.exe" with PID 8744 has been terminated.
Thanks for opening this issue.
I don't understand why there would be two natspeak.exe processes running. Perhaps this type of error is why a call to Py_Finalize
was omitted from CDgnAppSupport::UnRegister
originally. Have you tried reverting #129, rebuilding natlink and testing again?
So testing with out https://github.com/dictation-toolbox/natlink/pull/129
On second run the loader never gets past Starting natlink loader from config file: "C:\Users\MainLaptop\.natlink\natlink.ini"
to load _test_hud
How ever there is only one natspeak.exe
process.
On second run Natlink in debug mode skipping unchanged loaded module: _test_hud
The process "natspeak.exe" does not end after closing dns even waiting 3 minutes. Even though there's no error I think this is somehow the root cause. Somehow Python is left in a partial initialized state between DNS restarts. Possibly the loader is still initialized from the last start and therefore skipping unchanged loaded module: _test_hud
Another observation is on second load natlink loads way too fast compared to if the process is killed first.
Natlink Version: 5.3.4
Natlink pyd path: C:\Program Files (x86)\Natlink\site-packages\natlink\_natlink_core13.pyd
Use DebugView to debug natlink problems.
https://docs.microsoft.com/en-us/sysinternals/downloads/debugview
Python Version: 3.10.7 (tags/v3.10.7:6cc6b13, Sep 5 2022, 13:51:36) [MSC v.1933 32 bit (Intel)]
Natlink is loaded...
set log level to: DEBUG
Starting natlink loader from config file:
"C:\Users\MainLaptop\.natlink\natlink.ini"
get_user_language, return userLanguage: "enx", (long language: "US English")
set_user_language, user: "MainLaptop", profile: "C:\ProgramData\Nuance\NaturallySpeaking13\Users\MainLaptop\current", language: "enx"
triggering load/reload process
skipping unchanged loaded module: _test_hud
Well that is strange. Yes, it appears that the process is being reused and the interpreter ends up in an odd state the second time round.
Does this behaviour only occur when multiprocessing
is used? And does your spawned process exit when DNS is shutdown? If it is specific to multiprocessing
, then I think you should probably use threads and subprocess
instead.
Yes, the sub process does disappear when DNS closes. @Danesprite
I can replicate the issue by loading an empty file so this is unrelated to multiprocessing.
All right then. I am not able to replicate these errors since I can't install natlinkcore on Python 3.8. If they are a major problem for you, the call to Py_Finalize
could be removed; the behaviour that occurs without it seems more acceptable.
@Danesprite I don't have time to compile something for 3.8 at the moment. However, finalize just made the issue stand out, not including it will only ask for future issues.
The root cause really needs to be solved as this is going to continue showing up in very weird issues and they're always going to be hard to diagnose when natlink is released abroad. Which DNS version are you using?
@Danesprite It happens on python 3.8 as well for me.
py -3.8 -m pip install dtactions-1.5.6-py3-none-any.whl
dtactions-1.5.6-py3-none-any.zip
as admin py -3.8 -m pip install natlinkcore-5.3.5-py3-none-any.whl
natlinkcore-5.3.5-py3-none-any.zip
install natlink5.3.5-py3.8-32-setup.zip
natlink5.3.5-py3.8-32-setup.zip
Yes, I suppose this could cause further problems. I assume that this behaviour does not occur on Natlink version 4.2, before the recent changes to the C++ code. If that is so, and if I am able to reproduce the above errors, git-bisect
may be used to track down the root cause. I could give this try.
To answer your question, I am using DNS version 15.61.200.010 (DPI).
Thank you for the zip files. I had compiled Natlink for 3.8, but got an error when the installer tried to install natlinkcore via pip.
I am unable to reproduce the above errors. Natlink behaves normally using the files provided. For the test I used a Natlink user directory containing only the modules in test.zip. As you mention above, the executable path in the _test_hud.py
file had to be changed.
I am unable to reproduce the above errors. Natlink behaves normally using the files provided. For the test I used a Natlink user directory containing only the modules in test.zip. As you mention above, the executable path in the
_test_hud.py
file had to be changed.
I'm using DNS 13 on windows 10.
When I get a chance I will test with Windows 11 and DNS 15
Okay then. Unfortunately, I can't test with DNS 13. I don't recall encountering this behaviour when I used to use that version. That was many years ago, however.
When I get a chance I will test with Windows 11 and DNS 15
I cannot replicate on dns 15
I cannot replicate on dns 15
Hmm. What about with Natlink 4.2?
@Danesprite no, I can't reproduce on Python 2.7
Is that with DNS 13 or 15?
@Danesprite DNS 13
Okay. It seems, then, that this is a regression in Natlink, or in some other part of the system. You could try using git-bisect
on the natlink/natlinkcore repository to track it down.
Hey, since the update to natlink 5.4.4 I have the same issue.
DNS 15.3 Python 3.10.0 Windows 10
I can no longer say exactly which version I had previously installed. But it was one with Python 3.8 support. With this older version, DNS could be closed normally.
@tripfish
Can you replicate this with just a dragonfly script? If you're using caster, you can modify the bat file to dragonfly sample script instead of caster.py or *.py
I don't use caster. Only Dragonfly with Breathe. However, I noticed that I am loading an XMLRPCServer (SimpleThreadXMLRPCServer) with a file. As soon as I stop loading this file with NatLink (_ removed), Dragon can exit normally.
Do you think you could upload the relevant files to reproduce?
Have any impact on the issue?
Here is the file.
I haven't tried the fix yet. I can try in the next few days. _xmlrpc-test-server.zip
apply_threading_fix() doesn't change anything. In this file, dragonfly is not loaded at all. But even if I load it and use apply_threading_fix(), it has no effect. However, in this file I use threading to create a background thread. Maybe that's the cause?
I believe he needs to be a Damon thread
Yes that's it. If I set daemon = True, I can exit DNS normally again.
But why did this work in an older natlink version without daemon?
Using Natlink I've had to use Damon threads regardless of python. 2/3. Otherwise, it's not always obvious when the process freezes because the main process is in use.
The Natlink process does not close when Dragon is closed. Therefore when Dragon starts again the Natlink process is reused. Therefore the Python environment is already initialized and put in a bad state because it was not cleanly exited with the closing of Natlink process.
See traceback without dragonfly imports!
I've also tested this with an empty file _test.py. In this case the natlink loader is still initialized and _test_hud.py` was not loaded. See https://github.com/dictation-toolbox/natlink/issues/141#issuecomment-1272667327
Minimal example of the issue test.zip To reproduce, I've included files that could be loaded by dragon.
Edit
get context
to yourwpython
exe in_test_hud.py
Launch dragon with the files in the user dir for dragonfly
Close dragon through the tray icon
Relaunch dragon and see error
Note: If the process is killed
taskkill /IM natspeak.exe /f /s localhost
and DNS is restarted everything behaves as expected.I'm using DNS 13 on windows 10.
Without dragonfly imports in test code.
Danesprite "It looks like these errors can occur because of very late finalization, after things like KeyError and len have been deallocated. This doesn't explain why they occur after start-up.
Another explanation is that some C/C++ code is calling a Python API function and ignoring a return value of NULL, which indicates an exception. Later on, some other C/C++ code calls PyErr_Occurred and produces a nonsensical traceback."