Closed nj-vs-vh closed 3 years ago
I'm close to abandoning any hope to figure out what's happening by myself: I have just changed log level in my extension from INDIGO_LOG_INFO
to INDIGO_LOG_TRACE
and error disappeared!
Curent output:
23:47:23.999559 Application: INDIGO Bus: property change request
23:47:23.999639 Application: 'CCD Imager Simulator'.'CONNECTION' SWITCH UNDEFINED Idle 2.0 0 UNDEFINED {
23:47:23.999668 Application: 'CONNECTED' = Off
23:47:23.999687 Application: 'DISCONNECTED' = On
23:47:23.999708 Application: }
23:47:23.999729 Application: INDIGO Bus: Change request - Device 'CCD Imager Simulator' token 0x0, Proprerty 'CONNECTION' token 0x0
23:47:23.999754 Application: INDIGO Bus: property update
23:47:23.999776 Application: 'CCD Imager Simulator'.'CONNECTION' SWITCH rw Busy 2.0 0 OneOfMany {
23:47:23.999795 Application: 'CONNECTED' = Off
23:47:23.999813 Application: 'DISCONNECTED' = On
23:47:23.999832 Application: }
23:47:23.999883 Application: indigo_ccd_simulator: 'CCD Imager Simulator' detached
23:47:23.999968 Application: timer #0 (of 2) used for 0s
23:47:24.000059 Application: timer #0 done
23:47:24.000129 Application: INDIGO Bus: property removal
23:47:24.000222 Application: 'CCD Imager Simulator'.'' TEXT ro Ok 2.0 0 {
23:47:24.000253 Application: }
23:47:24.000289 Application: indigo_ccd_simulator: 'CCD Imager Simulator (wheel)' detached
23:47:24.000342 Application: timer #1 done
23:47:24.000508 Application: INDIGO Bus: property removal
23:47:24.000593 Application: 'CCD Imager Simulator (wheel)'.'' TEXT ro Ok 2.0 0 {
23:47:24.000624 Application: }
23:47:24.000654 Application: indigo_ccd_simulator: 'CCD Imager Simulator (focuser)' detached
23:47:24.000698 Application: INDIGO Bus: property removal
23:47:24.000725 Application: 'CCD Imager Simulator (focuser)'.'' TEXT ro Ok 2.0 0 {
23:47:24.000766 Application: }
23:47:24.000793 Application: indigo_ccd_simulator: 'CCD Guider Simulator' detached
23:47:24.000825 Application: INDIGO Bus: property removal
23:47:24.000847 Application: 'CCD Guider Simulator'.'' TEXT ro Ok 2.0 0 {
23:47:24.000866 Application: }
23:47:24.000886 Application: indigo_ccd_simulator: 'CCD Guider Simulator (guider)' detached
23:47:24.000913 Application: INDIGO Bus: property removal
23:47:24.000935 Application: 'CCD Guider Simulator (guider)'.'' TEXT ro Ok 2.0 0 {
23:47:24.000953 Application: }
23:47:24.000974 Application: indigo_ccd_simulator: 'CCD Guider Simulator (AO)' detached
23:47:24.001001 Application: INDIGO Bus: property removal
23:47:24.001022 Application: 'CCD Guider Simulator (AO)'.'' TEXT ro Ok 2.0 0 {
23:47:24.001040 Application: }
23:47:24.001063 Application: indigo_ccd_simulator: 'DSLR Simulator' detached
23:47:24.001095 Application: INDIGO Bus: property removal
23:47:24.001116 Application: 'DSLR Simulator'.'' TEXT ro Ok 2.0 0 {
23:47:24.001134 Application: }
23:47:24.005057 Application: Driver indigo_ccd_simulator unloaded
23:47:24.005144 Application: detached from INDIGO bus...
This seems like a race in disconnect... We will investigate. Thank you for reporting.
@nj-vs-vh , were you taking exposure while unloading the driver?
@rumengb no, I've checked the logs, they go
13:20:39.495804 Application: Client upload in 0.006184s
13:20:39.495814 Application: INDIGO Bus: property update
13:20:39.495821 Application: 'CCD Imager Simulator'.'CCD_EXPOSURE' NUMBER rw Ok 2.0 0 {
13:20:39.495825 Application: 'EXPOSURE' = 0
13:20:39.495829 Application: }
13:20:39.495834 Application: timer callback: 0x7f37e7322650 finished
13:20:39.495838 Application: timer #0 done
file written!
Last line is printed from my Python code and right after it I call cleanup function that produces the same output as in the previous message.
I did my best to reproduce it, but I was unable... I loaded and unloaded the driver 100 times no issue. I even connected and started exposure, moved focuser and much more, then unloaded the driver. again no crash. Can you share more details how to reproduce it? Now I see that it is your application crashing... Maybe there is something wrong in your code, maybe python to C bindings?
@rumengb I think I have finally tracked down the problem, and it indeed has something to do with Python. Sometimes with no apparent reason C string stored inside Python extension module becomes corrupted, and it must be messing up INDIGO's execution. Right before the crash I get the following trace log:
23:38:24.788958 Application: INDIGO Bus: property removal
23:38:24.789005 Application: 'CCD Imager Simulator'.'' TEXT ro Ok 2.0 0 {
23:38:24.789026 Application: }
23:38:24.789062 Application: indigo_ccd_simulator: 'CCD Imager Simulator (wheel)' detached
23:38:24.789101 Application: INDIGO Bus: property removal
23:38:24.789147 Application: '�[k)�'.'CCD_INFO' NUMBER ro Ok 2.0 0 {
23:38:24.789167 Application: }
23:38:24.789206 Application: INDIGO Bus: property removal
Segmentation fault (core dumped)
I'm sorry I have wasted your time! When I figure it out I hope to provide a fully functional Python interface for your amazing framework though!
Wow, Are you doing python bindings for indigo? Many people asked for this but we are too busy to do it. We have many things in our TODO list. Thank you!
On Mon, Oct 12, 2020 at 6:47 PM Igor Vaiman notifications@github.com wrote:
I think I have finally tracked down the problem, and it indeed has something to do with Python. Sometimes with no apparent reason C string stored inside Python extension module becomes corrupted, and it must be messing up INDIGO's execution. Right before the crash I get the following trace log:
23:38:24.788958 Application: INDIGO Bus: property removal
23:38:24.789005 Application: 'CCD Imager Simulator'.'' TEXT ro Ok 2.0 0 {
23:38:24.789026 Application: }
23:38:24.789062 Application: indigo_ccd_simulator: 'CCD Imager Simulator (wheel)' detached
23:38:24.789101 Application: INDIGO Bus: property removal
23:38:24.789147 Application: '�[k)�'.'CCD_INFO' NUMBER ro Ok 2.0 0 {
23:38:24.789167 Application: }
23:38:24.789206 Application: INDIGO Bus: property removal
Segmentation fault (core dumped)
I'm sorry I have wasted your time! When I figure it out I hope to provide a fully functional Python interface for your amazing framework though!
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/indigo-astronomy/indigo/issues/377#issuecomment-707199465, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE5EZBOK4E5HL5OEJFDYAOTSKMQJFANCNFSM4SKEIZ4Q .
@rumengb I'm not sure what you mean by 'bindings', but my current goal is an Indigo client wrapped into a Python module. It's intended use is to communicate with Indigo devices connected to the same machine from Python code. I'm currently trying to satisfy my own needs with this project, but I will definitely check your TODO list later!
I know where the issue comes from. Please send me an email to rumen@skyarchive.org to discuss this. I will update the documentation to address this pitfall... There is a delete request that deletes all properties which is issued when device is attached.
Reopening: There is a race between indigo_device_disconnect() and indigo_remove_driver(). A workaround is small a delay between them, but this is not a solution...
@rumengb Thank you very much! I'll add this as a temporary fix.
The bug in INDIGO is fixed in this commit d864605a315fd4242d61117311a6db15a498c727, but there is still an issue in your code. You should always test the result of indigo_remove_driver() and use something like this:
while (indigo_remove_driver(driver) != INDIGO_OK) indigo_usleep(ONE_SECOND_DELAY);
So if it fails (in this case because the asynchronous "disconnect" operation is still in progress), you have to repeat later.
No it is not fixed :(. It still happens there is a race between:
CONNECTION_PROPERTY->state = INDIGO_OK_STATE; and indigo_ccd_change_property(device, NULL, CONNECTION_PROPERTY);
after CONNECTION_PROPERTY->state = INDIGO_OK_STATE here the device is assumed disconnected and if driver shutdown kicks in before indigo_ccd_change_property(device, NULL, CONNECTION_PROPERTY);
it fails...
it is fixed now but your code should be changed with something like:
indigo_device_disconnect(&client, CCD_SIMULATOR); while (INDIGO_OK != indigo_remove_driver(driver)) { indigo_usleep(10000); } indigo_detach_client(&client); indigo_stop();
as disconnect is asynchronous and indigo_remove_driver() fails with INDIGO_BUSY if some device is connected.
I'm writing a Python interface for INDIGO client (based on
dynamic_driver_client
from examples), but I've run into unusual issue.I successfully run my module, attach CCD Simulator, start exposure and process it with Python callback function, but when I run cleanup code (wrapped into Python function, but it doesn't seem to affect anything) I get segfault error. In particular, INDIGO log goes
Weird thing is, sometimes only first two devices detach before error, most of the time three, sometimes four. When I examine core dump with
gdb
I get the following output:When I run
gdb
in VS Code (usingpython my_python_indigo_wrapper.py
as an entry point), Segfault occurs on line 610 inindigo_ccd_simulator.c
, here:device->device_context
whichCONNECTION_PROPERTY
resolves to, shows up as NULLI'm not sure if this issue has something to do with my Python wrapper. On one hand, I can't see any way Python interpreter can influence deep inner INDIGO code. I've encountered Segfault issues due to GIL violation, but there's no Python objects being called from cleanup code! On the other hand, regular dynamic driver client works and my implementation does not!
Soon I will be able to test my code with real camera (ZWO) and will certainly report the results.
I'm running Ubuntu 20.04, Python 3.8, and will be happy to provide further details.