coin3d / pivy

python bindings to coin3d
ISC License

Sensor callback (threading) still crashes with 64-bit Python3 on 64-bit Windows despite fix #8

Closed InventorMentor closed 7 years ago

InventorMentor commented 7 years ago

Issue: Sensor callback still crashes for me using 64-bit Python3 on 64-bit Microsoft Windows

Scenario: despite using the fixes in branch "threads", the sensor callback problem still occurs using the following two configurations:

The test cases used were the following two Inventor Mentor examples:

Request: Could you please re-test the sensor callback threading fix also on a 64-bit Python3 configuration on a 64-bit Windows system and confirm that the current fix is working for you.

Many thanks in advance! Best wishes, Peter

looooo commented 7 years ago

I have no windows available right now. Is there any back-trace available? (gdb)

InventorMentor commented 7 years ago

Hi, many thanks for your swift reply. Unfortunately, I don't have access to a 64-bit Linux system running 64-bit Python 3.4 or newer. Did you have any chance to test the sensor threading fix on 64-bit Python3 on Linux?

With 64-bit Python 3.6.1 on 64-bit Windows 10, I get the following error message after attaching the MSVC14 debugger to a running instance of 12.3.AlarmSensor.py once the crash has occurred (this happens 12 seconds after starting the example script):

"Unhandled exception at 0x00007FFD7D8CCE50 (_coin.cp36-win_amd64.pyd) in python.exe: 0xC0000005: Access violation reading location 0x0000000000000000."

Top of callstack (excerpt):
coin4.dll!SoSensorManager::processTimerQueue() Line 463    C++
_coin.cp36-win_amd64.pyd!00007ffd7d887019()    Unknown

Auto variables:

Depicted Coin 4.0.0a source code (marked line 463 below):

 /*!
  Trigger all the timers which has expired.
 */
void
SoSensorManager::processTimerQueue(void)
{
  SoSensorManagerP::assertAlive(PRIVATE(this));

  if (PRIVATE(this)->processingtimerqueue || PRIVATE(this)->timerqueue.getLength() == 0)
    return;

#if DEBUG_TIMER_SENSORHANDLING // debug
  SoDebugError::postInfo("SoSensorManager::processTimerQueue",
                         "start: %d elements", PRIVATE(this)->timerqueue.getLength());
#endif // debug

  assert(PRIVATE(this)->reschedulelist.getLength() == 0);
  PRIVATE(this)->processingtimerqueue = TRUE;

  LOCK_TIMER_QUEUE(this);

  SbTime currenttime = SbTime::getTimeOfDay();
  while (PRIVATE(this)->timerqueue.getLength() > 0 &&       // <-- this is line 463  where the crash occurs when the timer is triggered
         PRIVATE(this)->timerqueue[0]->getTriggerTime() <= currenttime) {
#if DEBUG_TIMER_SENSORHANDLING // debug
    SoDebugError::postInfo("SoSensorManager::processTimerQueue",
                           "process element with triggertime %s",
                           PRIVATE(this)->timerqueue[0]->getTriggerTime().format().getString());
#endif // debug
    SoSensor * sensor = PRIVATE(this)->timerqueue[0];
    PRIVATE(this)->timerqueue.remove(0);
    UNLOCK_TIMER_QUEUE(this);
    sensor->trigger();
    LOCK_TIMER_QUEUE(this);
  }

  UNLOCK_TIMER_QUEUE(this);

#if DEBUG_TIMER_SENSORHANDLING // debug
  SoDebugError::postInfo("SoSensorManager::processTimerQueue",
                         "end, before merge: %d elements",
                         PRIVATE(this)->timerqueue.getLength());
#endif // debug

  LOCK_RESCHEDULE_LIST(this);
  int n = PRIVATE(this)->reschedulelist.getLength();
  if (n) {
    SbTime time = SbTime::getTimeOfDay();
    for (int i = 0; i < n; i++) {
      PRIVATE(this)->reschedulelist[i]->reschedule(time);
    }
    PRIVATE(this)->reschedulelist.truncate(0);
  }
  UNLOCK_RESCHEDULE_LIST(this);

  PRIVATE(this)->processingtimerqueue = FALSE;

#if DEBUG_TIMER_SENSORHANDLING // debug
  SoDebugError::postInfo("SoSensorManager::processTimerQueue",
                         "end, after merge: %d elements",
                         PRIVATE(this)->timerqueue.getLength());
#endif // debug
}

looooo commented 7 years ago

I can confirm the crash with example 12.3 with linux64 and the following libraries:

the crash doesn't occur with:

looooo commented 7 years ago

Request: Could you please re-test the sensor callback threading fix also on a 64-bit Python3 configuration on a 64-bit Windows system and confirm that the current fix is working for you.

Are you talking about these two commits? https://github.com/looooo/pivy/commit/e35d0d99052feb7d85b8f25a825b3cbb8d9e6a36 https://github.com/looooo/pivy/commit/f2883bb4fa305963eb082f1a34c47ac4747569b7

looooo commented 7 years ago

Further testing with FreeCAD results in the same problem, both with freecad-py2 (current master development version) and with FreeCAD on Python 3.

InventorMentor commented 7 years ago

Many thanks for your confirmation on linux64 with Python 3.5 (conda). Here, I'm referring to commit 4e0fb106442ee4459f44afa6b2e2c67fa153d364

However, I can also confirm that 12.3.AlarmSensor.py works fine without the threading fix from branch "threads" referenced above on a 32-bit Debian 7 system (3.2.86-1 i686 GNU/Linux), using only the following default Coin/Pivy packages installed from the Debian repositories:

My issues are on both 64-Bit Windows 7 and 10, where I've used swigwin-3.0.11 to build the current "threads" branch of Pivy 0.6.0 (commit referenced above) employing

In Pivy's setup.py, as modified in the "threads" branch, I've found that if I leave the "-threads" option in the SWIG_PARAMS list, none of the Inventor Mentor examples works for me. If I remove it from SWIG_PARAMS, all Inventor Mentor examples run fine on 64-bit Windows 7/10 except those in chapter 12 (which employ sensor callbacks). I also found that the file coin_wrap.cpp generated by SWIG 3.0.11 contains no call to PyEval_InitThreads() if the "-threads" option is removed from SWIG_PARAMS. My manual attempts to insert that call in coin_wrap.cpp after module creation / initialization did not fix the sensor examples in chapter 12. The GIL does not seem to be properly initialized. I tried to follow the suggestion in

http://stackoverflow.com/questions/16606872/calling-python-method-from-c-or-c-callback

as given at

http://stackoverflow.com/a/16609899

looooo commented 7 years ago

Thanks for your help debugging this issue.

On Linux I think the problem is related to some unicode-to-string conversions. I am not really sure, but with gdb I get this output:

#2  SWIG_TypeQueryModule (name=0x0, end=0x7fffc1d8d9c0 <swig_module>, start=0x7fffc1d8d9c0 <swig_module>) at pivy/coin_wrap.cpp:625
#3  0x00007fffc1381db3 in SoSensorPythonCB (data=0x7fffc8954318, sensor=0xe34750)
   from /home/lo/conda/envs/freecad/lib/python3.5/site-packages/pivy/_coin.cpython-35m-x86_64-linux-gnu.so
#4  0x00007ffff50b5ef9 in SoSensor::trigger() () from /home/lo/conda/envs/freecad/bin/../lib/libCoin.so
#5  0x00007ffff50b87f1 in SoTimerQueueSensor::trigger() () from /home/lo/conda/envs/freecad/bin/../lib/libCoin.so

name is created with PyBytes_AsString in python3 and PyString_AsString in python2. As this is working with python2 I think the root of the problem could be related to this.
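
The suspected failure mode can be reproduced in isolation. The following is a minimal sketch, not the code from pivy's wrapper: it calls the CPython C API function PyBytes_AsString through ctypes and shows that it only yields a C string for a bytes object. Given a Python 3 str it fails, which in C code means a NULL return, consistent with the name=0x0 seen in frame #2 above. The helper to_c_name and the type name "SoSensor *" are hypothetical and chosen only for illustration.

```python
import ctypes

# PyBytes_AsString is part of the CPython C API; ctypes.pythonapi exposes it.
as_c_string = ctypes.pythonapi.PyBytes_AsString
as_c_string.argtypes = [ctypes.py_object]
as_c_string.restype = ctypes.c_char_p


def to_c_name(type_name):
    """Hypothetical helper: return the char* contents, or None where C code would see NULL."""
    try:
        return as_c_string(type_name)
    except TypeError:
        # At the C level the call returns NULL and sets TypeError;
        # ctypes.pythonapi turns that error into a Python exception.
        return None


from_bytes = to_c_name(b"SoSensor *")                    # bytes: accepted
from_str = to_c_name("SoSensor *")                       # Python 3 str: rejected
from_encoded = to_c_name("SoSensor *".encode("utf-8"))   # encoded first: accepted

print(from_bytes, from_str, from_encoded)
```

Under this assumption, a wrapper that receives a str in Python 3 has to encode it to bytes (or use a unicode-aware API) before handing the name to a type lookup.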

looooo commented 7 years ago

for me this is now working on linux 64 and this master-branch. Maybe you can try it with windows.

InventorMentor commented 7 years ago

Many thanks! Your commit successfully fixes the following three Inventor Mentor examples from chapter 12 for me (using 64-bit Python 3.6.1 on 64-bit Windows 10):

12.1.FieldSensor.py, 12.2.NodeSensor.py, 12.3.AlarmSensor.py

Running the final example from chapter 12, however,

python 12.4.TimerSensor.py bird.iv

still results in an "Unhandled exception at 0x00007FFAFA528283 (ntdll.dll) in python.exe: 0xC0000374: A heap was corrupted (parameters: 0x00007FFAFA57F6B0)."

Auto variables:

Example 12.4 also fails for me on 32-bit Python 2.7.3 (under 32-bit Debian 7, i686, Coin 3.1.3, Pivy 0.5.0) with a similar backtrace:

(gdb) run 12.4.TimerSensor.py bird.iv
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1".
glibc detected /usr/bin/python: free(): invalid pointer: 0x087babc4 ***
======= Backtrace: =========
/lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x6ff41)[0xb7e53f41]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x717a8)[0xb7e557a8]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(cfree+0x6d)[0xb7e588ed]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZdlPv+0x1f)[0xb62e84bf]
[0x0]
======= Memory map: ========
08048000-0828d000 r-xp 00000000 08:01 136900 /usr/bin/python2.7
0828d000-0828e000 r--p 00244000 08:01 136900 /usr/bin/python2.7
0828e000-082e3000 rw-p 00245000 08:01 136900 /usr/bin/python2.7
082e3000-089cd000 rw-p 00000000 00:00 0 [heap]
b0300000-b0321000 rw-p 00000000 00:00 0
b0321000-b0400000 ---p 00000000 00:00 0
b0465000-b05a6000 rw-p 00000000 00:00 0
b05a6000-b0fa6000 rwxp 00000000 00:00 0
b0fa6000-b10e7000 rw-p 00000000 00:00 0
b10e7000-b1167000 rwxp 00000000 00:00 0
b1167000-b1ed9000 r-xp 00000000 08:01 268229 /usr/lib/i386-linux-gnu/dri/swrast_dri.so
...
b7fde000-b7fdf000 rw-p 00002000 08:01 265963 /usr/lib/python2.7/lib-dynload/dl.so
b7fdf000-b7fe1000 rw-p 00000000 00:00 0
b7fe1000-b7fe2000 r-xp 00000000 00:00 0 [vdso]
b7fe2000-b7ffe000 r-xp 00000000 08:01 264670 /lib/i386-linux-gnu/ld-2.13.so
b7ffe000-b7fff000 r--p 0001b000 08:01 264670 /lib/i386-linux-gnu/ld-2.13.so
b7fff000-b8000000 rw-p 0001c000 08:01 264670 /lib/i386-linux-gnu/ld-2.13.so
bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]

Program received signal SIGABRT, Aborted. 0xb7fe1428 in __kernel_vsyscall ()

looooo commented 7 years ago

Just tried example 12.4. There is something wrong with the __imul__ operator inside the callback function.

switching the function from:

def rotatingSensorCallback(myRotation, sensor):
    # Rotate an object...
    currentRotation = myRotation.rotation.getValue()
    currentRotation *= SbRotation(SbVec3f(0,0,1), M_PI/90.0)
    myRotation.rotation.setValue(currentRotation)

to

def rotatingSensorCallback(myRotation, sensor):
    # Rotate an object...
    currentRotation = myRotation.rotation.getValue()
    myRotation.rotation.setValue(currentRotation * SbRotation(SbVec3f(0,0,1), M_PI/90.0))

works for me.

I will look into this. Thanks for reporting.
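
Why the two versions take different paths can be shown with a toy class. This is only a sketch of Python's operator dispatch: Rot is a hypothetical stand-in, not pivy's SbRotation, and the thread does not say what exactly was wrong in the generated wrapper. The point is that `x *= y` calls `__imul__` and rebinds `x` to whatever that method returns, while `x * y` calls `__mul__` and never touches `__imul__`.

```python
class Rot:
    """Toy stand-in for a wrapped value type; NOT pivy's SbRotation."""

    def __init__(self, angle):
        self.angle = angle

    def __mul__(self, other):
        # Binary operator: builds and returns a fresh object.
        return Rot(self.angle + other.angle)

    def __imul__(self, other):
        # In-place operator: mutates self; the name on the left of `*=`
        # is rebound to the object returned here.
        self.angle += other.angle
        return self


current = Rot(1.0)
alias = current

current *= Rot(0.5)                         # dispatched to __imul__
in_place_same_object = current is alias     # still the same object

product = alias * Rot(0.5)                  # dispatched to __mul__
binary_new_object = product is not alias    # a new object was created

print(in_place_same_object, alias.angle, binary_new_object, product.angle)
```

Because the rewritten callback only uses the binary `*`, it sidesteps whatever the wrapper's `__imul__` does with the returned object, which is consistent with the workaround succeeding.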

looooo commented 7 years ago

https://github.com/looooo/pivy/commit/d9c224a366b5294e8ca75d43f4606cb6525bab80 (L25-L28) should fix it.

If you find any other problems with the examples (I am quite sure there are more) it would be nice to create new issues.

InventorMentor commented 7 years ago

Many thanks! I can confirm that all examples in chapter 12 now also work on Windows 10. I'm happy!