deeplearning4j / deeplearning4j-examples

Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)
http://deeplearning4j.konduit.ai

A3CCartpole 'NSInternalInconsistencyException', reason: 'nextEventMatchingMask should only be called from the Main Thread!' #1022

Open mihaita-tinta opened 3 years ago

mihaita-tinta commented 3 years ago

'NSInternalInconsistencyException', reason: 'nextEventMatchingMask should only be called from the Main Thread!'

When running the A3CCartpole.main() example (master branch), the model trains for some time, but the run ends with the error below when it tries to play the game with render = true:

12:11:44.885 [main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [CpuBackend] backend
12:11:46.012 [main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 1
12:11:46.017 [main] WARN org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - *********************************** CPU Feature Check Warning ***********************************
12:11:46.017 [main] WARN org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - Warning: Initializing ND4J with Generic x86 binary on a CPU with AVX/AVX2 support
12:11:46.017 [main] WARN org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - Using ND4J with AVX/AVX2 will improve performance. See deeplearning4j.org/cpu for more details
12:11:46.017 [main] WARN org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - Or set environment variable ND4J_IGNORE_AVX=true to suppress this warning
12:11:46.017 [main] WARN org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - *************************************************************************************************
12:11:46.467 [main] INFO org.nd4j.nativeblas.Nd4jBlas - Number of threads used for OpenMP BLAS: 4
12:11:46.739 [main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CPU]; OS: [Mac OS X]
12:11:46.739 [main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [8]; Memory: [4.0GB];
12:11:46.739 [main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [OPENBLAS]
12:11:46.915 [main] INFO org.deeplearning4j.nn.multilayer.MultiLayerNetwork - Starting MultiLayerNetwork with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
12:11:47.035 [main] INFO org.deeplearning4j.rl4j.learning.async.AsyncLearning - AsyncLearning training starting.
12:11:47.064 [Thread-2] INFO org.deeplearning4j.rl4j.learning.async.AsyncThread - ThreadNum-0 Started!
12:11:47.072 [Thread-3] INFO org.deeplearning4j.rl4j.learning.async.AsyncThread - ThreadNum-1 Started!
.................
2021-03-20 12:20:28.369 java[3857:5410057] ApplePersistenceIgnoreState: Existing state will not be touched. New state will be written to (null)
2021-03-20 12:20:28.375 java[3857:5410057] WARNING: NSWindow drag regions should only be invalidated on the Main Thread! This will throw an exception in the future. Called from (
    0   AppKit                              0x00007fff2bd31629 -[NSWindow(NSWindow_Theme) _postWindowNeedsToResetDragMarginsUnlessPostingDisabled] + 371
    1   AppKit                              0x00007fff2bd19052 -[NSWindow _initContent:styleMask:backing:defer:contentView:] + 1416
    2   AppKit                              0x00007fff2bd18ac3 -[NSWindow initWithContentRect:styleMask:backing:defer:] + 42
    3   _ctypes.cpython-37m-darwin.so       0x00000001476dce97 ffi_call_unix64 + 79
    4   ???                                 0x000070000c0cf0d0 0x0 + 123145504485584
)
2021-03-20 12:20:28.399 java[3857:5410050] *** Assertion failure in void assertRunningOnAppKitThread(void)(), /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/ViewBridge/ViewBridge-467/ViewBridgeUtilities.m:900
2021-03-20 12:20:28.432 java[3857:5410057] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'nextEventMatchingMask should only be called from the Main Thread!'
*** First throw call stack:
(
    0   CoreFoundation                      0x00007fff2eb0eb57 __exceptionPreprocess + 250
    1   libobjc.A.dylib                     0x00007fff6798a5bf objc_exception_throw + 48
    2   AppKit                              0x00007fff2bd02375 +[NSEvent _discardTrackingAndCursorEventsIfNeeded] + 0
    3   _ctypes.cpython-37m-darwin.so       0x00000001476dce97 ffi_call_unix64 + 79
    4   ???                                 0x000070000c0d2d00 0x0 + 123145504500992
)
libc++abi.dylib: terminating with uncaught exception of type NSException

From what I can see, the exception comes from the step(A action) method of GymEnv<OBSERVATION extends Encodable, A, AS extends ActionSpace<A>> implements MDP<OBSERVATION, A, AS>.



Contributing

I can try to help from a Java code side.

saudet commented 3 years ago

That looks like a problem with the display on Mac. Rendering usually doesn't work there unless it runs on the main thread. You can work around that by setting the "render" flag of GymEnv to false.
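
As a rough illustration of that workaround, here is a minimal sketch that turns rendering off automatically when the JVM reports macOS. The RenderGuard class and its methods are hypothetical helpers, not part of RL4J. The sketch also assumes the resulting flag is what gets passed as the render argument when the example builds its GymEnv, which may differ between RL4J versions.

```java
import java.util.Locale;

// Hypothetical helper (not part of RL4J): decides whether rendering should be requested.
final class RenderGuard {
    private RenderGuard() {}

    /** Returns true if the OS name reported by the JVM looks like macOS. */
    static boolean isMac(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).contains("mac");
    }

    /** Keeps rendering only when it was requested and the OS is not macOS. */
    static boolean shouldRender(String osName, boolean requested) {
        return requested && !isMac(osName);
    }

    public static void main(String[] args) {
        // The log above shows the JVM reporting "Mac OS X" as the OS name.
        boolean render = shouldRender(System.getProperty("os.name"), true);
        System.out.println("render = " + render);
        // Assumption: this value would then be used as the render argument of GymEnv
        // in the example, instead of the hard-coded true.
    }
}
```

With this guard the trained policy can still be played on a Mac, just without a window. On other platforms the requested value is passed through unchanged.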