# failing
- `ParallelMultilevelMonteCarlo.cpp` fails when run with MPI:
[2024-10-24 14:51:53.112] [info] Balancing load across 0 ranks
terminate called after throwing an instance of 'boost::wrapexcept'
what(): No such node (MLMCMC.Scheduling)
[poisson:429676] Process received signal
[poisson:429676] Signal: Aborted (6)
[poisson:429676] Signal code: (-6)
[poisson:429676] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7bb955242520]
[poisson:429676] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7bb9552969fc]
[poisson:429676] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7bb955242476]
[poisson:429676] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7bb9552287f3]
[poisson:429676] [ 4] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2b9e)[0x7bb9556a2b9e]
[poisson:429676] [ 5] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae20c)[0x7bb9556ae20c]
[poisson:429676] [ 6] [poisson:429675] An error occurred in MPI_Send
[poisson:429675] reported by process [457965569,0]
[poisson:429675] on communicator MPI_COMM_WORLD
[poisson:429675] MPI_ERR_RANK: invalid rank
[poisson:429675] MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[poisson:429675] and potentially your MPI job)
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae277)[0x7bb9556ae277]
[poisson:429676] [ 7] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae4d8)[0x7bb9556ae4d8]
[poisson:429676] [ 8] /home/frizzi/Desktop/muq/install/lib/libmuqSamplingAlgorithms.so(_ZN5boost13property_tree11basic_ptreeINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_St4lessIS7_EE9get_childERKNS0_11string_pathIS7_NS0_13id_translatorIS7_EEEE+0x598)[0x7bb955ed4ee8]
[poisson:429676] [ 9] /home/frizzi/Desktop/muq/install/lib/libmuqSamplingAlgorithms.so(_ZN3muq18SamplingAlgorithms25StaticLoadBalancingMIMCMCC2EN5boost13property_tree11basic_ptreeINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEESA_St4lessISA_EEESt10shared_ptrINS0_32ParallelizableMIComponentFactoryEESE_INS0_18StaticLoadBalancerEESE_IN6parcer12CommunicatorEESE_INS_9Utilities14OTF2TracerBaseEE+0x587)[0x7bb955ff4767]
[poisson:429676] [10] ./ParallelMultilevelMonteCarlo(+0x143c2)[0x5d5f4472f3c2]
[poisson:429676] [11] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7bb955229d90]
[poisson:429676] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7bb955229e40]
[poisson:429676] [13] ./ParallelMultilevelMonteCarlo(+0x13205)[0x5d5f4472e205]
[poisson:429676] End of error message
- `FullParallelMultilevelGaussianSampling` in `Example3_MultilevelGaussian/cpp`:
[2024-10-24 14:56:24.708] [debug] Rank: 0
[2024-10-24 14:56:24.708] [info] Balancing load across 0 ranks
[2024-10-24 14:56:24.708] [debug] Rank: 1
[poisson:430094] An error occurred in MPI_Send
[poisson:430094] reported by process [86048769,0]
[poisson:430094] on communicator MPI_COMM_WORLD
[poisson:430094] MPI_ERR_RANK: invalid rank
[poisson:430094] MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[poisson:430094] and potentially your MPI job)
- `SubsamplingTestMultilevelGaussianSampling` in `Example3_MultilevelGaussian/cpp`:
Running with subsampling 0
*** greedy multillevel chain
Setting up level 0
Setting up level 1
terminate called after throwing an instance of 'boost::wrapexcept'
what(): No such node (MLMCMC.Subsampling_0)
Running with subsampling 0
*** greedy multillevel chain
Setting up level 0
Setting up level 1
terminate called after throwing an instance of 'boost::wrapexcept'
what(): No such node (MLMCMC.Subsampling_0)
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpirun noticed that process rank 1 with PID 0 on node poisson exited on signal 6 (Aborted).
- `FullParallelMultiindexGaussianSampling` in `Example4_MultiindexGaussian/cpp`:
[2024-10-24 15:01:39.446] [debug] Rank: 0
[2024-10-24 15:01:39.446] [debug] Rank: 1
[2024-10-24 15:01:39.446] [info] Balancing load across 0 ranks
[poisson:430492] An error occurred in MPI_Send
[poisson:430492] reported by process [78839809,0]
[poisson:430492] on communicator MPI_COMM_WORLD
[poisson:430492] MPI_ERR_RANK: invalid rank
[poisson:430492] MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[poisson:430492] and potentially your MPI job)
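Three of the four failing runs above share one pattern: the log reports `Balancing load across 0 ranks`, and `MPI_Send` then aborts with `MPI_ERR_RANK: invalid rank`. MPI only accepts a destination rank inside `[0, size)` of the communicator. One plausible reading, which these logs do not confirm, is that some ranks are reserved for coordination, so a run with only two processes leaves none for workers. The sketch below is a hypothetical preflight check in standard C++. The names `kReservedRanks`, `workerRankCount`, `isValidDestination`, and `canBalanceLoad` are illustrative and do not come from MUQ, and the value 2 is an assumption.

```cpp
#include <cassert>

// Assumed number of ranks set aside for coordination.
// Hypothetical value, not taken from the MUQ source.
constexpr int kReservedRanks = 2;

// Number of ranks left for workers once the reserved ranks are subtracted.
// Clamped so it is never negative.
int workerRankCount(int worldSize) {
    assert(worldSize >= 1);  // a communicator holds at least one process
    const int remaining = worldSize - kReservedRanks;
    return remaining > 0 ? remaining : 0;
}

// MPI accepts a destination rank only if it lies in [0, worldSize).
// (MPI_PROC_NULL is also legal, but it is not considered here.)
bool isValidDestination(int destRank, int worldSize) {
    return destRank >= 0 && destRank < worldSize;
}

// True if a run with worldSize processes has at least one worker rank and
// the first worker rank (assumed to follow the reserved ranks) is addressable.
bool canBalanceLoad(int worldSize) {
    return workerRankCount(worldSize) > 0 &&
           isValidDestination(kReservedRanks, worldSize);
}
```

Under these assumptions `canBalanceLoad(2)` is false and `canBalanceLoad(3)` is true, which would be consistent with a two-process run reporting 0 ranks. Calling such a check with the size returned by `MPI_Comm_size` before any scheduling starts would replace the abort with a readable error.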
The examples currently contain several tests that use the parallel features of MUQ, and some of them fail.
# passing
- `ModelParallelMultilevelGaussianSampling` in `Example3_MultilevelGaussian/cpp`:
*** single chain reference
Starting single chain MCMC sampler...
10% Complete Block 0: MHKernel acceptance Rate = 33%
20% Complete Block 0: MHKernel acceptance Rate = 36%
30% Complete Block 0: MHKernel acceptance Rate = 38%
40% Complete Block 0: MHKernel acceptance Rate = 38%
50% Complete Block 0: MHKernel acceptance Rate = 37%
60% Complete Block 0: MHKernel acceptance Rate = 37%
70% Complete Block 0: MHKernel acceptance Rate = 37%
80% Complete Block 0: MHKernel acceptance Rate = 38%
90% Complete Block 0: MHKernel acceptance Rate = 37%
100% Complete Block 0: MHKernel acceptance Rate = 37%
Completed in 0.00816509 seconds.
mean QOI: 0.838309 1.7598