carla-simulator / carla

Open-source simulator for autonomous driving research.
http://carla.org
MIT License

terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast #3540

Open nuomizai opened 4 years ago

nuomizai commented 4 years ago

This error occurs on the CARLA server when I use leaderboard and scenario runner to create my A3C training environment. Strangely, it appears a few hours after training starts. Does anyone know how to solve it?

yasser-h-khalil commented 4 years ago

Same issue, occurring exactly as described! I am using 0.9.10. Have you found a solution yet?

nuomizai commented 4 years ago

Sorry, @yasser-h-khalil. I haven't found the cause or a solution. I used leaderboard and scenario runner. What is your setup?

yasser-h-khalil commented 4 years ago

I launch the server with DISPLAY= ./CarlaUE4.sh -opengl -carla-port=2000 on an RTX 5000 with the 410.48 driver. It works for hours and then crashes with the following error:

terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_castterminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast
terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast

Signal 6 caught.
Signal 6 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554 
Signal 6 caught.
CommonUnixCrashHandler: Signal=6
Malloc Size=65535 LargeMemoryPoolOffset=131119 
Malloc Size=123824 LargeMemoryPoolOffset=254960 
Engine crash handling finished; re-raising signal 6 for the default handler. Good bye.
Aborted (core dumped)
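
Since the server runs for hours before aborting, one stopgap for long training runs is to supervise the server process and relaunch it when it exits abnormally. The following is only a minimal sketch of that idea, not a fix for the crash itself; `run_with_restart` is a hypothetical helper, and the training client would still need its own logic to reconnect and reset after a restart.

```python
import subprocess


def run_with_restart(cmd, max_restarts=3, env=None):
    """Run cmd, relaunching it after every non-zero exit.

    Returns the list of exit codes observed, one per launch.
    A process killed by a signal reports a negative code (for
    example -6), and a wrapping shell script typically reports
    128 + signal; both count as abnormal exits here.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        code = subprocess.run(cmd, env=env).returncode
        exit_codes.append(code)
        if code == 0:  # clean shutdown: stop supervising
            break
    return exit_codes


# Hypothetical usage with the launch command quoted above:
#   import os
#   env = dict(os.environ, DISPLAY="")
#   run_with_restart(["./CarlaUE4.sh", "-opengl", "-carla-port=2000"], env=env)
```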

nuomizai commented 4 years ago

The error is exactly the same as the one I ran into! I was using an old version of leaderboard and scenario runner to train my DRL agent in a distributed manner, with CARLA 0.9.9.3. I have now switched to the latest versions of leaderboard and scenario runner with CARLA 0.9.10. I will let you know whether that works as soon as the training process finishes. I hope this helps if your setup is the same as mine!

yasser-h-khalil commented 4 years ago

Hello @nuomizai, are you using Traffic Manager?

nuomizai commented 4 years ago

Hey @yasser-h-khalil, sorry for the delay. Yes, I'm using Traffic Manager. Actually, after I switched to the latest version of leaderboard and scenario runner, this error went away. Have you figured out the cause?

yasser-h-khalil commented 4 years ago

No, I am still facing this issue.

corkyw10 commented 3 years ago

@glopezdiest could you follow up on this please?

raozhongyu commented 3 years ago

I ran into the same problem. Have you solved it?

glopezdiest commented 3 years ago

Hey, this issue is probably related to this other one, which is a memory leak in the leaderboard (LB). We know that it exists, but we haven't found the cause yet.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

deepcs233 commented 3 years ago

Ran into the same problem, +1.

qhaas commented 3 years ago

Observed this in the CARLA 0.9.12 container on Ubuntu 18.04 with a consumer Kepler GPU. It seems random.

grablerm commented 2 years ago

I ran into the same problem. Is there any solution?

Kin-Zhang commented 2 years ago

Me too!!!!!

jhih-ching-yeh commented 2 years ago

Ran into the same problem, +1. Is there any solution?

hlchen1043 commented 2 years ago

Met the same issue. CARLA 0.9.10, RTX 3090, Ubuntu 20.04.

buesma commented 1 year ago

Same here. CARLA 0.9.13, RTX 3080, Ubuntu 20.04.

AtongWang commented 1 year ago

I also hit this while looping through scenarios in my code. I believe it is a serious bug in CARLA. Setup: CARLA 0.9.10, RTX 8000, Ubuntu 18.04, Python 3.7.

Unkn0wnH4ck3r commented 1 year ago

Same problem here after 1000+ rounds of RL training, which I believe is a Traffic Manager error. Any suggestions?

Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=131160 LargeMemoryPoolOffset=196744
Malloc Size=131160 LargeMemoryPoolOffset=327928
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Segmentation fault (core dumped)

Any clue why CARLA crashed?

Device Info:
GPU: NVIDIA Titan RTX 24G
RAM: 64G
CPU: i9 9900X
Ubuntu: 20.04.5
CUDA: 11.7
NVIDIA Driver Version: 525.89.02
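
Note that this log reports signal 11 (a segmentation fault), whereas the logs earlier in the thread report signal 6 (an abort following the uncaught clmdep_msgpack exception), so the two crashes may not share a cause. When comparing reports, it can help to pull the signal out of the crash handler line. A minimal sketch, where `crash_signal` and `SIGNAL_NAMES` are hypothetical helpers and the number-to-name mapping assumes Linux:

```python
import re

# Linux signal numbers that appear in this thread; extend as needed.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV"}

_CRASH_RE = re.compile(r"CommonUnixCrashHandler: Signal=(\d+)")


def crash_signal(log_text):
    """Return (number, name) for the signal in the crash handler line, or None."""
    match = _CRASH_RE.search(log_text)
    if match is None:
        return None
    number = int(match.group(1))
    return number, SIGNAL_NAMES.get(number, "unknown")
```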

CurryChen77 commented 12 months ago

I ran into the same situation when I tried to train my own RL agent for over 150 epochs. I also used some memory profiling tools, such as the memory-profiler and psutil Python modules, but the memory usage does not grow, so it shouldn't be a memory leak. Are there any better solutions? Tested on two machines.
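
If those measurements were taken on the Python client process, they would not show growth in the CarlaUE4 server process, which is where the crash is reported, so it may be worth sampling the server as well. Below is a rough sketch of one way to check for a trend: collect periodic readings and fit a least-squares slope. `sample_memory` and `growth_per_sample` are hypothetical helpers, and `read_rss` is any callable you supply that returns the current memory usage in bytes.

```python
import time


def sample_memory(read_rss, samples=5, interval_s=1.0):
    """Collect `samples` readings from read_rss(), `interval_s` seconds apart."""
    readings = []
    for i in range(samples):
        readings.append(read_rss())
        if i < samples - 1:
            time.sleep(interval_s)
    return readings


def growth_per_sample(readings):
    """Least-squares slope of the readings against their sample index."""
    n = len(readings)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

One way to supply `read_rss`, assuming psutil is installed and `server_pid` is the server's process ID, is `lambda: psutil.Process(server_pid).memory_info().rss`. A slope that stays clearly positive over hours of sampling would point toward a leak; a flat one would support the conclusion above.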

CMakey commented 11 months ago

Same problem. The difference is that I'm just running the example file: when I run manual_control.py, UE crashes and the error appears.