Open CAS-LRJ opened 1 year ago
I tried converting the PyTorch models to TorchScript myself. The models loaded successfully, but the agent's performance did not improve. I used the following code to convert the brake prediction model and the segmentation model to TorchScript.
import torch
from models.rgb import RGBSegmentationModel, RGBBrakePredictionModel
input1 = torch.rand(1, 3, 288, 768).to('cuda')
input2 = torch.rand(1, 3, 192, 480).to('cuda')
brake_model = RGBBrakePredictionModel([4,10,18]).to('cuda')
brake_model.load_state_dict(torch.load('../weights/bra_v2_9.th'))
traced_bra_model = torch.jit.trace(brake_model, (input1, input2))
traced_bra_model.save('traced_bra_model_v2.pt')
seg_model = RGBSegmentationModel([4,6,7,10]).to('cuda')
input3 = torch.rand(3, 3, 288, 256).to('cuda')
seg_model.load_state_dict(torch.load('../weights/seg_1.th'))
traced_seg_model = torch.jit.trace(seg_model, input3)
traced_seg_model.save('traced_seg_model.pt')
Tracing shows several warnings but no errors. Is this the correct approach, or did I miss something important? By the way, my hardware is a laptop with an i9-12900HX, 32 GB RAM, and an RTX 3080 Ti.
Thanks for reporting this! Could you try again with the updated default .pt files at your convenience? Did you see any speed difference with the fast agent on your setup? In my case (Titan Xp + E5-2630 v3) I consistently see a 1.5-2x speedup.
Hello, the following error occurs with the updated .pt files.
Traceback (most recent call last):
File "LAV/leaderboard/leaderboard/autoagents/autonomous_agent.py", line 115, in __call__
control = self.run_step(input_data, timestamp)
File "miniconda3/envs/LAV-env2/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "LAV/team_code_v2/lav_agent_fast.py", line 265, in run_step
fused_lidar = self.infer_model.forward_paint(cur_lidar, pred_sem)
File "LAV/team_code_v2/model_inference.py", line 46, in forward_paint
painted_lidar = self.point_painting(cur_lidar, pred_sem)
File "LAV/team_code_v2/model_inference.py", line 87, in point_painting
lidar_cam = lidar_cam[valid_idx]
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
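The error message above suggests rerunning with CUDA_LAUNCH_BLOCKING=1, which makes kernel launches synchronous so that the traceback points at the call that actually failed. A minimal sketch of preparing such an environment from Python follows; `debug_env` is a hypothetical helper name, not part of LAV or PyTorch, and the variable has to be set before the process initializes CUDA.

```python
import os


def debug_env(base=None):
    """Return a copy of an environment with synchronous CUDA launches enabled.

    `base` defaults to the current process environment; it is never mutated.
    """
    env = dict(os.environ if base is None else base)
    # With CUDA_LAUNCH_BLOCKING=1 each kernel launch blocks until it finishes,
    # so an illegal memory access is reported at the offending call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    env = debug_env()
    print(env["CUDA_LAUNCH_BLOCKING"])
    # The environment can then be passed to a child process, for example:
    #   subprocess.run([...your evaluation command...], env=env, check=True)
```

Running the agent this way is slower, so it is only meant for locating the failing operation.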
Do you use multiple cards in your setup?
In my setup, the default agent has a simulation ratio of nearly 0.6, but the fast agent has a ratio below 0.5. The simulation ratio I mention is the ratio of simulation time to real time.
Maybe the new CPU is strong enough to handle the point painting task. I am going to try TensorRT to boost the inference speed, and I will post the result once I finish the experiment.
Could you try this command and see if it works? I just tried this on my setup and it works
ROUTES=assets/routes_lav_valid.xml TEAM_AGENT=$HOME/LAV/team_code_v2/lav_agent_fast TEAM_CONFIG=$HOME/LAV/team_code_v2/config.yaml ./leaderboard/scripts/run_evaluation.sh
========= Preparing RouteScenario_0 (repetition 0) =========
> Setting up the agent
> Loading the world
Base transform is blocking objects Transform(Location(x=185.695465, y=257.345886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.695114, y=257.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.694778, y=258.345886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.694443, y=258.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.694092, y=259.345886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.693756, y=259.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Skipping scenario 'Scenario4' due to setup error: Error: Unable to spawn vehicle vehicle.diamondback.century at Transform(Location(x=185.693756, y=259.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
> Running the route
======[Agent] Wallclock_time = 2022-09-14 12:08:22.509575 / 0.0 / Sim_time = 0.05000000074505806 / 50.00000074505806x
.....
======[Agent] Wallclock_time = 2022-09-14 12:18:09.936383 / 17.505114 / Sim_time = 3.8000000566244125 / 0.21706702336248995x
======[Agent] Wallclock_time = 2022-09-14 12:18:10.129409 / 17.69814 / Sim_time = 3.8500000573694706 / 0.21752469653155299x
======[Agent] Wallclock_time = 2022-09-14 12:18:10.326214 / 17.894945 / Sim_time = 3.9000000581145287 / 0.21792646647687666x
======[Agent] Wallclock_time = 2022-09-14 12:18:10.511506 / 18.080237 / Sim_time = 3.9500000588595867 / 0.21845850805780526x
======[Agent] Wallclock_time = 2022-09-14 12:18:10.706365 / 18.275096 / Sim_time = 4.000000059604645 / 0.21886512631607125x
======[Agent] Wallclock_time = 2022-09-14 12:18:10.901973 / 18.470704 / Sim_time = 4.050000060349703 / 0.21925427455689536x
======[Agent] Wallclock_time = 2022-09-14 12:18:11.094072 / 18.662803 / Sim_time = 4.100000061094761 / 0.2196765611539492x
======[Agent] Wallclock_time = 2022-09-14 12:18:11.276407 / 18.845138 / Sim_time = 4.150000061839819 / 0.22020427006529503x
======[Agent] Wallclock_time = 2022-09-14 12:18:11.477250 / 19.045981 / Sim_time = 4.200000062584877 / 0.22050738973199357x
======[Agent] Wallclock_time = 2022-09-14 12:18:11.669667 / 19.238398 / Sim_time = 4.250000063329935 / 0.22090088594923474x
======[Agent] Wallclock_time = 2022-09-14 12:18:11.864959 / 19.43369 / Sim_time = 4.300000064074993 / 0.2212538540143935x
======[Agent] Wallclock_time = 2022-09-14 12:18:12.057615 / 19.626346 / Sim_time = 4.350000064820051 / 0.22162956035013856x
Compared to the default agent (sim/real ~ 0.15x), the fast agent on my setup is usually above 0.22x (up to 0.25x depending on the route). If the default agent runs faster on your setup, it probably means that point painting on your CPU is faster than on the GPU plus the overhead, since I think the TorchScript brake and segmentation models should be faster regardless of hardware platform.
I am not using multiple cards here for inference. Curious to see if wrapping model_inference in TensorRT speeds it up!
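The sim/real ratios discussed above can be recomputed from the `[Agent]` log lines to compare agents on the same route. The sketch below assumes the log format shown in this thread (elapsed wall-clock seconds, then simulation seconds, then the printed ratio); `parse_agent_line` and `sim_to_real_ratio` are hypothetical helper names. The printed ratio differs from a plain division in the later decimals, so the comparison should use a tolerance.

```python
import re

# Matches: "... Wallclock_time = <date> <time> / <elapsed> / Sim_time = <sim> / <ratio>x"
LOG_PATTERN = re.compile(
    r"Wallclock_time = \S+ \S+ / (?P<wall>[\d.]+) / "
    r"Sim_time = (?P<sim>[\d.]+) / (?P<ratio>[\d.]+)x"
)


def parse_agent_line(line):
    """Return (elapsed_wall_s, sim_s, logged_ratio), or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return float(match["wall"]), float(match["sim"]), float(match["ratio"])


def sim_to_real_ratio(wall_s, sim_s):
    """Simulation time divided by real (wall-clock) time."""
    return sim_s / wall_s if wall_s > 0 else float("inf")


if __name__ == "__main__":
    line = ("======[Agent] Wallclock_time = 2022-09-14 12:18:09.936383 / 17.505114 "
            "/ Sim_time = 3.8000000566244125 / 0.21706702336248995x")
    wall, sim, logged = parse_agent_line(line)
    print(round(sim_to_real_ratio(wall, sim), 3), round(logged, 3))
```

A ratio below 1.0 means the simulation advances more slowly than real time, so a higher value indicates a faster agent.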
Error occurs
========= Preparing RouteScenario_0 (repetition 0) =========
> Setting up the agent
Could not set up the required agent:
> No CUDA GPUs are available
You chose the second GPU on your machine with CUDA_VISIBLE_DEVICES="1". According to the documentation of torch.jit.load, a TorchScript module is moved to the device it was saved from. My laptop has only one GPU, which may cause the error. Could you please save the TorchScript on the first GPU, "cuda:0"?
Oops, I forgot to remove the CUDA_VISIBLE_DEVICES= part from the command. Could you run it with CUDA_VISIBLE_DEVICES="0"? The TorchScript file was saved to device "cuda" in a script with CUDA_VISIBLE_DEVICES restricted to one GPU, so I am pretty sure it will work...
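To illustrate why CUDA_VISIBLE_DEVICES="1" yields "No CUDA GPUs are available" on a single-GPU machine, here is a simplified model of how the variable filters and renumbers devices. It is a sketch under stated assumptions, not the CUDA runtime's code: it handles integer indices only (not UUIDs), and `visible_gpus` is a hypothetical name.

```python
def visible_gpus(cuda_visible_devices, physical_count):
    """Physical GPU indices a process can see, listed in cuda:0, cuda:1, ... order.

    Simplified model: integer indices only; an invalid entry hides itself
    and every entry listed after it.
    """
    if cuda_visible_devices is None:
        # Variable unset: every physical GPU is visible.
        return list(range(physical_count))
    visible = []
    for token in cuda_visible_devices.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


if __name__ == "__main__":
    print(visible_gpus("1", 1))  # single-GPU laptop: nothing is visible
    print(visible_gpus("0", 1))  # the only GPU, seen as cuda:0
    print(visible_gpus("1", 2))  # physical GPU 1, seen as cuda:0
```

Because the visible devices are renumbered from zero, a module saved on "cuda" under CUDA_VISIBLE_DEVICES="1" still refers to device index 0 inside that process.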
========= Preparing RouteScenario_0 (repetition 0) =========
> Setting up the agent
weights/bra_v2_9.pt
> Loading the world
Base transform is blocking objects Transform(Location(x=185.695465, y=257.345886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.695114, y=257.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.694778, y=258.345886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.694443, y=258.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.694092, y=259.345886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Base transform is blocking objects Transform(Location(x=185.693756, y=259.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
Skipping scenario 'Scenario4' due to setup error: Error: Unable to spawn vehicle vehicle.diamondback.century at Transform(Location(x=185.693756, y=259.845886, z=1.210000), Rotation(pitch=0.000000, yaw=360.039185, roll=0.000000))
> Running the route
======[Agent] Wallclock_time = 2022-09-15 14:07:05.915860 / 0.0 / Sim_time = 0.05000000074505806 / 50.00000074505806x
======[Agent] Wallclock_time = 2022-09-15 14:07:05.955299 / 0.039439 / Sim_time = 0.10000000149011612 / 2.4728603944240986x
/miniconda3/envs/LAV-env2/lib/python3.7/site-packages/torch/nn/modules/module.py:1110: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /opt/conda/conda-bld/pytorch_1646755953518/work/aten/src/ATen/native/BinaryOps.cpp:607.)
return forward_call(*input, **kwargs)
======[Agent] Wallclock_time = 2022-09-15 14:07:07.288802 / 1.372942 / Sim_time = 0.15000000223517418 / 0.10917491585174205x
Stopping the route, the agent has crashed:
> CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
File "/Documents/LAV/leaderboard/leaderboard/scenarios/scenario_manager.py", line 152, in _tick_scenario
ego_action = self._agent()
File "/Documents/LAV/leaderboard/leaderboard/autoagents/agent_wrapper.py", line 75, in __call__
return self._agent()
File "/Documents/LAV/leaderboard/leaderboard/autoagents/autonomous_agent.py", line 115, in __call__
control = self.run_step(input_data, timestamp)
File "/miniconda3/envs/LAV-env2/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Documents/LAV/team_code_v2/lav_agent_fast.py", line 265, in run_step
fused_lidar = self.infer_model.forward_paint(cur_lidar, pred_sem)
File "/Documents/LAV/team_code_v2/model_inference.py", line 46, in forward_paint
painted_lidar = self.point_painting(cur_lidar, pred_sem)
File "/Documents/LAV/team_code_v2/model_inference.py", line 87, in point_painting
lidar_cam = lidar_cam[valid_idx]
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Documents/LAV/leaderboard/leaderboard/leaderboard_evaluator.py", line 342, in _load_and_run_scenario
self.manager.run_scenario()
File "/Documents/LAV/leaderboard/leaderboard/scenarios/scenario_manager.py", line 136, in run_scenario
self._tick_scenario(timestamp)
File "/Documents/LAV/leaderboard/leaderboard/scenarios/scenario_manager.py", line 159, in _tick_scenario
raise AgentError(e)
leaderboard.autoagents.agent_wrapper.AgentError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
> Stopping the route
I still get the illegal memory access error.
I have also tried the TorchScript files I converted myself, and they work. So I wonder if this is caused by the difference between the devices (3080 Ti vs. Titan Xp).
What python and pytorch+cuda versions are you using?
python 3.7.10 h12debd9_4 anaconda
cudatoolkit 11.3.1 h2bc3f7f_2
pytorch 1.11.0 py3.7_cuda11.3_cudnn8.2.0_0 pytorch
Thanks for the info! My guess then is that the PyTorch versions caused the discrepancy: the .pt trace file was created with PyTorch 1.7.1 and CUDA toolkit 10.2. But since you got it working by creating it yourself, that should be good! I will add a note to the README mentioning that there could be a version issue and pointing to this thread. Thanks again!
How do I produce such a .pt trace file?
I didn't find the diff between the v1 and v2 agent training scripts. Does that mean the two versions' models can be used together? For example, is it fine to use a v1 model with v2? (Forget this one. Since the model scripts differ, v1 models should not be used with v2; according to the v2 config, only seg is the same model file.)
Hello!
The training scripts are different for the bev and full lidar agent (segmentation is identical, the brake model has a different architecture). I'm working on cleaning them up for release, thanks!
Sorry for the delay, let me know if you run into any issues running the code!
Hello. I also encountered a problem with the agent. After running, it displays: Could not set up the required agent:
invalid load key, 'v'. Run command: ROUTES=/home/zhangting/LAV/assets/routes_lav_valid.xml ./leaderboard/scripts/run_evaluation.sh How can I solve it? Thank you.
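One common cause of "invalid load key, 'v'" is that the weight file on disk is a Git LFS pointer (a small text file starting with "version https://git-lfs...") rather than the real binary. Whether that is the cause here is not confirmed in this thread. The sketch below checks for that case; `looks_like_lfs_pointer` is a hypothetical helper name.

```python
# First line of a Git LFS pointer file, per the Git LFS pointer format.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file at `path` starts with the Git LFS pointer header."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_HEADER))
    return head == LFS_HEADER


if __name__ == "__main__":
    import sys

    # Usage: python check_weights.py weights/*.th weights/*.pt
    for weight_path in sys.argv[1:]:
        status = "LFS pointer, not real weights" if looks_like_lfs_pointer(weight_path) else "ok"
        print(f"{weight_path}: {status}")
```

If a file is reported as a pointer, fetching the real objects (for example with `git lfs pull`) and rerunning is a reasonable next step.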
Traceback (most recent call last):
File "d:/Anaconda/photo2cartoon-master/3.py", line 4, in
Hello, thanks for open-sourcing this fantastic work!
I am able to use the default v2 agent. However, I encounter this error when I use the fast agent.
Could you please help me solve this problem?