sky77764 opened 4 months ago
Hi!
I have attached the log file for your reference: `long_vicuna2_with_notice_t1.json`
Could you run it a few more times? The variance of the performance scores is large. Also, please check your CARLA version and the start command. If you still encounter the problem, I will check the agent code. Thanks for your attention.
Same here, I cannot reproduce the reported results either. Could you please give me some suggestions? Thanks for your sincere help!
`lmdrive_config.py`:

```python
import os


class GlobalConfig:
    """base architecture configurations"""
    # Controller
    turn_KP = 1.25
    turn_KI = 0.75
    turn_KD = 0.3
    turn_n = 40  # buffer size
    speed_KP = 5.0
    speed_KI = 0.5
    speed_KD = 1.0
    speed_n = 40  # buffer size
    max_throttle = 0.75  # upper limit on throttle signal value in dataset
    brake_speed = 0.1  # desired speed below which brake is triggered
    brake_ratio = 1.1  # ratio of speed to desired speed at which brake is triggered
    clip_delta = 0.35  # maximum change in speed input to longitudinal controller

    llm_model = './huggingface/models--liuhaotian--llava-v1.5-7b/snapshots/12e054b30e8e061f423c7264bc97d4248232e965/'
    preception_model = 'memfuser_baseline_e1d3_return_feature'
    preception_model_ckpt = 'models/vision-encoder-r50.pth.tar'
    lmdrive_ckpt = 'models/llava-v1.5-checkpoint.pth'

    agent_use_notice = True
    sample_rate = 2

    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)
```
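For context on the `turn_*`/`speed_*` gains above: they parameterize PID controllers with a windowed integral term, a pattern common in CARLA leaderboard agents. Below is a minimal sketch of such a controller; the class and its details are illustrative, not LMDrive's actual implementation.

```python
from collections import deque


class PIDController:
    """Windowed PID controller (sketch). `n` matches the `turn_n`/`speed_n`
    buffer sizes in the config; the integral term averages the last n errors."""

    def __init__(self, k_p=1.25, k_i=0.75, k_d=0.3, n=40):
        self.k_p, self.k_i, self.k_d = k_p, k_i, k_d
        self.window = deque([0.0] * n, maxlen=n)  # rolling buffer of errors

    def step(self, error):
        self.window.append(error)
        integral = sum(self.window) / len(self.window)
        derivative = self.window[-1] - self.window[-2]
        return self.k_p * error + self.k_i * integral + self.k_d * derivative
```

The windowed integral keeps the I-term from accumulating without bound over a long route, which is why a buffer size appears in the config alongside the gains.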
`run_evaluation.sh`:

```shell
export LEADERBOARD_ROOT=leaderboard
export CHALLENGE_TRACK_CODENAME=SENSORS
export PORT=$PT              # same as the CARLA server port
export TM_PORT=$(($PT+500))  # traffic manager port, required when spawning multiple servers/clients
export DEBUG_CHALLENGE=0
export REPETITIONS=1         # number of evaluation runs
export ROUTES=langauto/benchmark_long.xml
export TEAM_AGENT=leaderboard/team_code/lmdriver_agent.py    # agent
export TEAM_CONFIG=leaderboard/team_code/lmdriver_config.py  # model checkpoint, not required for expert
export CHECKPOINT_ENDPOINT=results/sample_result.json        # results file
#export SCENARIOS=leaderboard/data/scenarios/no_scenarios.json  # town05_all_scenarios.json
export SCENARIOS=leaderboard/data/official/all_towns_traffic_scenarios_public.json
export SAVE_PATH=data/eval   # path for saving episodes while evaluating
export RESUME=False
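Since the maintainer suggests running the benchmark several times to smooth out variance, a small helper like the one below can summarize the spread. It is a hypothetical sketch: it assumes you have collected the `"scores"` dicts (in the format quoted in this thread) from each run's `CHECKPOINT_ENDPOINT` file.

```python
import statistics


def summarize(runs):
    """Mean and population stddev of score_composed over several runs.

    `runs` is a list of "scores" dicts as written by the leaderboard,
    e.g. {"score_composed": ..., "score_penalty": ..., "score_route": ...}.
    """
    vals = [r["score_composed"] for r in runs]
    return statistics.mean(vals), statistics.pstdev(vals)
```

If the standard deviation is on the order of the gap to the paper's numbers, more repetitions (e.g. raising `REPETITIONS`) are needed before concluding the results differ.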
Final results, `agent_use_notice = True`:

```json
"scores": {
    "score_composed": 25.870979763631585,
    "score_penalty": 0.7629596874513518,
    "score_route": 35.450098903630824
},
```
`agent_use_notice = False`:

```json
"scores": {
    "score_composed": 26.58510262333422,
    "score_penalty": 0.7749396906256292,
    "score_route": 35.97223471729377
},
```
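A side note on reading these numbers: `score_composed` is not simply `score_route * score_penalty`, because the leaderboard averages the per-route products (completion × penalty) rather than multiplying the two global averages. The snippet below demonstrates the difference with made-up per-route values (not the runs above):

```python
import statistics

# Illustrative per-route values only.
rc = [80.0, 20.0, 6.35]   # per-route completion (%)
pen = [0.9, 0.5, 1.0]     # per-route infraction penalty

composed = statistics.mean(r * p for r, p in zip(rc, pen))  # mean of products
route = statistics.mean(rc)
penalty = statistics.mean(pen)
# composed differs from route * penalty: the mean of products is not
# the product of means.
```

So a small mismatch between `score_composed` and the product of the other two fields is expected, not a sign of a broken run.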
By the way, the results look similar after multiple evaluation runs; the variance is acceptable.
I encountered the same issue. I tried to reproduce the experimental results using the official code and model weights on an RTX 4090 with CARLA 0.9.10.1, but the results I obtained were lower than those reported in the paper: DS=25.044, RC=36.526.
Thank you for actively sharing your valuable research.
I am deeply impressed by it and have tested a few things, but I have some questions.
Here are my configurations: `leaderboard/team_code/lmdrive_config.py` and `leaderboard/scripts/run_evaluation.sh`.
Are these configurations correct? If they are not, the following questions may be unnecessary.
`agent_use_notice = True`
`agent_use_notice = False`