vlc-robot / hiveformer


Cannot reproduce paper results using the given ckpt #4

Closed aopolin-lv closed 1 month ago

aopolin-lv commented 11 months ago
Hello, Hiveformer is promising because it uses instruction-driven history information to predict agent actions. However, when I try to reproduce this work, I run into an important problem. The results I get on my headless server are as follows:

| seed | pick_and_lift | pick_up_cup | put_knife_on_chopping_board | put_money_in_safe | push_button | reach_target | slide_block_to_target | stack_wine | take_money_out_safe | take_umbrella_out_of_umbrella_stand | Avg. |
|------|---------------|-------------|-----------------------------|-------------------|-------------|--------------|-----------------------|------------|---------------------|-------------------------------------|------|
| seed=0 | 89.20 | 61.00 | 38.20 | 21.60 | 94.60 | 99.80 | 41.80 | 58.40 | 43.00 | 34.20 | 58.26 |
| seed=2 | 89.40 | 70.20 | 63.80 | 21.00 | 98.80 | 99.80 | 76.40 | 59.80 | 52.20 | 41.60 | 67.30 |
| seed=4 | 92.80 | 72.00 | 43.60 | 32.60 | 93.60 | 99.80 | 36.80 | 76.00 | 43.80 | 47.00 | 63.80 |

Except for pick_and_lift and reach_target, the other results are far from those reported in the paper and in this repo. Could you give me some advice?

I tried the three configs and checkpoints obtained from transformer_unet+gripper_attn_multi32_300k. The main command is as follows:

export COPPELIASIM_ROOT=/data/project/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$COPPELIASIM_ROOT
export QT_QPA_PLATFORM_PLUGIN_PATH=$COPPELIASIM_ROOT
export WORLD_SIZE=1
export MASTER_ADDR='localhost'
export MASTER_PORT=10000

export LOCAL_RANK=0 
export RANK=0
export CUDA_VISIBLE_DEVICES=0
export DISPLAY=:0.0

outdir=transformer_unet+gripper_attn_multi32_300k/seed4

step=190000
python eval_models.py \
    --exp_config ${outdir}/logs/training_config.yaml \
    --seed 200 \
    --num_demos 500 \
    checkpoint ${outdir}/ckpts/model_step_${step}.pt
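
For reference, when no physical display is attached, a common general workaround (not taken from this repo's instructions) is to launch the same evaluation under a virtual framebuffer instead of setting DISPLAY=:0.0, e.g.:

# Hypothetical alternative launch on a headless machine via Xvfb;
# general CoppeliaSim/RLBench workaround, not from this repo's docs.
xvfb-run -a -s "-screen 0 1280x1024x24" \
    python eval_models.py \
        --exp_config ${outdir}/logs/training_config.yaml \
        --seed 200 \
        --num_demos 500 \
        checkpoint ${outdir}/ckpts/model_step_${step}.pt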

The CUDA version is 12.0. The output of pip list is as follows:

Package                  Version
------------------------ ----------
absl-py                  1.4.0
attrs                    23.1.0
certifi                  2023.7.22
cffi                     1.14.2
charset-normalizer       3.3.0
cycler                   0.12.1
docstring-parser         0.15
einops                   0.6.1
filelock                 3.12.2
fonttools                4.43.1
fsspec                   2023.9.2
html-testRunner          1.2.1
huggingface-hub          0.18.0
idna                     3.4
Jinja2                   3.1.2
jsonlines                3.1.0
kiwisolver               1.4.5
lmdb                     1.4.1
MarkupSafe               2.1.3
matplotlib               3.5.1
msgpack                  1.0.7
msgpack-numpy            0.4.8
msgpack-python           0.5.6
mypy-extensions          1.0.0
natsort                  8.4.0
numpy                    1.23.5
nvidia-cublas-cu11       11.10.3.66
nvidia-cuda-nvrtc-cu11   11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11        8.5.0.96
opencv-python-headless   4.7.0.72
packaging                23.2
Pillow                   9.4.0
pip                      23.2.1
protobuf                 3.20.3
pycparser                2.21
pyparsing                3.1.1
pyquaternion             0.9.9
PyRep                    4.1.0.3
python-dateutil          2.8.2
PyYAML                   6.0.1
regex                    2023.10.3
requests                 2.31.0
rlbench                  1.3.0
setuptools               68.0.0
six                      1.16.0
tensorboardX             2.6
tokenizers               0.12.1
torch                    1.13.0
torchvision              0.14.0
tqdm                     4.62.3
transformers             4.19.4
typed-argument-parser    1.8.0
typing_extensions        4.8.0
typing-inspect           0.9.0
urllib3                  2.0.6
wheel                    0.41.2
yacs                     0.1.8
cshizhe commented 11 months ago

I would suggest visualizing the rendered images during the evaluation. We have noticed that there can be problems running the RLBench simulator on a headless machine, where the rendered images are not correct.
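
A minimal sketch of such a check, assuming the standard RLBench Observation API (front_rgb, wrist_rgb, left_shoulder_rgb, right_shoulder_rgb); the function name, output directory, and call site inside the evaluation loop are hypothetical:

import os
import numpy as np
from PIL import Image

def dump_obs_images(obs, step_idx, out_dir="debug_renders"):
    """Save the RGB camera views of one RLBench observation to disk for inspection."""
    os.makedirs(out_dir, exist_ok=True)
    for name in ("front_rgb", "wrist_rgb", "left_shoulder_rgb", "right_shoulder_rgb"):
        rgb = getattr(obs, name, None)
        if rgb is None:
            continue  # camera disabled in the observation config
        img = np.asarray(rgb)
        if img.dtype != np.uint8:
            # some configs return float images in [0, 1]
            img = (np.clip(img, 0.0, 1.0) * 255).astype(np.uint8)
        Image.fromarray(img).save(os.path.join(out_dir, f"step{step_idx:04d}_{name}.png"))
        # All-black or uniformly grey frames here usually indicate that the
        # simulator is not rendering correctly on the headless machine.

If the saved frames are black or otherwise broken, the policy is effectively acting on garbage input, which would explain the large gap to the reported numbers.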