Open zhufq00 opened 9 months ago
I found this dataset on Hugging Face.
Here are some tips for running this project:
Tip 1: Dataset download from Hugging Face

Change this command in the Makefile:

```shell
git submodule update --init --recursive ./submodules/hulc-data;\
```

to:

```shell
git submodule update --init --recursive ./submodules/hulc-data
cd submodules/hulc-data/hulc-trajectories && git lfs pull
cd submodules/hulc-data/lcd-seeds && git lfs pull
cd submodules/hulc-data/hulc-baselines-30 && git lfs pull
```
Tip 2: If poetry cannot be found, add it to your PATH with export PATH="$HOME/.local/bin:$PATH", then verify with poetry --version.
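As a small sketch of that tip (assuming poetry's installer put the binary in ~/.local/bin, its default location):

```shell
# Make poetry's default install location discoverable.
export PATH="$HOME/.local/bin:$PATH"
# Verify -- prints the version if poetry is now reachable.
if command -v poetry >/dev/null 2>&1; then
  poetry --version
else
  echo "poetry still not found -- check where the installer put it"
fi
```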
One piece of feedback: I found this paper quite difficult to understand, with many unnecessary concepts and formulas introduced. The details regarding the model's inputs and outputs are also unclear, although that may be because I am not very familiar with these terms. I was able to understand it somewhat after viewing https://diffusion-planning.github.io, and I also find http://hulc.cs.uni-freiburg.de quite complex.
I would like to ask about some details to ensure my understanding of the paper, primarily about the inputs and outputs of the High-level and Low-level policies.
My assumption is that during the training process, the input to the High-level policy is:
The output of the High-level policy is:
During the inference process, the input to the High-level policy is:
The output of the High-level policy is:
I assume that the High-level policy and Low-level policy are trained separately: we first train the Low-level policy without the diffusion process to get the frozen low-level-policy encoder.
The inputs and outputs of the Low-level policy in HULC during training and inference are as shown in the image.
But it seems that HULC generates a sequence of actions, instead of the single action in LCD.
How is the LLP trained, and how does it do inference? I cannot find Appendix E (https://imgur.com/MwzAO6s); this section seems very important for understanding the LLP.
How is T determined? Is it a fixed parameter or something else?
Additionally, there is a typo in Section 4.6 "ROBUSTNESS TO HYPERPARAMETERS": a broken cross-reference rendered as "??".
Thank you very much for your work. Although it seems a bit difficult to understand, it's very intriguing!
I am looking forward to your evaluation of whether my understanding is accurate.
I got everything working except when I run lcd train_hulc.
I also cannot find the right 12_all_trajectories.pt in https://github.com/ezhang7423/hulc-data.git. The directory "hulc-trajectories" is empty.
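A quick way to check whether the LFS objects actually came down (paths assumed from the hulc-data layout above):

```shell
# Check that the trajectories directory exists and is non-empty,
# rather than an uninitialized or pointer-only submodule.
DIR=submodules/hulc-data/hulc-trajectories
if [ -d "$DIR" ] && [ -n "$(ls -A "$DIR" 2>/dev/null)" ]; then
  echo "$DIR is populated"
else
  echo "$DIR is empty or missing; try: git submodule update --init --recursive, then git lfs pull inside it"
fi
```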
Awesome job and thank you very much for your response!