Open Hellod035 opened 2 months ago
Hi, this does occur during the training of the high-level policy, but it currently doesn't seem to affect the results. We plan to address this issue later, so for now, you can consider it acceptable.
Thank you very much for your reply :)
Steps to reproduce: Increase max_epochs in skillmimic/data/cfg/train/rlg/hrl_humanoid_discrete_layupscore.yaml and run. You will then see "NaN or Inf found in input tensor" in the terminal; it is actually caused by some of the KL divergence values being inf. I would like to ask whether this phenomenon has been noticed, and whether it is acceptable or the hyperparameters need further adjustment.
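To see how many KL values are actually non-finite, a small check along these lines might help. This is only a sketch in plain Python: summarize_kl is a hypothetical helper, not part of SkillMimic or its training code, and the sample values are made up for illustration.

```python
import math


def summarize_kl(kl_values):
    """Count non-finite KL values and average the finite ones.

    Hypothetical helper for debugging; kl_values is any iterable of floats.
    """
    kl_values = list(kl_values)
    # math.isfinite is False for inf, -inf, and NaN.
    finite = [v for v in kl_values if math.isfinite(v)]
    non_finite = len(kl_values) - len(finite)
    # Avoid dividing by zero when every value is inf or NaN.
    finite_mean = sum(finite) / len(finite) if finite else float("nan")
    return {
        "total": len(kl_values),
        "non_finite": non_finite,
        "finite_mean": finite_mean,
    }


# Made-up sample: two ordinary KL values, one inf, one NaN.
report = summarize_kl([0.1, float("inf"), 0.3, float("nan")])
print(report)
```

If the non-finite count stays small and the finite mean looks reasonable, that would be consistent with the warning being mostly a logging issue, though it does not prove it.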