Hello, I have a question about line 67 of your trpo.py. You get two terms there, and the TRPO paper says that the second term vanishes. Is that right? You also add v * damping; I guess its purpose is to ensure positive definiteness? Could you explain this in detail?
Also, at line 117 of your main.py, could you explain in detail why this approximates the average KL? Thank you very much!