Closed ZiyueWang25 closed 1 year ago
>       assert reward_improvement.is_significant_reward_improvement(
            rewards_before,  # type:ignore[arg-type]
            rewards_after,  # type:ignore[arg-type]
        )
E       assert False
E        +  where False = <function is_significant_reward_improvement at 0x135cafdc0>([-1379.139569, -1592.006733, -1771.549952, -1345.075819, -1701.436911, -1223.385152, ...], [-1331.957287, -1518.792578, -1274.336042, -1294.565801, -1089.05092, -1652.311479, ...])
E        +    where <function is_significant_reward_improvement at 0x135cafdc0> = reward_improvement.is_significant_reward_improvement
The failure was first noticed in https://github.com/HumanCompatibleAI/imitation/pull/779
Bug description
Relevant CircleCI link: https://app.circleci.com/pipelines/github/HumanCompatibleAI/imitation/3903/workflows/84d03683-66fc-416e-865a-27bb86d78349/jobs/15818
Steps to reproduce
Run the test directly at the head of master:
pytest tests/algorithms/test_sqil.py::test_sqil_performance_continuous[DDPG]
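To illustrate why an assertion like this can fail even when the mean reward goes up, here is a minimal sketch of a one-sided permutation test on the difference of means. This is an assumption about the kind of check involved, not the actual implementation of `reward_improvement.is_significant_reward_improvement`. The helper name `permutation_p_value` and its parameters are hypothetical, and the sample uses only the six values per list that are visible in the traceback (the rest are elided there), so the resulting p-value is illustrative only.

```python
import random
from statistics import mean


def permutation_p_value(before, after, n_resamples=10_000, seed=0):
    """One-sided p-value for the hypothesis mean(after) > mean(before).

    Hypothetical helper for illustration; not part of the imitation library.
    """
    rng = random.Random(seed)
    observed = mean(after) - mean(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    hits = 0
    for _ in range(n_resamples):
        # Randomly reassign the pooled rewards to the "before" and "after" groups.
        rng.shuffle(pooled)
        diff = mean(pooled[n_before:]) - mean(pooled[:n_before])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Only the values visible in the traceback; the full lists are truncated there.
rewards_before = [-1379.139569, -1592.006733, -1771.549952,
                  -1345.075819, -1701.436911, -1223.385152]
rewards_after = [-1331.957287, -1518.792578, -1274.336042,
                 -1294.565801, -1089.05092, -1652.311479]

p = permutation_p_value(rewards_before, rewards_after)
print(f"mean before={mean(rewards_before):.1f}, "
      f"mean after={mean(rewards_after):.1f}, p={p:.3f}")
```

With noisy episode returns and few episodes, a modest gain in the mean may not clear a fixed significance threshold, which is one plausible way for a test of this shape to be flaky.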