Describe the bug
Some parameters are set explicitly, but they do not appear to take effect in the CQL algorithm.
To Reproduce
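import d3rlpy
import gym
import d4rl  # importing d4rl registers the D4RL environments with gym

dataset, env = d3rlpy.datasets.get_dataset("hopper-medium-v2")
cql = d3rlpy.algos.CQL(use_gpu=True)
eval_freq = 5000  # placeholder value; the original script read this from args.eval_freq

def eval_policy(policy):
    # x was undefined in the original snippet; a few dataset observations stand in for it
    x = dataset.observations[:10]
    actions = policy.predict(x)
    return actions

for t in range(1000000):
    # intended behavior: exactly one gradient step per call
    cql.fit(dataset, n_steps=1, n_steps_per_epoch=1)
    if (t + 1) % eval_freq == 0:
        eval_policy(cql)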
Expected behavior
Each call to fit should run only one training step. Instead, it runs 1953 steps, and I cannot tell where this number comes from.
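One way to sanity-check a figure like 1953 is to compare it with the length of a full epoch, which is typically the number of transitions divided by the batch size. The sketch below is only arithmetic: the helper steps_per_epoch is hypothetical (not part of d3rlpy), the dataset size of about one million transitions for hopper-medium-v2 is an assumption, and whether fit actually fell back to epoch-based training here is unconfirmed.

```python
def steps_per_epoch(n_transitions: int, batch_size: int) -> int:
    """Number of full mini-batches in one pass over the dataset."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return n_transitions // batch_size


# Assumed values for illustration only; replace them with the real
# dataset length and the batch size actually configured on the algorithm.
assumed_transitions = 1_000_000
for batch_size in (100, 256, 512):
    print(batch_size, steps_per_epoch(assumed_transitions, batch_size))
```

Under these assumed numbers, a batch size of 512 would give 1953 steps, so comparing against the real batch size and dataset length may help narrow down the cause.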
use_gpu is set to True, but training does not run on the GPU.
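Since d3rlpy runs on PyTorch, a quick first check is whether PyTorch itself can see a CUDA device in this environment. The snippet below is a minimal diagnostic sketch, not a fix: cuda_status is a hypothetical helper name, and it only reports what PyTorch detects, not which device d3rlpy ends up using.

```python
import importlib.util


def cuda_status() -> str:
    """Report whether PyTorch is installed and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "torch is installed, but CUDA is not available"
    # Device index 0 is assumed here; adjust if several GPUs are present.
    return "CUDA is available: " + torch.cuda.get_device_name(0)


print(cuda_status())
```

If this reports that CUDA is not available, the problem likely lies in the PyTorch or driver installation rather than in how use_gpu is passed.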
Additional context
Ubuntu 18.04
d3rlpy==1.1.1
Python 3.9.12