[BUG] Setting parameters in CQL does not work

Describe the bug

Some parameters (for example use_gpu and n_steps) are set explicitly, but they do not appear to be passed to the CQL algorithm.

To Reproduce

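import d3rlpy
import gym
import d4rl  # importing d4rl registers the D4RL environments

dataset, env = d3rlpy.datasets.get_dataset("hopper-medium-v2")
cql = d3rlpy.algos.CQL(use_gpu=True)

# Placeholder: the original script read this from args.eval_freq (value not shown).
eval_freq = 5000

def eval_policy(policy):
    # Placeholder evaluation: predict actions for the dataset observations.
    # The original called cql.predict(x) with an undefined x.
    actions = policy.predict(dataset.observations)
    return actions

for t in range(1000000):
    # Intended behavior: exactly one training step per call.
    cql.fit(dataset, n_steps=1, n_steps_per_epoch=1)
    if (t + 1) % eval_freq == 0:
        eval_policy(cql)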

Expected behavior

Each call to fit should run a single training step. Instead, it runs 1953 steps, and it is not clear where this number comes from. Also, use_gpu is set to True, but training does not run on the GPU.
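To help narrow down the GPU part of the report, here is a minimal sketch that checks whether PyTorch (which d3rlpy is built on) can see a CUDA device at all. The helper name detect_device is hypothetical and not part of d3rlpy. It only reports what PyTorch sees; it does not show which device d3rlpy actually selected.

```python
def detect_device() -> str:
    """Return "cuda" if PyTorch reports a usable CUDA device, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check still runs without PyTorch
    except ImportError:
        # Without PyTorch there is no CUDA backend to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(detect_device())
```

If this prints "cpu", the problem is probably in the PyTorch/CUDA installation rather than in how use_gpu is handled.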


2022-08-08 12:08.07 [debug    ] RoundIterator is selected.
2022-08-08 12:08.07 [info     ] Directory is created at d3rlpy_logs/CQL_20220808120807
2022-08-08 12:08.07 [debug    ] Building models...
2022-08-08 12:08.08 [debug    ] Models have been built.
2022-08-08 12:08.08 [info     ] Parameters are saved to d3rlpy_logs/CQL_20220808120807/params.json params={'action_scaler': None, 'actor_encoder_factory': {'type': 'default', 'params': {'activation': 'relu', 'use_batch_norm': False, 'dropout_rate': None}}, 'actor_learning_rate': 0.0001, 'actor_optim_factory': {'optim_cls': 'Adam', 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False}, 'alpha_learning_rate': 0.0001, 'alpha_optim_factory': {'optim_cls': 'Adam', 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False}, 'alpha_threshold': 10.0, 'batch_size': 256, 'conservative_weight': 5.0, 'critic_encoder_factory': {'type': 'default', 'params': {'activation': 'relu', 'use_batch_norm': False, 'dropout_rate': None}}, 'critic_learning_rate': 0.0003, 'critic_optim_factory': {'optim_cls': 'Adam', 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False}, 'gamma': 0.99, 'generated_maxlen': 100000, 'initial_alpha': 1.0, 'initial_temperature': 1.0, 'n_action_samples': 10, 'n_critics': 2, 'n_frames': 1, 'n_steps': 1, 'q_func_factory': {'type': 'mean', 'params': {'share_encoder': False}}, 'real_ratio': 1.0, 'reward_scaler': None, 'scaler': None, 'soft_q_backup': False, 'tau': 0.005, 'temp_learning_rate': 0.0001, 'temp_optim_factory': {'optim_cls': 'Adam', 'betas': (0.9, 0.999), 'eps': 1e-08, 'weight_decay': 0, 'amsgrad': False}, 'use_gpu': 0, 'algorithm': 'CQL', 'observation_shape': (20,), 'action_size': 2}
Epoch 1/1: 100%|█████████████████| 1953/1953 [01:32<00:00, 21.17it/s, temp_loss=-1.1, temp=1.07, alpha_loss=13.6, alpha=0.913, critic_loss=-10.9, actor_loss=4.73]
2022-08-08 12:09.41 [info     ] CQL_20220808120807: epoch=1 step=1953 epoch=1 metrics={'time_sample_batch': 0.0028074814366244438, 'time_algorithm_update': 0.04355938167249735, 'temp_loss': -1.0992966500352697, 'temp': 1.0667595759881074, 'alpha_loss': 13.566169590478943, 'alpha': 0.9133139546870941, 'critic_loss': -10.887224008487056, 'actor_loss': 4.734693383637776, 'time_step': 0.046753879699472645} step=1953
2022-08-08 12:09.41 [info     ] Model parameters are saved to d3rlpy_logs/CQL_20220808120807/model_1953.pt
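One possible explanation for the 1953 in the progress bar above, offered as a guess rather than a confirmed diagnosis: it matches the number of full minibatches in one pass over the data when batch_size is 256 (as in the logged parameters) and the dataset holds roughly 500,000 transitions. The dataset size is an assumption and does not come from the report. The helper batches_per_epoch is hypothetical.

```python
def batches_per_epoch(n_transitions: int, batch_size: int) -> int:
    """Number of full minibatches in one pass over the dataset."""
    return n_transitions // batch_size


# batch_size=256 comes from the logged parameters;
# 500_000 transitions is an assumed dataset size.
print(batches_per_epoch(500_000, 256))  # 1953
```

If this is the cause, the run in the log iterated over a whole epoch rather than honoring n_steps=1. That would also fit the "RoundIterator is selected" line at the top of the log.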

Additional context

Ubuntu 18.04, d3rlpy==1.1.1, Python 3.9.12

takuseno / d3rlpy

[BUG] Setting parameters in CQL does not work #205