Eclectic-Sheep / sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric
https://eclecticsheep.ai
Apache License 2.0

Dreamer division by 0 crash #279

Closed: LucaVendruscolo closed this issue 6 months ago

LucaVendruscolo commented 6 months ago

I got this error after the program had been running for about 100,000 steps:

C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\torchmetrics\utilities\prints.py:43: UserWarning: The ``compute`` method of metric MeanMetric was called before the ``update`` method which may lead to errors, as metric states have not yet been updated.
  warnings.warn(*args, **kwargs)  # noqa: B028
C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\torchmetrics\utilities\prints.py:43: UserWarning: The ``compute`` method of metric SumMetric was called before the ``update`` method which may lead to errors, as metric states have not yet been updated.
  warnings.warn(*args, **kwargs)  # noqa: B028
Error executing job with overrides: ['exp=dreamer_v3', 'env=BallGame', 'algo.mlp_keys.encoder=[position,QR_position,ball_speed,QR_speed]', 'algo.mlp_keys.encoder=[position,QR_position,ball_speed,QR_speed]', 'algo.cnn_keys.encoder=[]', 'algo.cnn_keys.decoder=[]']
Traceback (most recent call last):
  File "G:\MazeGameIRLSheepRL - instant- noImage\sheeprl\sheeprl\cli.py", line 352, in run
    run_algorithm(cfg)
  File "G:\MazeGameIRLSheepRL - instant- noImage\sheeprl\sheeprl\cli.py", line 190, in run_algorithm
    fabric.launch(reproducible(command), cfg, **kwargs)
  File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 839, in launch
    return self._wrap_and_launch(function, self, *args, **kwargs)
  File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 925, in _wrap_and_launch
    return to_run(*args, **kwargs)
  File "C:\Users\lucav\Downloads\SheepRLFromScratch\RL17\lib\site-packages\lightning\fabric\fabric.py", line 930, in _wrap_with_setup
    return to_run(*args, **kwargs)
  File "G:\MazeGameIRLSheepRL - instant- noImage\sheeprl\sheeprl\cli.py", line 186, in wrapper
    return func(fabric, cfg, *args, **kwargs)
  File "G:\MazeGameIRLSheepRL - instant- noImage\sheeprl\sheeprl\algos\dreamer_v3\dreamer_v3.py", line 733, in main
    (train_step - last_train) / timer_metrics["Time/train_time"],
ZeroDivisionError: float division by zero

michele-milesi commented 6 months ago

Hi @LucaVendruscolo, can you try this branch? https://github.com/Eclectic-Sheep/sheeprl/tree/fix/timer_metrics

Thanks

LucaVendruscolo commented 6 months ago

Thank you so much👍

belerico commented 6 months ago

Hi @LucaVendruscolo, can you share more details about your experiment? It seems very strange to me that Dreamer-V3 would report a training time of 0s at all...

cc @michele-milesi maybe we're missing something else?

michele-milesi commented 6 months ago

I think it returned 0 because the model was not trained between two logging steps (if I remember correctly, @LucaVendruscolo modified the code to run train() at every episode end). So I think we can just skip logging the metric when that happens.
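A minimal sketch of that idea, assuming the throughput is computed from a dictionary of timer values as in the traceback above. The helper names `safe_rate` and `training_throughput`, and the output key `Time/sps_train`, are hypothetical and used only for illustration; they are not taken from the fix branch:

```python
from typing import Dict, Optional


def safe_rate(steps: int, elapsed_seconds: float) -> Optional[float]:
    """Return steps per second, or None when no time was recorded."""
    if elapsed_seconds <= 0:
        return None
    return steps / elapsed_seconds


def training_throughput(
    train_step: int, last_train: int, timer_metrics: Dict[str, float]
) -> Dict[str, float]:
    """Build the metrics to log, skipping the rate when no training ran."""
    elapsed = timer_metrics.get("Time/train_time", 0.0)
    rate = safe_rate(train_step - last_train, elapsed)
    # "Time/sps_train" is a hypothetical key used only for illustration.
    return {} if rate is None else {"Time/sps_train": rate}
```

With this guard, a logging step in which no training happened produces an empty dictionary instead of raising ZeroDivisionError.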

[image]

The image shows a case in which training is executed before the first logging step, so the metric contains the correct value. The second time the compute() method is called, no training has been performed since the previous log, so the metric returns 0.
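The sequence can be reproduced with a plain-Python stand-in for a sum-style metric. This is only an illustration, under the assumption that the timer accumulates elapsed time and is reset after each logging step; `SumAccumulator` is a hypothetical class, not the torchmetrics or sheeprl implementation:

```python
class SumAccumulator:
    """Plain-Python stand-in for a sum-style metric (not the torchmetrics API)."""

    def __init__(self) -> None:
        self.total = 0.0

    def update(self, value: float) -> None:
        self.total += value

    def compute(self) -> float:
        return self.total

    def reset(self) -> None:
        self.total = 0.0


train_time = SumAccumulator()

# First logging interval: training ran, so time was accumulated.
train_time.update(1.5)
first = train_time.compute()
train_time.reset()  # assumption: timers are reset after each log

# Second logging interval: no training ran, so nothing was accumulated.
second = train_time.compute()
```

Here `first` holds the accumulated training time while `second` is 0, so dividing a step count by `second` would raise the same ZeroDivisionError as in the traceback.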

belerico commented 6 months ago

OK, that makes sense! Thank you