google-research / batch_rl

Offline Reinforcement Learning (aka Batch Reinforcement Learning) on Atari 2600 games
https://offline-rl.github.io/
Apache License 2.0

Raw results #11

Open n17s opened 3 years ago

n17s commented 3 years ago

To facilitate comparison with a method we are developing, would it be possible to release the raw results (e.g., similar to the Dopamine JSON files)?

These data already "exist" as part of your figures in the appendix of your paper, so what we really want is to produce similar figures (comparing our method with your method) without having to rerun yours from scratch.

agarwl commented 3 years ago

Yes, I can release the raw results and I would try to do so by the end of this week. Which results do you specifically need?

n17s commented 3 years ago

Thank you! We are looking to compare with offline REM and offline QR-DQN on 1%, 10%, 20%, and 100% of the data for the following games: Breakout, Seaquest, Pong, Asterix, and Qbert.

pmineiro commented 3 years ago

To clarify:

These particular comparisons are becoming popular following the QR-DQN paper.

agarwl commented 3 years ago

I think you meant the CQL paper? Actually, I can send you these results directly over email now (as I have them stored as zipped pandas DataFrames). Can you please write an email to rishabhagarwal@google.com?

n17s commented 3 years ago

Resolved offline

GoingMyWay commented 3 years ago

> Yes, I can release the raw results and I would try to do so by the end of this week. Which results do you specifically need?

Dear @agarwl, will you provide the raw results? It would be great if you could, since retraining REM takes many days and a lot of compute.

agarwl commented 3 years ago

@GoingMyWay Yes, I'll post the raw results on GitHub by next month. In the meantime, you can send me an email and I can send you some of those results (for the setting requested above).

agarwl commented 1 year ago

I forgot about this but here are the raw results (as someone requested them again recently). This might be useful for people stumbling upon this in the future.

QR-DQN (10% data)

Asterix: 1293.8620483398402
Breakout: 61.84913024902001
Pong: 12.650765800479999
Qbert: 9420.50625
Seaquest: 353.07070770264

REM (10% data)

Asterix: 3912.2522460937203
Breakout: 56.91960
Pong: 9.52690958976
Qbert: 5799.877001953099
Seaquest: 3643.4553710937603

QR-DQN (1% data)

Asterix: 359.78555908202
Breakout: 6.8403110503999995
Pong: -14.56116828918
Qbert: 155.90773391724002
Seaquest: 250.120996093

REM (1% data)

Asterix: 363.27997436524
Breakout: 4.46266336442
Pong: -20.81458015444
Qbert: 160.09661407469997
Seaquest: 370.47359313964
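For anyone who wants to line these numbers up against their own method, here is a minimal Python sketch that does a per-game comparison of the 10% scores posted above. The dictionary and function names (`QR_DQN_10`, `REM_10`, `higher_scoring_method`) are hypothetical and not part of the batch_rl repository; the values are copied as they appear in this thread.

```python
# Scores copied from the "10% data" lists posted in this thread.
# The variable names are illustrative only, not from the batch_rl codebase.
QR_DQN_10 = {
    "Asterix": 1293.8620483398402,
    "Breakout": 61.84913024902001,
    "Pong": 12.650765800479999,
    "Qbert": 9420.50625,
    "Seaquest": 353.07070770264,
}
REM_10 = {
    "Asterix": 3912.2522460937203,
    "Breakout": 56.91960,
    "Pong": 9.52690958976,
    "Qbert": 5799.877001953099,
    "Seaquest": 3643.4553710937603,
}


def higher_scoring_method(scores_a, scores_b, name_a="QR-DQN", name_b="REM"):
    """Return {game: name of the method with the higher posted score}."""
    return {
        game: (name_a if scores_a[game] > scores_b[game] else name_b)
        for game in scores_a
    }


if __name__ == "__main__":
    for game, winner in higher_scoring_method(QR_DQN_10, REM_10).items():
        print(
            f"{game:<9} QR-DQN={QR_DQN_10[game]:10.2f}  "
            f"REM={REM_10[game]:10.2f}  higher: {winner}"
        )
```

Adding a third dictionary with your own method's scores and extending the comparison is straightforward; the same pattern applies to the 1% lists.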

The file below also contains the raw scores for 20% data (2M corresponds to 1%, 20M to 10%, and 40M to 20%).

Attachment: uniform.zip
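The size labels above can be converted to data percentages with a small helper. This is only a sketch: the full-dataset size of 200M is an assumption inferred from "2M corresponds to 1%", and the function name `label_to_percent` is hypothetical. How the labels actually appear inside uniform.zip is not stated in this thread.

```python
# Assumption: the full dataset size is inferred from "2M corresponds to 1%".
FULL_DATASET_SIZE = 200_000_000


def label_to_percent(label: str) -> float:
    """Convert a size label such as '2M' to a percentage of the full dataset."""
    if not label.endswith("M"):
        raise ValueError(f"expected a label like '20M', got {label!r}")
    count = int(label[:-1]) * 1_000_000
    return 100.0 * count / FULL_DATASET_SIZE


if __name__ == "__main__":
    for label in ("2M", "20M", "40M"):
        print(label, "->", label_to_percent(label), "%")
```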