rlworkgroup / garage

A toolkit for reproducible reinforcement learning research.
MIT License

Improve data types #2176

Closed krzentner closed 3 years ago

krzentner commented 3 years ago

This change converts all dtypes to use dataclass, and makes EpisodeBatch a subclass of TimeStepBatch.

This required slightly changing some shapes, to make the types completely consistent.
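
As a rough illustration of the structure described above (not the code from this PR), the sketch below shows an episode batch written as a dataclass that subclasses a time-step batch. The class names `SketchTimeStepBatch` and `SketchEpisodeBatch`, the fields, and the use of plain tuples instead of arrays are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SketchTimeStepBatch:
    """Hypothetical stand-in for a batch of time steps (not garage's class)."""

    observations: Tuple[float, ...]
    actions: Tuple[float, ...]
    rewards: Tuple[float, ...]

    def __post_init__(self):
        # Every per-step field must describe the same number of time steps.
        n_steps = len(self.rewards)
        for name in ('observations', 'actions'):
            if len(getattr(self, name)) != n_steps:
                raise ValueError(f'{name} and rewards differ in length')


@dataclass(frozen=True)
class SketchEpisodeBatch(SketchTimeStepBatch):
    """Hypothetical episode batch: the same step data plus episode lengths."""

    lengths: Tuple[int, ...]

    def __post_init__(self):
        # Reuse the parent's per-step checks, then add the episode-level one.
        super().__post_init__()
        if sum(self.lengths) != len(self.rewards):
            raise ValueError('lengths do not sum to the number of steps')


# Two episodes (2 steps and 1 step) stored as one flat batch of 3 steps.
batch = SketchEpisodeBatch(observations=(0.0, 1.0, 2.0),
                           actions=(1.0, 0.0, 1.0),
                           rewards=(0.5, 0.5, 1.0),
                           lengths=(2, 1))
```

Because the episode batch is a subclass, any code written against the time-step type can accept it unchanged, which is presumably why the shapes had to agree between the two.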

I also merged and re-wrote the checking code. The error messages now have a more consistent pattern of `<thing> has <property> but must have <correct property> to match <ground truth>`.
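
A minimal sketch of a check that follows this message pattern is shown below. The helper name `check_length` and its parameters are hypothetical and chosen only for illustration; the actual checking code in `_dtypes.py` may be organized differently.

```python
def check_length(thing, actual, expected, ground_truth):
    """Raise a ValueError that follows the message pattern quoted above.

    All names here are hypothetical; this is not garage's checking code.
    """
    if actual != expected:
        raise ValueError(
            f'{thing} has length {actual} but must have length {expected} '
            f'to match {ground_truth}')


# Example: three observations were recorded but only two rewards.
try:
    check_length('rewards', actual=2, expected=3, ground_truth='observations')
except ValueError as error:
    message = str(error)
```

Naming the offending value, its actual property, the required property, and the reference it is compared against in one sentence keeps every failure readable without looking at the source.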

All core dtypes (including EnvSpec) have now also been made immutable.
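
For readers unfamiliar with immutable dataclasses, here is a small sketch of what this means in practice, assuming the immutability comes from `dataclass(frozen=True)` (the description above does not say how it is implemented). `SketchEnvSpec` and its fields are hypothetical and are not garage's `EnvSpec`.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class SketchEnvSpec:
    """Hypothetical immutable spec; the fields are illustrative only."""

    observation_dim: int
    action_dim: int
    max_episode_length: int


spec = SketchEnvSpec(observation_dim=4, action_dim=2, max_episode_length=100)

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    spec.max_episode_length = 200
except FrozenInstanceError:
    was_mutated = False
else:
    was_mutated = True

# To "change" a field, build a modified copy and leave the original alone.
longer_spec = replace(spec, max_episode_length=200)
```

Code that previously modified a spec or batch in place would need to switch to creating a copy like this.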

krzentner commented 3 years ago

I had to reorder a lot of things in _dtypes.py, which makes the total diff much more confusing to read. I've split this change into two commits, the first of which just re-orders _dtypes.py. Please use the diff on the second change and the tests to review.

krzentner commented 3 years ago

GitHub makes it somewhat hard to access the per-commit diff; it's located here.

codecov[bot] commented 3 years ago

Codecov Report

Merging #2176 (0a3ee6a) into master (a24fb7f) will decrease coverage by 0.08%. The diff coverage is 96.01%.


@@            Coverage Diff             @@
##           master    #2176      +/-   ##
==========================================
- Coverage   91.38%   91.29%   -0.09%     
==========================================
  Files         198      198              
  Lines       10964    10944      -20     
  Branches     1392     1374      -18     
==========================================
- Hits        10019     9991      -28     
- Misses        688      694       +6     
- Partials      257      259       +2     

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/garage/replay_buffer/path_buffer.py | 96.00% <ø> (ø) | |
| src/garage/np/_functions.py | 77.39% <86.66%> (-0.61%) | :arrow_down: |
| src/garage/_dtypes.py | 96.01% <95.79%> (+0.07%) | :arrow_up: |
| src/garage/_environment.py | 95.45% <100.00%> (-1.99%) | :arrow_down: |
| src/garage/tf/algos/_rl2npo.py | 100.00% <100.00%> (ø) | |
| src/garage/tf/algos/ddpg.py | 97.02% <100.00%> (ø) | |
| src/garage/tf/algos/dqn.py | 91.52% <100.00%> (ø) | |
| src/garage/tf/algos/npo.py | 96.46% <100.00%> (ø) | |
| src/garage/tf/algos/reps.py | 98.47% <100.00%> (+<0.01%) | :arrow_up: |
| src/garage/tf/algos/td3.py | 96.66% <100.00%> (ø) | |
| ... and 6 more | | |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update a24fb7f...0a3ee6a.