JuliaPOMDP / TabularTDLearning.jl

Julia implementations of temporal difference Reinforcement Learning algorithms like Q-Learning and SARSA

quick fixes #29

Closed WhiffleFish closed 1 year ago

WhiffleFish commented 1 year ago

For benchmarking, solver configurations are the same as in the test files, except that n_episodes = 1_000.

Q learning

Before

BenchmarkTools.Trial: 749 samples with 1 evaluation.
 Range (min … max):  5.293 ms … 10.363 ms  ┊ GC (min … max): 0.00% … 33.24%
 Time  (median):     6.081 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.674 ms ±  1.403 ms  ┊ GC (mean ± σ):  9.94% ± 14.36%

      ▂▄▅▆▇█▅▄                                                
  ▂▃▄▇██████████▇▄▃▃▂▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▃▃▄▄▆▄▅▅▆▄▄▃▄ ▃
  5.29 ms        Histogram: frequency by time        9.95 ms <

 Memory estimate: 12.36 MiB, allocs estimate: 208948.

After

BenchmarkTools.Trial: 1195 samples with 1 evaluation.
 Range (min … max):  3.196 ms … 9.009 ms  ┊ GC (min … max):  0.00% … 57.34%
 Time  (median):     3.647 ms             ┊ GC (median):     0.00%
 Time  (mean ± σ):   4.181 ms ± 1.529 ms  ┊ GC (mean ± σ):  12.63% ± 17.32%

    ▄▆█▇                                                     
  ▃▆████▇▆▃▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▃▃▃▃▄▃▃ ▃
  3.2 ms         Histogram: frequency by time       8.62 ms <

 Memory estimate: 8.24 MiB, allocs estimate: 129217.

SARSA

Before

BenchmarkTools.Trial: 678 samples with 1 evaluation.
 Range (min … max):  5.886 ms … 10.119 ms  ┊ GC (min … max): 0.00% … 22.13%
 Time  (median):     6.815 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.369 ms ±  1.094 ms  ┊ GC (mean ± σ):  9.90% ± 11.79%

        ▁▂█▄▂▇▅▂▃▁                                            
  ▂▂▃▄▆▅████████████▅▅▅▄▄▃▃▂▁▂▁▂▁▁▁▁▂▂▁▂▂▃▃▄▆▆▅█▇█▆▆▇▆▆▆▄▃▃▃ ▄
  5.89 ms        Histogram: frequency by time        9.46 ms <

 Memory estimate: 12.55 MiB, allocs estimate: 220650.

After

BenchmarkTools.Trial: 768 samples with 1 evaluation.
 Range (min … max):  5.121 ms … 11.150 ms  ┊ GC (min … max): 0.00% … 40.97%
 Time  (median):     5.900 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.509 ms ±  1.558 ms  ┊ GC (mean ± σ):  9.74% ± 14.55%

     ▃▆██▇▅▇▂▁                                                
  ▃▄▇██████████▇▄▃▂▃▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▃▃▃▄▅▅▅▅▄▃▃▄ ▃
  5.12 ms        Histogram: frequency by time        10.5 ms <

 Memory estimate: 10.26 MiB, allocs estimate: 160562.

SARSA-lambda

Before

BenchmarkTools.Trial: 66 samples with 1 evaluation.
 Range (min … max):  69.143 ms … 82.642 ms  ┊ GC (min … max): 12.24% … 14.93%
 Time  (median):     76.713 ms              ┊ GC (median):    11.49%
 Time  (mean ± σ):   76.025 ms ±  2.978 ms  ┊ GC (mean ± σ):  11.74% ±  1.15%

                              ▅       █  ▅▅   ▂                
  ▅▅▁█▅▅▁▁█▁▁▁▁▁▁▅▁▁▁▅█▅▅▁█▅▅██▁▅▁██▁██▅███▅▅████▅█▅▁▅▁▁▅▁▁▁▅ ▁
  69.1 ms         Histogram: frequency by time        81.4 ms <

 Memory estimate: 160.64 MiB, allocs estimate: 326811.

After

BenchmarkTools.Trial: 86 samples with 1 evaluation.
 Range (min … max):  51.804 ms … 72.022 ms  ┊ GC (min … max): 13.76% … 18.35%
 Time  (median):     58.443 ms              ┊ GC (median):    12.77%
 Time  (mean ± σ):   58.534 ms ±  3.388 ms  ┊ GC (mean ± σ):  14.89% ±  2.56%

                 █▅ ▂   ▂ ▂▂ ▂  ▂▂▂  ▂▂   ▂                    
  ▅▅▁█▁▁▅▁▁▅▁▁▁█▅██▅█▅▅▅█▁████▁▅███▅▁██████▅▅▁▅█▅█▁██▅▅▅▅█▅▁▅ ▁
  51.8 ms         Histogram: frequency by time        64.4 ms <

 Memory estimate: 157.03 MiB, allocs estimate: 263353.
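
To compare the runs at a glance, the medians and memory estimates reported above can be reduced to speedup and memory-reduction figures. The following is a minimal sketch in plain Julia; the `results` table and the `speedup` and `reduction_pct` helpers are illustrative names that are not part of the PR or the package, and the numbers are copied from the benchmark output above.

```julia
# Median times (ms) and memory estimates (MiB) copied from the benchmark output above.
# `results`, `speedup`, and `reduction_pct` are illustrative names, not package API.
results = [
    (name = "Q learning",   t_before = 6.081,  t_after = 3.647,  m_before = 12.36,  m_after = 8.24),
    (name = "SARSA",        t_before = 6.815,  t_after = 5.900,  m_before = 12.55,  m_after = 10.26),
    (name = "SARSA-lambda", t_before = 76.713, t_after = 58.443, m_before = 160.64, m_after = 157.03),
]

# Ratio of the old median time to the new one (values above 1 mean faster).
speedup(before, after) = before / after

# Relative reduction in percent (positive values mean less memory).
reduction_pct(before, after) = 100 * (before - after) / before

for r in results
    s = round(speedup(r.t_before, r.t_after); digits = 2)
    m = round(reduction_pct(r.m_before, r.m_after); digits = 1)
    println(rpad(r.name, 14), "median speedup ≈ ", s, "x, memory reduction ≈ ", m, "%")
end
```

With these inputs the median speedups come out to roughly 1.67x for Q learning, 1.16x for SARSA, and 1.31x for SARSA-lambda. Medians are used rather than means because the histograms above show a second cluster of slow, GC-affected samples that pulls the means upward.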

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +2.66 :tada:

Comparison is base (798f35a) 97.33% compared to head (7809bcf) 100.00%.

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           master       #29      +/-   ##
===========================================
+ Coverage   97.33%   100.00%   +2.66%
===========================================
  Files           3         4       +1
  Lines         150       152       +2
===========================================
+ Hits          146       152       +6
+ Misses          4         0       -4
```

| [Impacted Files](https://app.codecov.io/gh/JuliaPOMDP/TabularTDLearning.jl/pull/29?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaPOMDP) | Coverage Δ | |
|---|---|---|
| [src/TabularTDLearning.jl](https://app.codecov.io/gh/JuliaPOMDP/TabularTDLearning.jl/pull/29?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaPOMDP#diff-c3JjL1RhYnVsYXJURExlYXJuaW5nLmps) | `100.00% <ø> (ø)` | |
| [src/q\_learn.jl](https://app.codecov.io/gh/JuliaPOMDP/TabularTDLearning.jl/pull/29?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaPOMDP#diff-c3JjL3FfbGVhcm4uamw=) | `100.00% <100.00%> (+2.22%)` | :arrow_up: |
| [src/sarsa.jl](https://app.codecov.io/gh/JuliaPOMDP/TabularTDLearning.jl/pull/29?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaPOMDP#diff-c3JjL3NhcnNhLmps) | `100.00% <100.00%> (+2.12%)` | :arrow_up: |
| [src/sarsa\_lambda.jl](https://app.codecov.io/gh/JuliaPOMDP/TabularTDLearning.jl/pull/29?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaPOMDP#diff-c3JjL3NhcnNhX2xhbWJkYS5qbA==) | `100.00% <100.00%> (+3.44%)` | :arrow_up: |


zsunberg commented 1 year ago

I hope this was ready to merge