Ablations

boardlaw's grown like a ball of mud these past few months, and there are likely complexities that aren't adding much and could be thrown away.

League: this is the top of the list. It's a lot of complexity and I haven't actually seen any conclusive evidence of cyclic behaviour (but maybe that's because I've been using the league?)
LR warmup: right now the LR gets warmed up from zero over the first ~hundred steps to its (very high) value of 1e-2. Is this necessary? Is 1e-2 actually any better than 3e-4?
Replay buffer: the biggest deviation of this implementation from classic AZ is the lack of a replay buffer. I did this based on OA5's appendix on stale samples. Was this a good choice? A replay buffer can be emulated easily enough by upping the buffer size, which ties into hyperparam tuning.
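
For the LR warmup ablation, the two arms could be written down roughly as below. This is a minimal sketch, not boardlaw's actual code: it assumes the warmup is a linear ramp (the description above only says "warmed up from zero"), takes "~hundred steps" as exactly 100, and `warmup_lr` and the arm names are hypothetical.

```python
def warmup_lr(step, peak_lr=1e-2, warmup_steps=100):
    """Linear ramp from zero to peak_lr over warmup_steps, then constant.

    Assumption: the ramp is linear; the real schedule may differ.
    """
    if warmup_steps <= 0:
        # No warmup: use the peak LR from the first step
        return peak_lr
    return peak_lr * min(1.0, step / warmup_steps)


# Hypothetical ablation arms: the current warmup to 1e-2 vs. a flat 3e-4
arms = {
    'warmup-1e-2': dict(peak_lr=1e-2, warmup_steps=100),
    'flat-3e-4': dict(peak_lr=3e-4, warmup_steps=0),
}

for name, kwargs in arms.items():
    # Sample the schedule at a few steps to eyeball the shape
    print(name, [warmup_lr(s, **kwargs) for s in (0, 50, 100, 200)])
```

Splitting it this way would let the warmup question and the 1e-2 vs 3e-4 question be tested separately, by adding a third arm with `peak_lr=1e-2, warmup_steps=0`.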

andyljones / boardlaw