lifelong-learning-systems / rlblocks

Reinforcement Learning Blocks for Researchers
MIT License

Transition components named differently compared to tella #37

Closed by ginoperrotta 2 years ago

ginoperrotta commented 2 years ago

In tella, we used "observation" as the first component of a transition, but in rlblocks it is called "state." Either name is fine, but the rlblocks type hints are currently incompatible with the methods that actually handle experiments run through tella. (See https://github.com/lifelong-learning-systems/rlblocks/blob/a4ee91404f8480b9be27bf71114bbc0118d57f15/rlblocks/replay.py#L153)
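
To illustrate the kind of mismatch, here is a minimal sketch. The two transition classes and the `first_component` helper are hypothetical stand-ins, not the actual tella or rlblocks definitions; the field names other than "state" and "observation" are assumptions made for the example.

```python
from typing import Any, NamedTuple


class StateTransition(NamedTuple):
    # Hypothetical transition whose first field is named "state"
    state: Any
    action: Any
    reward: float
    done: bool
    next_state: Any


class ObservationTransition(NamedTuple):
    # Hypothetical transition whose first field is named "observation"
    observation: Any
    action: Any
    reward: float
    done: bool
    next_observation: Any


def first_component(transition: ObservationTransition) -> Any:
    # Written against the "observation" naming
    return transition.observation


state_style = StateTransition(
    state=[0.0], action=1, reward=1.0, done=False, next_state=[0.1]
)
observation_style = ObservationTransition(
    observation=[0.0], action=1, reward=1.0, done=False, next_observation=[0.1]
)

# A static type checker would flag passing state_style here; at runtime the
# attribute lookup fails because the field is called "state".
try:
    first_component(state_style)
    mismatch = False
except AttributeError:
    mismatch = True
```

The data is identical in both cases; only the field names differ, which is enough to break both the type hints and any attribute access.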

cash commented 2 years ago

gym uses observation, right? Seems like we should use that in rlblocks. @coreylowman

coreylowman commented 2 years ago

Yeah let's move to observation. Wish gym used state, because all the literature uses state lol.

ginoperrotta commented 2 years ago

I'd guess gym aims to be used for non-MDP environments also. Observation is correct in partially-observable cases where state would not be. It would not be that confusing to have used state anyways, but I can see why observation was selected.
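
A small sketch of the distinction, assuming a hypothetical environment whose full state is a position and a velocity but whose agent only sees the position (`HiddenState` and `observe` are illustrative names, not gym API):

```python
from typing import NamedTuple


class HiddenState(NamedTuple):
    # Full environment state; the agent never sees velocity directly
    position: float
    velocity: float


def observe(state: HiddenState) -> float:
    # The observation exposes only part of the state
    return state.position


moving_right = HiddenState(position=1.0, velocity=0.5)
moving_left = HiddenState(position=1.0, velocity=-0.5)

# Two different states produce the same observation, so calling the
# agent's input "state" would be inaccurate in this setting.
same_observation = observe(moving_right) == observe(moving_left)
different_state = moving_right != moving_left
```

In a fully observable MDP, `observe` would return the whole state and the two terms would coincide.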

ginoperrotta commented 2 years ago

I'm preparing a pull request for this and found that TorchBatch is defined in both replay and torch_blocks. Any preference on which I keep (or move it to base to match the import patterns used so far)?

coreylowman commented 2 years ago

Either is fine, I lean slightly more towards replay since that's where they are generated?