Closed adamsjohanna closed 1 year ago
You are right: learning currently supports only a single global buffer and a single unit operator.
That one unit operator can still have several different learning units.
It might be useful to create a dict with one buffer per unit operator that has learning units. I am not sure whether the learning would work with that, though, as I am not an expert on this :)
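To make the dict idea concrete, here is a minimal sketch of one buffer per unit operator. Everything in it is an assumption for illustration: `SimpleBuffer`, `add_transition`, the operator ids and the dimensions are made-up names and values, not the project's actual `buffer.py` API.

```python
class SimpleBuffer:
    """Hypothetical stand-in for a replay buffer; not the real buffer.py class."""

    def __init__(self, n_units, obs_dim, act_dim):
        self.n_units = n_units
        self.obs_dim = obs_dim
        self.act_dim = act_dim
        self.transitions = []

    def add(self, obs, actions, rewards):
        # Reject data whose dimensions do not match this operator's learning units.
        if len(obs) != self.n_units or any(len(o) != self.obs_dim for o in obs):
            raise ValueError("observation dimensions do not match this buffer")
        if len(actions) != self.n_units or any(len(a) != self.act_dim for a in actions):
            raise ValueError("action dimensions do not match this buffer")
        if len(rewards) != self.n_units:
            raise ValueError("reward dimensions do not match this buffer")
        self.transitions.append((obs, actions, rewards))

    def __len__(self):
        return len(self.transitions)


# One buffer per unit operator, keyed by a (hypothetical) operator id.
buffers = {
    "operator_a": SimpleBuffer(n_units=2, obs_dim=3, act_dim=1),
    "operator_b": SimpleBuffer(n_units=3, obs_dim=3, act_dim=1),
}


def add_transition(operator_id, obs, actions, rewards):
    """Route one transition to the buffer of the operator that produced it."""
    buffers[operator_id].add(obs, actions, rewards)


# Each operator stores data with its own number of learning units.
add_transition("operator_a", [[0.0] * 3] * 2, [[0.1]] * 2, [0.5, 1.0])
add_transition("operator_b", [[0.0] * 3] * 3, [[0.2]] * 3, [0.0, 0.0, 0.0])
```

This only addresses the storage side. Whether training can sample consistently from several separate buffers is the open question mentioned above.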
Hi,
I don't really understand how the buffer is supposed to work. We only have one buffer for all learning units, right? But when I debug this part with several learning units that belong to different unit operators, the dimensions of the observations, actions and rewards passed to add() in buffer.py only cover one unit operator. I think we are missing a function that collects them over all unit_operators?
Could someone clarify this? Am I supposed to use only one unit operator for all learning units for now?
Thank you :-)