Closed jmacglashan closed 3 years ago
Hey @jmacglashan thanks for the detailed feature request! Just to make sure I understand correctly, would these two be functionally equivalent?
Using tuple indices:
trajectory = {
    "observation": trajectory_writer.history["observation"][(-n-1, -1)],
    "action": trajectory_writer.history["action"][-n],
    ...
}
Without tuple indices. I haven't tested this out, so there is likely a bug somewhere ;):
trajectory = {
    "observation": reverb.TrajectoryColumn([
        list(trajectory_writer.history["observation"][-n-1])[0],
        list(trajectory_writer.history["observation"][-1])[0]
    ]),
    "action": trajectory_writer.history["action"][-n],
    ...
}
I agree that the second option is quite verbose. I think introducing this indexing should be fine, especially since users will be familiar with it [from numpy](https://numpy.org/doc/stable/reference/arrays.indexing.html#integer-array-indexing)*. I'll discuss this with the rest of the Reverb team and get back to you :)
* I was surprised to find that np.arange(10)[(0, 1)] doesn't work while np.arange(10)[[0, 1]] does. It looks like numpy treats tuples differently from lists for indexing. I wonder if it's worth only accepting lists for consistency?
Thanks! I think you're right that the second way you posted should be semantically the same. I didn't think of that one because I missed that the iterator of _ColumnHistory yields pybind.WeakCellRef, which is different from __getitem__, which returns TrajectoryColumn (I assumed they were the same). But yeah, I do think it would be a nice QoL improvement to have the multiple index support all the same.
For numpy you're right that the first case fails because it first interprets a tuple as an index for multiple dimensions. You can still use a tuple for multiple indices in one dimension in numpy, but you have to nest it in another 1-d tuple so that it knows you're not defining multiple dimension indices. E.g., this works:
np.arange(10)[((0, 1),)]
However, that is ugly :p and I would be perfectly happy with supporting multiple indices via a list in Reverb.
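For reference, the three numpy indexing forms discussed above can be compared side by side. This is only a small sketch of numpy's behaviour on a 1-D array; the variable names are made up for illustration and nothing here touches Reverb.

```python
import numpy as np

a = np.arange(10)

# A list is treated as an integer-array index along the first axis.
from_list = a[[0, 1]]

# A tuple nested inside a 1-tuple selects the same elements: the outer
# tuple enumerates dimensions, the inner one lists indices for axis 0.
from_nested_tuple = a[((0, 1),)]

# A bare tuple is read as one index per dimension, so it fails on a 1-D array.
try:
    a[(0, 1)]
    bare_tuple_error = None
except IndexError as err:
    bare_tuple_error = err
```

The list form is the least surprising of the three, which is the argument for accepting lists rather than tuples.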
I think this is an excellent idea! Thanks a lot @jmacglashan for the suggestion and the detailed motivation.
I'm sending out a CL to add this feature.
Change is in and this will be included in the next release. Thanks again for the suggestion!
When creating items, you can index into the TrajectoryWriter.history with an int, or a python slice object and have it construct for you a corresponding TrajectoryColumn. However, it would also be nice if we could index into it with a tuple of ints, much like pulling out indices of a np array or TF tensor, which would allow you to select multiple non-contiguous indices.
Motivation
The main motivation for this is for cases where you want to do things like N-step transitions. As things are, there are three approaches you can take in Reverb.
TrajectoryColumn objects around them.
The problem with approach 1 is that it doubles the memory used for observations, because you write the same observation twice: once when it's the ending observation of a transition and again when it's the beginning observation. This can be bad since observations are typically the most memory-expensive field.
The second approach only writes the observation once and instead uses pointers in the table items to the right place, but has the problem that when you read data on a trainer through a dataset, you're always reading N+1 observations when you only need two. This turns out to be a non-trivial sampling performance hit, especially since observations are once again the biggest elements.
The third approach does exactly what we want, but the ergonomics aren't great because it means the user has to do all the history bookkeeping themselves, when the nice thing about the TrajectoryWriter is that it's already doing that for you in a nice interface.
Proposed feature
To better support this use case, it would be nice if we could slice a TrajectoryWriter history with a tuple of ints to get/construct the right TrajectoryColumn.
That is, you could create your TrajectoryColumns from your history with a tuple slice for the observation.
Right now, I've written code that basically supports this, but it requires me to use the private class _ColumnHistory and its properties, which seems bad because, being private, later versions of Reverb might change how it's organized. I believe the getter for this class, however, can be easily extended to support this with the following change, which then means I don't need external code that mucks about with private classes.
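To make the requested indexing behaviour concrete, here is a minimal pure-Python sketch. It is not the change proposed for Reverb and does not use Reverb's classes: ToyColumnHistory is a hypothetical stand-in that stores plain values in a list, and it accepts a list of ints (rather than a tuple) for multi-index selection, following the numpy discussion in the comments.

```python
from typing import List, Union

Index = Union[int, slice, List[int]]


class ToyColumnHistory:
    """Hypothetical stand-in for a column history; not Reverb's _ColumnHistory."""

    def __init__(self) -> None:
        self._steps: List[object] = []

    def append(self, value: object) -> None:
        self._steps.append(value)

    def __getitem__(self, index: Index) -> List[object]:
        # Every index type returns a list, standing in for a "column".
        if isinstance(index, int):
            return [self._steps[index]]
        if isinstance(index, slice):
            return self._steps[index]
        if isinstance(index, list):
            # Multi-index: pick out possibly non-contiguous steps.
            return [self._steps[i] for i in index]
        raise TypeError(f"Unsupported index type: {type(index).__name__}")


history = ToyColumnHistory()
for t in range(5):
    history.append(f"obs{t}")

n = 3
# N-step transition: only the first and last observation are referenced.
endpoints = history[[-n - 1, -1]]
# Slice-based alternative: references all N+1 observations in the window.
window = history[-n - 1:]
```

With five recorded steps and n = 3, the list index picks out two observations, while the slice covers four, which is the read overhead the multi-index form is meant to avoid.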