google-deepmind / reverb

Reverb is an efficient and easy-to-use data storage and transport system designed for machine learning research
Apache License 2.0

Support for varying in length Observations and Actions #47

Closed MikeJanks closed 3 years ago

MikeJanks commented 3 years ago

Add support for dynamic observations and actions. I am working on RL algorithms that can add new actions and allow for multiple observations. I would really appreciate it if Reverb could support this.

acassirer commented 3 years ago

Hey,

The shapes of data in the same column have to stay the same for Reverb to be able to concatenate data into chunks. This is fundamental to the design and it is unlikely to change for the foreseeable future.

That being said, I'm pretty sure you can achieve what you are trying to do with a little bit of hackery. As noted above, the shapes within a single column have to remain constant, but the number of columns can change. You could therefore do something along these lines:

import reverb
import numpy as np

server = reverb.Server([reverb.Table.queue('queue', 100)])
client = reverb.Client(f'localhost:{server.port}')

actions = [
    np.ones([1]),
    np.ones([2, 2]),
    np.ones([3, 3, 3]),
]
observations = [
    np.ones([3]),
    np.ones([4, 4]),
    np.ones([3]),
]

# Note this only works if the sequence length is 1, as you would not be able
# to concatenate tensors of different shapes anyway.
with client.trajectory_writer(1) as writer:
  for action, observation in zip(actions, observations):
    writer.append({
        'action': {str(action.shape): action},
        'observation': {str(observation.shape): observation},
    })
    trajectory = {
        'action': writer.history['action'][str(action.shape)][-1],
        'observation': writer.history['observation'][str(observation.shape)][-1],
    }
    writer.create_item('queue', 1.0, trajectory)

for sample in client.sample('queue', 3):
  data = {
      'action': sample[0].data[0],
      'observation': sample[0].data[1],
  }
  print(data)

Doing the same efficiently with a dataset is a bit trickier since the dataset requires shapes to be specified up front. One idea would be to flatten and pad the data before inserting it alongside the actual shape, and then use that shape to unpack, slice and reshape the data on the receiver side.
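To make the flatten-and-pad idea concrete, here is a minimal numpy-only sketch (no Reverb calls). The `MAX_SIZE` and `MAX_RANK` bounds and the `pack`/`unpack` helpers are assumptions for illustration: every column must have a constant shape, so both the flattened payload and the shape vector are padded to fixed sizes.

```python
import numpy as np

# Assumed fixed bounds -- both the flattened payload and the shape vector
# must have constant shapes to live in a Reverb column.
MAX_SIZE = 64   # upper bound on the flattened element count
MAX_RANK = 4    # upper bound on the tensor rank

def pack(tensor):
  """Flatten and zero-pad `tensor`; also return a fixed-length shape vector."""
  flat = tensor.reshape(-1)
  padded = np.zeros([MAX_SIZE], dtype=tensor.dtype)
  padded[:flat.size] = flat
  shape = np.full([MAX_RANK], -1, dtype=np.int64)  # -1 marks unused dims
  shape[:tensor.ndim] = tensor.shape
  return padded, shape

def unpack(padded, shape):
  """Invert `pack`: drop the padding and restore the original shape."""
  real_shape = shape[shape >= 0]
  size = int(np.prod(real_shape))
  return padded[:size].reshape(real_shape)

obs = np.arange(12, dtype=np.float32).reshape(3, 4)
padded, shape = pack(obs)
restored = unpack(padded, shape)  # recovers the original 3x4 array
```

The writer would insert `padded` and `shape` as two ordinary fixed-shape columns, and the receiver would run `unpack` on each sampled pair.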

ebrevdo commented 3 years ago

@acassirer Datasets do not require shapes to be specified up front. Do you mean in the dataset signatures? There you can provide TensorShapes that have None for some dimensions, corresponding to the fact that the dataset may emit tensors of unknown shape in the given dimensions.
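As a small illustration of the point about `None` dimensions (plain `tf.data`, not Reverb-specific): a `tf.TensorSpec` with `shape=[None]` accepts elements whose leading dimension varies from step to step.

```python
import tensorflow as tf

# A generator that emits 1-D tensors of varying length.
def gen():
  yield tf.constant([1.0])
  yield tf.constant([1.0, 2.0, 3.0])

# shape=[None] leaves the leading dimension unknown, so tensors of
# different lengths are all valid under this signature.
ds = tf.data.Dataset.from_generator(
    gen, output_signature=tf.TensorSpec(shape=[None], dtype=tf.float32))

for x in ds:
  print(x.shape)  # (1,) then (3,)
```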

ebrevdo commented 3 years ago

@acassirer we could additionally consider support for RaggedTensors if we get enough requests for it. The logic for concatenation in tensor-space isn't pretty, so we'd have to work around this in C++-land.

Anyway, your best bet right now is to pick a maximum shape and pad/crop to it. You can use a CSR representation (a fixed maximum number of entries, storing the values, column indices and row offsets as 3 columns).
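A numpy-only sketch of that CSR idea, under two assumed bounds: a fixed row count per step (so the offsets column has a constant length) and a hypothetical `MAX_NNZ` cap on non-zero entries (so the values and column-index columns can be padded to a constant shape).

```python
import numpy as np

MAX_NNZ = 16  # assumed cap on non-zero entries per step

def to_csr(matrix):
  """Encode a 2-D array as three fixed-shape columns: values, cols, offsets."""
  rows, cols = np.nonzero(matrix)  # row-major order, so `rows` is sorted
  values = matrix[rows, cols]
  # row_offsets[i] is where row i's entries start in values/cols.
  row_offsets = np.searchsorted(rows, np.arange(matrix.shape[0] + 1))
  # Pad values/cols to MAX_NNZ so every column keeps a constant shape.
  pad = MAX_NNZ - values.size
  return (np.pad(values, (0, pad)),
          np.pad(cols, (0, pad)).astype(np.int64),
          row_offsets.astype(np.int64))

def from_csr(values, cols, row_offsets, shape):
  """Rebuild the dense array on the receiver side."""
  out = np.zeros(shape, dtype=values.dtype)
  for i in range(shape[0]):
    start, end = row_offsets[i], row_offsets[i + 1]
    out[i, cols[start:end]] = values[start:end]
  return out

dense = np.array([[0., 1., 0.], [2., 0., 3.], [0., 0., 0.]])
values, cols, offsets = to_csr(dense)
restored = from_csr(values, cols, offsets, dense.shape)  # round-trips exactly
```

Each of the three arrays has a constant shape, so they can be stored as ordinary Reverb columns.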

acassirer commented 3 years ago

@ebrevdo Yes that is exactly what I meant. Thanks for correcting me.

I suspect that we might want to consider adding support for RaggedTensors, but let's leave that for the future for now.

fastturtle commented 3 years ago

Closing this issue. If you're interested in RaggedTensor support please raise a separate issue for that.