nnaisense / evotorch

Advanced evolutionary computation library built directly on top of PyTorch, created at NNAISENSE.
https://evotorch.ai
Apache License 2.0

Further batching #79

Closed Zartris closed 11 months ago

Zartris commented 1 year ago

Hi, I'm new to this library, so the answer might be hidden very well in the docs. Nevertheless: I am training on a custom gym environment, but for the sake of reproducibility let's just take the MPC example given in the docs: https://docs.evotorch.ai/v0.4.1/examples/notebooks/reacher_mpc/

Gym allows us to run multiple environments at the same time while everything is batched. This adds an extra dimension, world_dim, to the observations and actions. However, I cannot find a way for EvoTorch to allow an extra batch dimension, and I am stuck with (popsize, solution_size) for solutions and (popsize, 1) for evaluations. Is there a way to expand these to (popsize, world_dim, solution_size) and (popsize, world_dim, 1)?

engintoklu commented 1 year ago

Hello Zartris!

Sorry for my delayed reply, and thank you for using EvoTorch!

In EvoTorch, continuous optimization algorithms like CEM, PGPE, XNES etc. are implemented to work with problem objects with numeric dtypes (e.g. with dtype=torch.float32). A Problem object with such a numeric dtype assumes that each solution is represented by a 1-dimensional vector. Therefore, the tensor contained by a SolutionBatch is strictly shaped (number_of_solutions, solution_length), without a built-in functionality for extra batching.
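To illustrate that layout with some hypothetical sizes (not taken from the original post): because each solution must be a flat 1-D vector, any per-world structure has to be folded into the single solution_length axis before it reaches the Problem object.

```python
# Hypothetical sizes, chosen only for illustration.
popsize = 4      # number of solutions in the population
world_dim = 3    # number of parallel simulator worlds
plan_len = 5     # decision variables per world

# Each solution is a flat 1-D vector, so the per-world structure
# is folded into a single solution_length:
solution_length = world_dim * plan_len

# The tensor held by a SolutionBatch is then strictly 2-D:
batch_shape = (popsize, solution_length)
print(batch_shape)  # prints (4, 15)
```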

That said, I wonder if it would be a valid solution in your case to just take the batch of flat solution vectors, and unflatten them within the fitness function to satisfy the batched interface of your internally stored simulator, like this:

from evotorch import Problem, SolutionBatch

class BatchedPlanningProblem(Problem):
    ...

    def _evaluate_batch(self, batch: SolutionBatch):
        # `batch.values` is shaped (num_solutions, solution_length);
        # unflatten it into whatever shape the simulator expects:
        batched_plan = batch.values.reshape(
            batched_shape_expected_by_the_simulator
        )

        # Interaction with the batched simulator goes here
        ...

        # If `fitnesses` is the tensor of scores returned by the
        # simulator, it can be flattened back to one value per
        # solution and registered into the batch now:
        batch.set_evals(fitnesses.reshape(len(batch)))

Is this a relevant solution for your case?
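To make the reshape idea concrete without pulling in a simulator, here is a dependency-free sketch of the same round trip (all names and sizes are hypothetical): each flat solution vector is unflattened into per-world plans, a stand-in "simulator" scores every world, and the per-world scores are reduced back to one fitness per solution.

```python
popsize, world_dim, plan_len = 2, 3, 4
solution_length = world_dim * plan_len

# Flat solution vectors, one row per solution
# (a stand-in for `batch.values`).
flat = [[float(i) for i in range(solution_length)] for _ in range(popsize)]

def unflatten(row):
    # (solution_length,) -> (world_dim, plan_len)
    return [row[w * plan_len:(w + 1) * plan_len] for w in range(world_dim)]

def fake_simulator(plans):
    # Stand-in for the batched simulator: score each world's plan,
    # here simply by summing its entries.
    return [sum(plan) for plan in plans]

fitnesses = []
for row in flat:
    per_world = fake_simulator(unflatten(row))
    # Reduce the per-world scores to a single fitness per solution,
    # e.g. by averaging (the choice of reduction is up to you).
    fitnesses.append(sum(per_world) / world_dim)

print(fitnesses)  # prints [22.0, 22.0]
```

In PyTorch, the same two steps collapse to `batch.values.reshape(popsize, world_dim, plan_len)` on the way in and a reduction such as `.mean(dim=1)` on the way back, before calling `batch.set_evals(...)`.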

Outside the context of MPC, if you wish to perform direct policy search on a vectorized gym environment, perhaps you could use VecGymNE?

Feel free to let me know if you have further questions.

Happy coding!

Higgcz commented 11 months ago

Closing due to inactivity, but feel free to reopen if you would like to discuss further.