Move backwards step into model

fgnt / padertorch

A collection of common functionality to simplify the design, training and evaluation of machine learning models based on pytorch with an emphasis on speech processing.

MIT License

However, we have to consider the implications for the post_step hook, which is currently called after train_step but before the backward step.

This ordering is currently used to reduce memory consumption: once post_step has run, the input and the review can be deleted.
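
To make the ordering concrete, here is a minimal sketch in plain Python. It is not padertorch's trainer code: `ToyLoss`, `train_step`, `post_step` and `run_iteration` are hypothetical stand-ins that only record the order in which things happen.

```python
class ToyLoss:
    """Hypothetical stand-in for a loss tensor; it only records the call."""

    def __init__(self, log):
        self.log = log

    def backward(self):
        self.log.append("backward")


def train_step(example, log):
    # Hypothetical train step: returns a review that contains the loss.
    log.append("train_step")
    return {"loss": ToyLoss(log), "scalars": {"n": len(example)}}


def post_step(example, review, log):
    # Hypothetical hook: it may still read the input and the full review.
    log.append("post_step")


def run_iteration(example, log):
    review = train_step(example, log)
    loss = review["loss"]
    post_step(example, review, log)
    # Drop the local references to the input and the review;
    # only the loss is still needed for the backward step.
    del example, review
    loss.backward()
    return log


events = run_iteration([1, 2, 3], [])
print(events)  # ['train_step', 'post_step', 'backward']
```

If backward moved into the model, it would run inside `train_step`, so `post_step` would fire after backward rather than before it.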

Another point to consider is that the multi-GPU code would have to change. I don't know whether calling backward in a thread is allowed or recommended in PyTorch.

I would say we implement it once we see a demand for it.

A possible workaround (for those that need it now):

Use a single GPU.
Call backward inside the model.
Return the last loss, and let the trainer call backward on that loss.

The code wouldn't be pretty, but it should do the job.
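
A rough sketch of that workaround in plain PyTorch, on a single device. It does not use the padertorch trainer; `ChunkedModel`, its `train_step` signature and the chunked input are assumptions made up for illustration, and the last three lines stand in for what a trainer would do.

```python
import torch
from torch import nn


class ChunkedModel(nn.Module):
    """Hypothetical model that backpropagates chunk by chunk."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)

    def forward(self, x):
        return self.linear(x)

    def train_step(self, chunks, targets):
        last_loss = None
        for i, (x, y) in enumerate(zip(chunks, targets)):
            loss = nn.functional.mse_loss(self(x), y)
            if i < len(chunks) - 1:
                # Backward inside the model: gradients accumulate in .grad
                # and the graph of this chunk can be freed right away.
                loss.backward()
            else:
                # The last loss is returned so the caller can call backward.
                last_loss = loss
        return last_loss


torch.manual_seed(0)
model = ChunkedModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
chunks = [torch.randn(8, 4) for _ in range(3)]
targets = [torch.randn(8, 1) for _ in range(3)]

# Stand-in for the trainer: it only sees the last loss.
optimizer.zero_grad()
loss = model.train_step(chunks, targets)
loss.backward()
optimizer.step()
```

Because gradients accumulate across the `backward` calls, the optimizer step uses the summed gradient of all chunks, while the loss seen by the caller reflects only the last chunk.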

Move backwards step into model #64