Open GiggleLiu opened 4 years ago
[ ] As the paper states at the end of page 2, operations on floating-point numbers cannot be reversed rigorously. However, the implications of this are not discussed further. It would be good to have some reassurance that the resulting errors are not important, or a discussion of which applications are sufficiently tolerant of them.
[x] Since automatic differentiation in this language does not require checkpointing, it would seem to me that the language can enable significantly lower memory usage than non-reversible languages. So I was surprised that there are no benchmarks of memory usage, or any showing that the method can operate with larger datasets/parameters on a fixed memory budget.
[ ] ICLR is about machine learning, but there is no evaluation of machine learning workloads. For example, I think the ICLR audience would be quite interested in how such a system might enable very deep neural networks.
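
For reference on the first item, here is a minimal Python sketch of why floating-point addition cannot in general be undone exactly by the matching subtraction. It is not taken from the paper: the helper `add_then_undo` and the input values are hypothetical, chosen only to make the rounding visible.

```python
def add_then_undo(x: float, y: float) -> float:
    """Apply x += y, then the would-be inverse x -= y (hypothetical helper)."""
    x = x + y  # forward step: the sum is rounded to the nearest double
    x = x - y  # backward step: cannot restore bits lost to that rounding
    return x


# Illustrative values: `small` is far below the spacing of doubles near 1.0,
# so it is absorbed entirely when added to `big`.
small, big = 1e-20, 1.0
recovered = add_then_undo(small, big)
roundtrip_error = abs(recovered - small)

# When the sum is exactly representable, the round trip is exact.
exact_case = add_then_undo(0.5, 0.25)

print(recovered == small)   # False: the original value is not recovered
print(exact_case == 0.5)    # True: no rounding occurred in this case
```

Whether errors of this kind matter presumably depends on how they accumulate over a long reversed computation, which is the discussion the first item asks for.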