SamTov closed this 1 year ago.
Very nice and clean implementation of your optimizer.
I have one bigger comment about how it rescales. The rest are just minor things and suggestions.
I responded to that one in some detail, but in general this is an open research question. All learning rates ignore batch size and this one does the same; we only notice it here because we now care about the batch in the training dynamics. However, I can add an argument that allows scaling by the batch size (or anything else) so that we can experiment with this very easily.
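To make the suggestion concrete, here is a minimal sketch of what such an argument could look like. This is not the code in this PR: the function name `trace_normalized_lr`, the `scale_by_batch` flag, and the exact scaling rule are all illustrative assumptions.

```python
import numpy as np


def trace_normalized_lr(base_lr: float, ntk: np.ndarray,
                        scale_by_batch: bool = False,
                        batch_size: int = 1) -> float:
    """Rescale a base learning rate by the trace of the NTK.

    `scale_by_batch` is the hypothetical argument discussed above: when
    enabled, the rescaling also accounts for the batch size so that the
    effective step no longer silently depends on how the batch is chosen.
    """
    lr = base_lr / np.trace(ntk)  # trace normalization of the step size
    if scale_by_batch:
        lr *= batch_size          # optional, experimental batch-size scaling
    return lr
```

With a flag like this, the default behaviour stays identical to the current implementation and the batch-size experiments become a one-line change in the call site.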
I have wired in the trace-normalized optimizer so that it can be called directly and we don't need to keep writing custom train loops. A test for it is included as well.
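For illustration only, a minimal sketch of how a directly callable optimizer plus its test might look; the class name `TraceOptimizer`, its `apply` signature, and the test values are assumptions, not the code added in this PR.

```python
import numpy as np


class TraceOptimizer:
    """Hypothetical stand-in for a directly callable trace-normalized optimizer."""

    def __init__(self, base_lr: float = 1.0):
        self.base_lr = base_lr

    def apply(self, params: np.ndarray, grads: np.ndarray, ntk: np.ndarray) -> np.ndarray:
        # Single update step with the trace-normalized learning rate.
        lr = self.base_lr / np.trace(ntk)
        return params - lr * grads


def test_trace_optimizer_step():
    params = np.ones(3)
    grads = np.full(3, 0.5)
    ntk = np.eye(3) * 2.0  # trace = 6
    updated = TraceOptimizer(base_lr=6.0).apply(params, grads, ntk)
    # lr = 6 / 6 = 1, so each parameter decreases by exactly 0.5.
    np.testing.assert_allclose(updated, np.full(3, 0.5))
```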