Closed: ArnNag closed this 2 months ago
…with Layer Norm not working on half-precision floats and with memory profiling, but I think this is because I'm running on CPU.
Accidentally opened this PR on uw-ipd rather than on my own fork.