ReaLLMASIC / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Add polymax with relu2 forward pass (PolymaxQuan) #159

Closed · gkielian closed this 3 months ago

gkielian commented 4 months ago

While this is not yet a full quantization, we are taking it in steps: first testing whether we can simply replace polymax with relu2 in the forward pass. The results below suggest that, at least with post-norm, the effect of the swap is small (in fact, preliminary results show a slight improvement):

This is with post-norm; pre-norm results should follow in a few minutes:

[image: post-norm loss curves comparing polymax with the relu2 forward pass]
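
For reference, here is a minimal sketch of what the relu2 forward-pass swap looks like; the class and attribute names below are illustrative, not necessarily the repo's exact implementation:

```python
# Minimal sketch (illustrative names, not necessarily the repo's exact
# classes) of a relu^2 activation used in place of polymax in the forward pass.
import torch
import torch.nn as nn

class ReLUSquared(nn.Module):
    """relu(x)**2: outputs are >= 0, so there is no left tail to quantize."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x) ** 2

# In the MLP block, the activation would simply be swapped, e.g.:
#   self.act = ReLUSquared()   # instead of the polymax activation
#   ...
#   x = self.c_proj(self.act(self.c_fc(x)))
```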

Polymax has a tail on the left which is difficult to handle when quantizing aggressively: very high precision is needed there, so it is hard to capture in int8, or in integer formats generally.
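
To illustrate the tail issue, here is a toy example of symmetric int8 fake quantization (the helper and the activation values are made up for illustration, not taken from the repo):

```python
# With a symmetric int8 scheme, the scale is set by the largest magnitude,
# so a small negative tail gets only a handful of quantization levels.
import torch

def fake_quant_int8(x: torch.Tensor) -> torch.Tensor:
    scale = x.abs().max() / 127.0
    q = torch.clamp(torch.round(x / scale), -128, 127)
    return q * scale

acts = torch.tensor([6.0, 2.5, 0.1, -0.02, -0.05])  # polymax-like: small left tail
print(fake_quant_int8(acts))
# The -0.02 value collapses to zero and -0.05 lands on the coarse -1 level,
# whereas relu^2 outputs are >= 0 and need no negative range at all.
```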

We'll have to continue testing on different types of datasets, but this is a strong indicator that at least the relu^2 part of the quantization is stable, and we can continue iterating to add quantization argparse options for the forward pass.
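
As a rough sketch of the kind of argparse options this could lead to (the flag names below are hypothetical placeholders, not a final interface):

```python
# Hypothetical forward-pass quantization options; names are placeholders.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--activation_variant", type=str, default="polymax",
                    choices=["polymax", "relu2"],
                    help="activation used in the MLP forward pass")
parser.add_argument("--quantize_forward_act", action="store_true",
                    help="fake-quantize activations in the forward pass")
parser.add_argument("--forward_act_bits", type=int, default=8,
                    help="bit width for forward-pass activation quantization")
args = parser.parse_args()
```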

gkielian commented 4 months ago

[image: loss curves for post-norm and pre-norm runs, with and without rotary embeddings]

It seems to work for both post-norm and pre-norm, though it works best with rotary embeddings.