The LUTs generated by this code are all saved as float32, and the stored values are floating-point numbers (e.g. 2.9525747) rather than fixed-point numbers, so the files take 4 times the storage space calculated in the paper. For example, SPLUT-S (X4) needs 22 MB to store the LUTs, not the 5.5 MB listed in the paper.
If the values were quantized when the tables are saved, would that have a significant impact on performance?
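To make the question concrete, here is a minimal sketch of what quantizing at save time could look like. It is not the repository's code: the table shape, the random stand-in values, and the symmetric int8 scheme with a single per-table scale are all assumptions, and `quantize_lut` / `dequantize_lut` are hypothetical helper names.

```python
import numpy as np


def quantize_lut(lut_f32):
    """Quantize a float32 table to int8 with one symmetric scale (assumed scheme)."""
    max_abs = float(np.max(np.abs(lut_f32)))
    # Map the largest magnitude to 127; guard against an all-zero table.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(lut_f32 / scale), -128, 127).astype(np.int8)
    return q, scale


def dequantize_lut(q, scale):
    """Recover approximate float32 values from the int8 table."""
    return q.astype(np.float32) * scale


# Stand-in table: shape and value range are hypothetical, not taken from SPLUT.
rng = np.random.default_rng(0)
lut = rng.uniform(-4.0, 4.0, size=(4096, 16)).astype(np.float32)

q, scale = quantize_lut(lut)
err = float(np.abs(dequantize_lut(q, scale) - lut).max())

print("float32 bytes:", lut.nbytes)
print("int8 bytes:   ", q.nbytes)
print("max abs round-trip error:", err)
```

In this sketch the int8 table takes a quarter of the bytes of the float32 one, and the round-trip error per entry is bounded by about half the scale. Whether an error of that size changes the reported PSNR is exactly what I would like to know.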