MohammadHomsee opened 1 year ago
BigInteger might be your answer!
You could try packing the data into 128-bit integers and use `Buffer.BlockCopy` to convert it into the format your model uses, but other than that, there's no way around the limitation. You'll just have to use a smaller model or try some other method.
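The comment above suggests C#'s `Buffer.BlockCopy` for the reinterpretation step. As a language-neutral sketch of the same packing idea, here is the equivalent round trip in Python: several small quantized weights are packed into one wide integer and later recovered by reinterpreting the bytes. The 16-bit weight width is an assumption for illustration.

```python
import struct

def pack_weights(weights):
    """Pack signed 16-bit weights into 64-bit integers, four per value."""
    assert len(weights) % 4 == 0
    packed = []
    for i in range(0, len(weights), 4):
        chunk = struct.pack("<4h", *weights[i:i + 4])   # 4 x int16 -> 8 bytes
        packed.append(struct.unpack("<Q", chunk)[0])    # reinterpret as one uint64
    return packed

def unpack_weights(packed):
    """Inverse: recover the signed 16-bit weights from the 64-bit values."""
    weights = []
    for value in packed:
        weights.extend(struct.unpack("<4h", struct.pack("<Q", value)))
    return weights

# Round trip: 8 weights fit in 2 packed 64-bit values.
ws = [100, -200, 300, -400, 7, -7, 0, 32767]
assert unpack_weights(pack_weights(ws)) == ws
```

In C# the same effect is what `Buffer.BlockCopy` (or `BitConverter`) gives you: the packed integers live in source as a few literals, and the engine reinterprets their bytes as the weight array at startup.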
For machine learning, since there is a max of 1024 tokens, the best data storage medium is 64-bit values (the ways of getting around the limit have been disallowed or never worked in the first place). The NNUE model uses 64 (king square) * 64 (piece square) * 5 (piece type minus king) * 2 (piece color) = 40960 inputs (assuming you use the same weights for each color and rotate the board). At 16-bit weights, four per 64-bit value, that's 10240 64-bit values for just the inputs. You can simplify this (and weaken the model) by removing the king square, leaving 640 inputs, or 160 64-bit values for the input neurons. I haven't trained a model to see how good one would be with that few inputs, but in theory that's the easiest way for it to work.
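The counts above can be checked with a few lines of arithmetic. Note the 16-bit weight width is inferred from the stated 10240 figure (40960 inputs / 4 weights per 64-bit value), not stated explicitly in the comment:

```python
# Feature counts for the king-relative NNUE input layout described above.
KING_SQUARES = 64
PIECE_SQUARES = 64
PIECE_TYPES = 5   # pawn, knight, bishop, rook, queen (king excluded)
COLORS = 2

inputs = KING_SQUARES * PIECE_SQUARES * PIECE_TYPES * COLORS
assert inputs == 40960

# With 16-bit weights, four fit in each 64-bit value.
packed_values = inputs * 16 // 64
assert packed_values == 10240

# Dropping the king square shrinks the input layer dramatically.
small_inputs = PIECE_SQUARES * PIECE_TYPES * COLORS
assert small_inputs == 640
assert small_inputs * 16 // 64 == 160
```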
It might also be possible to store the numbers in some other approximate form. That would be heavy on the theory, but it could buy a useful tradeoff of extra neurons against weight accuracy. Good luck to anyone who tries it.
> The NNUE model uses 64 (king square) * 64 (piece square) * 5 (piece type minus king) * 2 (piece color) = 40960 inputs (assuming you use the same weights for each color and rotate the board). That's 10240 64-bit values for just the inputs.
That is one way, yes, but plenty of NNUE engines (most, even) just use a flat 12 piece types * 64 squares = 768 inputs, or a bucketing system with far fewer than 64 king buckets. Still a lot of weights, but there's absolutely no need for 64 buckets, or even any buckets at all.
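For concreteness, the flat 12x64 layout indexes one input neuron per (color, piece type, square) combination. A minimal sketch, with the index order (color-major, then piece type, then square) chosen arbitrarily for illustration:

```python
PIECE_TYPES = 6   # pawn, knight, bishop, rook, queen, king
SQUARES = 64

def feature_index(color, piece_type, square):
    """Flat 12*64 = 768 input vector.

    color: 0 = white, 1 = black; piece_type: 0..5; square: 0..63.
    """
    return (color * PIECE_TYPES + piece_type) * SQUARES + square

assert feature_index(0, 0, 0) == 0      # first feature
assert feature_index(1, 5, 63) == 767   # last feature
assert 2 * PIECE_TYPES * SQUARES == 768
```

Because every board position sets at most 32 of the 768 inputs to 1, the first layer is cheap to update incrementally, which is the core idea behind NNUE regardless of whether king buckets are used.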
Is this really an issue with the code?
I would like to use machine learning in my approach, but how can I store the learning data given the token limitation?