Open stanlivshin opened 9 years ago
Hi, do you mind opening a pull request?
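I'd suggest a more proper fix, along these lines:
// Count of doubles, computed in long arithmetic so the product cannot overflow an int.
long bufferSize = (long) vocabSize * layerSize;
Preconditions.checkState(bufferSize <= Integer.MAX_VALUE, "Unable to allocate a buffer size of %s, vocab size is %s, layerSize is %s", bufferSize, vocabSize, layerSize);
// DoubleBuffer.allocate takes an int capacity measured in doubles, not bytes.
DoubleBuffer vectors = DoubleBuffer.allocate((int) bufferSize);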
I ran into this as well. Note that this will still only let you go as big as 16 GB worth of vectors, and you lose the direct (off-heap) allocation you get from calling allocateDirect. It might be better to shard the vectors into 1 or 2 GB direct byte buffers and let the model call into the correct one.
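The sharding idea above could look roughly like the following. This is only a sketch under stated assumptions: the class name `ShardedDoubleBuffer` and its methods are hypothetical and are not part of the project, and the shard size is left to the caller.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

/** Hypothetical helper: a long-indexed store of doubles backed by several direct buffers. */
final class ShardedDoubleBuffer {
    private final DoubleBuffer[] shards;
    private final int doublesPerShard;
    private final long size;

    ShardedDoubleBuffer(long size, int doublesPerShard) {
        // Each shard is one direct ByteBuffer, so it must stay below Integer.MAX_VALUE bytes.
        if (size < 0 || doublesPerShard <= 0 || doublesPerShard > Integer.MAX_VALUE / Double.BYTES) {
            throw new IllegalArgumentException("invalid size or shard size");
        }
        long shardCount = (size + doublesPerShard - 1) / doublesPerShard;
        if (shardCount > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many shards: " + shardCount);
        }
        this.size = size;
        this.doublesPerShard = doublesPerShard;
        this.shards = new DoubleBuffer[(int) shardCount];
        for (int i = 0; i < shards.length; i++) {
            // The last shard may be smaller than the others.
            long remaining = size - (long) i * doublesPerShard;
            int count = (int) Math.min(doublesPerShard, remaining);
            shards[i] = ByteBuffer.allocateDirect(count * Double.BYTES)
                    .order(ByteOrder.nativeOrder())
                    .asDoubleBuffer();
        }
    }

    long size() {
        return size;
    }

    double get(long index) {
        checkIndex(index);
        return shards[(int) (index / doublesPerShard)].get((int) (index % doublesPerShard));
    }

    void put(long index, double value) {
        checkIndex(index);
        shards[(int) (index / doublesPerShard)].put((int) (index % doublesPerShard), value);
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + " out of range for size " + size);
        }
    }
}
```

With something like this, the vector for word `i` would start at offset `(long) i * layerSize`, and the caller never has to know which shard holds it.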
@dirkgr FYI, a side effect of your efficiency fixes is that the maximum number of doubles is now 2^28 - 1, or about 268 million. Google's Google News vector file contains 3 million vectors of 300 entries each, or 900 million doubles, so it can't be loaded by the new code.
I'm going to look into a fix for this.
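For reference, the arithmetic behind those numbers can be checked with a few lines of Java. The class and method names here are made up for illustration; the only inputs are the figures quoted in the comment above and the fact that a ByteBuffer's capacity is an int count of bytes.

```java
final class DoubleBufferLimit {
    /** Largest number of doubles that fits in one ByteBuffer, whose capacity is an int byte count. */
    static int maxDoublesPerByteBuffer() {
        return Integer.MAX_VALUE / Double.BYTES; // (2^31 - 1) / 8, rounded down
    }

    /** Total doubles needed for a model, computed in long arithmetic. */
    static long doublesNeeded(long vectors, int entriesPerVector) {
        return vectors * entriesPerVector;
    }

    public static void main(String[] args) {
        int limit = maxDoublesPerByteBuffer();
        long needed = doublesNeeded(3_000_000L, 300); // Google News figures from this thread
        System.out.println("limit=" + limit + " needed=" + needed + " fits=" + (needed <= limit));
    }
}
```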
Thanks for looking at it. Let me know if you want me to contribute in some way. The limit is the number of doubles you can put into a DoubleBuffer, right? Because Java can't map more than 2GB of memory at a time?
I don't know whether Java can't, but the API for ByteBuffer only accepts an int, so you're capped at Integer.MAX_VALUE for what you can build.
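Since the allocation methods take an int, one way to make the cap explicit is to do the size arithmetic in long and convert with the exact-arithmetic helpers in `java.lang.Math`, which throw instead of silently wrapping. The following is a minimal sketch; `CheckedAllocation.allocateDoubles` is a hypothetical helper, not something in the project.

```java
import java.nio.ByteBuffer;
import java.nio.DoubleBuffer;

final class CheckedAllocation {
    /**
     * Allocates a direct buffer holding {@code count} doubles, or throws
     * ArithmeticException if the byte size does not fit in an int.
     */
    static DoubleBuffer allocateDoubles(long count) {
        if (count < 0) {
            throw new IllegalArgumentException("negative count: " + count);
        }
        // multiplyExact and toIntExact throw on overflow rather than wrapping around.
        long bytes = Math.multiplyExact(count, (long) Double.BYTES);
        int intBytes = Math.toIntExact(bytes);
        return ByteBuffer.allocateDirect(intBytes).asDoubleBuffer();
    }
}
```

This does not lift the limit; it only turns the overflow into a clear failure at the point where the size is computed.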
See PR #29 @wko27
Seems like this would benefit from using nd4j. If nothing else, you could use their DoubleBuffer, which supports longs for the length: https://github.com/deeplearning4j/nd4j/blob/master/nd4j-buffer/src/main/java/org/nd4j/linalg/api/buffer/BaseDataBuffer.java
If there is interest, I could try it out and submit a pull request. Not sure how you feel about adding that dependency.
DoubleBuffer vectors = ByteBuffer.allocateDirect(vocabSize * layerSize * 8).asDoubleBuffer();
This line was throwing an error because the int multiplication vocabSize * layerSize * 8 exceeds Integer.MAX_VALUE and overflows, so a negative number was passed into the method.
As a dirty fix I changed it to the following:
DoubleBuffer vectors = DoubleBuffer.allocate(1000000000);
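The overflow described above is easy to reproduce in isolation. The sketch below uses the Google News dimensions mentioned elsewhere in this thread (3 million vectors of 300 entries); the class and method names are invented for the example.

```java
final class BufferSizeOverflow {
    /** The original expression: every operand is an int, so the product wraps around. */
    static int bytesAsInt(int vocabSize, int layerSize) {
        return vocabSize * layerSize * 8;
    }

    /** Widening the first operand makes the whole product a long. */
    static long bytesAsLong(int vocabSize, int layerSize) {
        return (long) vocabSize * layerSize * 8;
    }

    public static void main(String[] args) {
        int vocabSize = 3_000_000;
        int layerSize = 300;
        System.out.println("int arithmetic:  " + bytesAsInt(vocabSize, layerSize));
        System.out.println("long arithmetic: " + bytesAsLong(vocabSize, layerSize));
    }
}
```

The int version comes out negative for these inputs, which is why allocateDirect rejects it, while the long version gives the real size of about 7.2 billion bytes.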