Dear developers,
I found a bug while working with very long sequences (Axolotl). FastaSequenceFile::readSequence increases the size of its internal buffer when the number of bases read so far equals the array size (line 177):
if (sequenceLength == bases.length)
Although this is a memory-efficient approach, it fails as soon as the sequence is even minimally longer than 2^30-1 bases: the method then tries to allocate an array with more than 2^31-1 elements, the int size computation overflows, and the requested array size becomes negative. I would suggest checking whether the current array size has reached 2^30 and, if so, growing the internal array in smaller steps, say
final byte[] tmp = new byte[(int)(bases.length*1.1)]
instead of
final byte[] tmp = new byte[bases.length*2]
or switching to a different data structure, which I imagine would be quite tedious. For now I have solved the problem in my project as described above, but I admit it's probably not the best solution.

Thanks!
Sergej
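
P.S. To illustrate the idea, here is a minimal sketch of an overflow-safe growth step. It is not the htsjdk code: the class BufferGrowth, the methods nextCapacity and grow, and the MAX_ARRAY_SIZE limit are hypothetical names chosen for this example. It is also a variant of the 1.1x workaround above rather than the same thing: it keeps doubling, but does the arithmetic in long and clamps the result, so a buffer of 2^30 bytes grows to just under 2^31 instead of wrapping to a negative size.

```java
import java.util.Arrays;

final class BufferGrowth {
    // Assumed conservative upper bound: many JVMs cannot allocate an array
    // of exactly Integer.MAX_VALUE elements.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Returns the next capacity: doubled if possible, otherwise clamped to MAX_ARRAY_SIZE. */
    static int nextCapacity(final int current) {
        if (current >= MAX_ARRAY_SIZE) {
            throw new IllegalStateException("sequence too long for a single byte[]");
        }
        // Double in long arithmetic so that 2^30 * 2 does not wrap to a negative int.
        final long doubled = Math.max(16L, 2L * current);
        return (int) Math.min(doubled, (long) MAX_ARRAY_SIZE);
    }

    /** Copies the existing bases into a larger array sized by nextCapacity. */
    static byte[] grow(final byte[] bases) {
        return Arrays.copyOf(bases, nextCapacity(bases.length));
    }
}
```

With something like this, the resize in readSequence could become bases = BufferGrowth.grow(bases). Sequences longer than the array limit would still need a different data structure.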