ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
`ANTLRInputStream` reads all input into a single array. When the array is full, it is copied to a new array of twice the size. The initial size is (by default) 1024, a power of two. This causes parsing of files larger than 1 GiB to fail (to be exact, inputs of more than 2^30 chars, since the array holds chars): once more than 2^30 chars have been read, the size is doubled again. However, doubling 2^30 overflows an `int`, because 2^31 sets the sign bit. So this ends up trying to create an array of size `Integer.MIN_VALUE`:
Exception in thread "main" java.lang.NegativeArraySizeException
at java.util.Arrays.copyOf(Arrays.java:3332)
at org.antlr.v4.runtime.ANTLRInputStream.load(ANTLRInputStream.java:123)
at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:86)
at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:82)
at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:90)
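The arithmetic behind the exception can be reproduced without ANTLR. The following is a minimal sketch, not code from the runtime; the class and method names are made up for illustration. It doubles an `int` the same way the load loop doubles `data.length`:

```java
public class DoublingOverflow {
    /**
     * Doubles an int size repeatedly, starting from initialSize,
     * and returns the first value that is no longer positive.
     * (Hypothetical helper; it only mimics the doubling in the load loop.)
     */
    static int firstOverflow(int initialSize) {
        int size = initialSize;
        while (size > 0) {
            size *= 2; // int multiplication wraps silently on overflow
        }
        return size;
    }

    public static void main(String[] args) {
        int size = 1 << 30;                      // 2^30, reached after 20 doublings of 1024
        System.out.println(size * 2);            // wraps to Integer.MIN_VALUE
        System.out.println(firstOverflow(1024) == Integer.MIN_VALUE);
    }
}
```

Passing that negative value to `Arrays.copyOf` is what raises the `NegativeArraySizeException` shown in the trace.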
Something like the following could help, but this will cause other issues for files larger than `Integer.MAX_VALUE`. I guess some exception should be thrown when `data.length` is `Integer.MAX_VALUE`?
--- runtime/Java/src/org/antlr/v4/runtime/ANTLRInputStream.java.orig 2018-02-05 16:16:59.031825397 +0100
+++ runtime/Java/src/org/antlr/v4/runtime/ANTLRInputStream.java 2018-02-05 16:18:15.995627939 +0100
@@ -98,7 +98,10 @@ public class ANTLRInputStream implements
do {
if ( p+readChunkSize > data.length ) { // overflow?
// System.out.println("### overflow p="+p+", data.length="+data.length);
- data = Arrays.copyOf(data, data.length * 2);
+ int newLength = data.length * 2;
+ if (newLength < 0)
+ newLength = Integer.MAX_VALUE;
+ data = Arrays.copyOf(data, newLength);
}
numRead = r.read(data, p, readChunkSize);
// System.out.println("read "+numRead+" chars; p was "+p+" is now "+(p+numRead));
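As a sketch of the "throw an exception" idea, the growth step could be pulled into a helper that does the doubling in `long` arithmetic and fails loudly once the array cannot grow any further. Everything here is an assumption for illustration: `SafeGrowth`, `nextLength`, and `MAX_ARRAY_LENGTH` are hypothetical names, not part of the ANTLR runtime. The cap of `Integer.MAX_VALUE - 8` is chosen because some JVMs refuse to allocate arrays of exactly `Integer.MAX_VALUE` elements, so clamping to `Integer.MAX_VALUE` itself may just trade one error for another.

```java
public class SafeGrowth {
    // Assumption: leave a little headroom below Integer.MAX_VALUE,
    // since some VMs reserve header words in an array.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    /**
     * Returns the next buffer length: double the current one, clamped to
     * MAX_ARRAY_LENGTH. Throws if the buffer is already at the cap.
     */
    static int nextLength(int currentLength) {
        if (currentLength >= MAX_ARRAY_LENGTH) {
            throw new IllegalStateException(
                "input too large for a single char array: " + currentLength);
        }
        long doubled = 2L * currentLength;       // long arithmetic cannot overflow here
        long clamped = Math.min(doubled, (long) MAX_ARRAY_LENGTH);
        return (int) Math.max(1L, clamped);      // never return 0, or the caller would loop forever
    }

    public static void main(String[] args) {
        System.out.println(nextLength(1024));     // 2048
        System.out.println(nextLength(1 << 30));  // clamped to MAX_ARRAY_LENGTH
    }
}
```

With such a helper, the line in the patch would read `data = Arrays.copyOf(data, SafeGrowth.nextLength(data.length));`, and an oversized input would produce a descriptive exception instead of a `NegativeArraySizeException`.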
The class uses a char array, so a simple fix would be to use a list of char arrays. Then instead of doubling the size each time, just add the new array to the list.
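A rough sketch of the list-of-arrays idea, under stated assumptions: `ChunkedCharBuffer` and its methods are hypothetical and not part of ANTLR, and the chunk size is arbitrary. Each chunk is a fixed-size `char[]`, and a full chunk is appended to the list instead of being copied into a larger array, so no length is ever doubled.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ChunkedCharBuffer {
    private final int chunkSize;
    private final List<char[]> chunks = new ArrayList<>();
    private long size; // total number of chars stored, may exceed Integer.MAX_VALUE

    private ChunkedCharBuffer(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    /** Reads the whole reader into fixed-size chunks; no array is ever resized. */
    public static ChunkedCharBuffer from(Reader r, int chunkSize) throws IOException {
        ChunkedCharBuffer buf = new ChunkedCharBuffer(chunkSize);
        char[] current = new char[chunkSize];
        int filled = 0;
        int numRead;
        // The requested length is always > 0, so read() returns -1 only at end of input.
        while ((numRead = r.read(current, filled, chunkSize - filled)) != -1) {
            filled += numRead;
            buf.size += numRead;
            if (filled == chunkSize) {      // chunk full: store it and start a new one
                buf.chunks.add(current);
                current = new char[chunkSize];
                filled = 0;
            }
        }
        if (filled > 0) {
            buf.chunks.add(current);        // keep the partially filled last chunk
        }
        return buf;
    }

    /** Returns the char at a long index by locating its chunk and offset. */
    public char charAt(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return chunks.get((int) (index / chunkSize))[(int) (index % chunkSize)];
    }

    public long size() {
        return size;
    }

    public static void main(String[] args) throws IOException {
        ChunkedCharBuffer buf = ChunkedCharBuffer.from(new StringReader("hello world"), 4);
        System.out.println(buf.size());     // 11
        System.out.println(buf.charAt(6));  // w
    }
}
```

One caveat: this only removes the single-array limit on storage. The ANTLR stream interfaces report positions and sizes as `int`, so input beyond `Integer.MAX_VALUE` chars would presumably still need changes elsewhere.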