Open prashant23 opened 10 years ago
I haven't gotten around to adding a command line interface yet, but you can create files with code like the following:
Writer writer = new FileWriter("yourFile");
for (Customer entity : new CustomerGenerator(scaleFactor, part, numberOfParts)) {
    writer.write(entity.toLine());
    writer.write('\n');
}
Each table in TPCH has an associated generator, and each generator is an Iterable<TpchEntity>. Each entity has getters for the individual column values, or you can use the toLine() method to generate a standard TPCH output line.
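As a variation on the loop quoted above, here is a minimal sketch of the same write pattern with a buffered writer that is closed automatically. It runs on the JDK alone, so it takes a plain Iterable<String> as a stand-in for the lines a generator would produce through toLine(). The class name LineFileWriter, the method writeLines, and the sample rows are hypothetical and not part of the TPCH generator code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of the generator library.
class LineFileWriter {
    // Writes each line followed by '\n' and returns the number of lines written.
    // Lines are streamed one at a time, so only the writer's buffer is held in memory.
    static long writeLines(Path target, Iterable<String> lines) throws IOException {
        long count = 0;
        // try-with-resources flushes and closes the writer even if a write fails
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.write('\n');
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("sample", ".tbl");
        // Placeholder rows; with the generator you would pass entity.toLine() values instead.
        long written = writeLines(target, Arrays.asList("row-1", "row-2"));
        System.out.println("wrote " + written + " lines to " + target);
    }
}
```

With the real generator, the loop body would call entity.toLine() on each entity instead of iterating over ready-made strings.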
When I run the code quoted above, I get the following error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Unknown Source)
at java.lang.String.
I think it is creating a large amount of garbage for the garbage collector. I am using Windows 7 with 4 GB of RAM.
Is the amount of RAM the problem?
Kindly let me know if there is a workaround.
This code has not been optimized for running in memory-constrained environments. I'm sure there is a lot of room for improvement here if you want to take a look at it. Also, the latest commits in trunk improve performance and reduce the rate of garbage generation, but I'm not sure what the minimum amount of memory required to run the generator is. I would guess you need at least a few GB.
I don't know how the JVM chooses the default heap size on Windows, but it might be too small. Try increasing the heap size when running Java:
java -Xms2G -Xmx2G ...
This sets the starting size and maximum size to 2GB, so it will allocate that much memory up front and use a fixed-size heap. You can try using 1G or 3G depending on whether or not that works.
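To see which heap limit the JVM actually picked, with or without the -Xmx flag, a small check like the following may help. This is only a sketch: the class name HeapCheck and the method maxHeapMiB are made up for illustration, while Runtime.getRuntime().maxMemory() is the standard JDK call for the maximum heap the JVM will attempt to use.

```java
// Hypothetical helper class; only Runtime.maxMemory() comes from the JDK.
class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once as java HeapCheck and once as java -Xmx2G HeapCheck shows whether the default limit is well below what the flag provides. The reported value can be somewhat lower than the number passed to -Xmx.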
Hello developers, I was searching for a Java utility to generate TPCH data and found your code. Can you tell me how to use it as an API? I looked for a README file but didn't find one.
Thanks and regards,
Prashant