Open piccolbo opened 9 years ago
Looked into the alternative of writing to a file first. It looks like there are two pull requests available here:
https://issues.apache.org/jira/browse/SPARK-4131
It is targeted for 1.5 and still unresolved.
There are some thriftserver options and some Spark configuration properties that are relevant.
Options: --driver-memory 1G --executor-memory 2G
Property: spark.kryoserializer.buffer.max.mb 128
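For anyone trying to reproduce this, here is a rough sketch of how those settings could be assembled into a launch command. The script path is assumed from the usual Spark distribution layout, and passing the property through --conf is my assumption; the property could equally be set in spark-defaults.conf. The helper name thriftserver_command is made up for illustration.

```python
import shlex


def thriftserver_command(script="./sbin/start-thriftserver.sh",
                         driver_memory="1G",
                         executor_memory="2G",
                         kryo_buffer_max_mb=128):
    """Build the launch command as an argument list.

    The script path is an assumed default; adjust it to your Spark install.
    """
    return [
        script,
        "--driver-memory", driver_memory,
        "--executor-memory", executor_memory,
        # Assumption: the property is passed on the command line via --conf
        "--conf", f"spark.kryoserializer.buffer.max.mb={kryo_buffer_max_mb}",
    ]


command = thriftserver_command()
# Print a shell-safe version of the command for copy and paste
print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the command as a list makes it easy to hand to subprocess.run without worrying about shell quoting.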
I managed to collect up to half of the flights table, at the breakneck speed of 150K data points per second.
Not always, but the thriftserver can terminate if collect fails this way.
Given this and the problems with #20, it may be preferable to use insert overwrite local directory
and read from there (not yet supported, targeted for 1.5). That adds one read and one write, but beats the totally broken fetch any time.
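A minimal sketch of what the "read from there" half might look like, assuming the export lands as plain text files in one local directory with fields separated by Ctrl-A (\x01), which is Hive's default for insert overwrite local directory. Whether the Spark implementation would use the same layout is an assumption. The function name read_exported_rows and the sample rows are invented for the demo.

```python
import tempfile
from pathlib import Path


def read_exported_rows(directory, delimiter="\x01"):
    """Yield each row as a list of strings from the data files in `directory`."""
    for path in sorted(Path(directory).iterdir()):
        # Skip subdirectories and marker/checksum files such as _SUCCESS or .crc
        if not path.is_file() or path.name.startswith(("_", ".")):
            continue
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n").split(delimiter)


# Demo with made-up data standing in for an exported directory
with tempfile.TemporaryDirectory() as export_dir:
    Path(export_dir, "000000_0").write_text(
        "AA\x011\x01JFK\nUA\x012\x01SFO\n", encoding="utf-8")
    Path(export_dir, "_SUCCESS").write_text("", encoding="utf-8")
    rows = list(read_exported_rows(export_dir))

print(rows)
```

All values come back as strings, so column types would still need to be restored from the table schema.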