YeoLab / outrigger

Create a *de novo* alternative splicing database, validate splicing events, and quantify percent spliced-in (Psi) from RNA-seq data
http://yeolab.github.io/outrigger/
BSD 3-Clause "New" or "Revised" License

Segmentation fault on large datasets #64

Open · olgabot opened this issue 7 years ago

olgabot commented 7 years ago

Description

Reading an 11 GB reads.csv file of junction reads causes a segmentation fault. This is for ~300 samples, and I expect datasets to be much bigger (10,000s of samples), so failing at this scale is unacceptable.

Traceback (most recent call last):
  File "/home/obotvinnik/anaconda/bin/outrigger", line 9, in <module>
    load_entry_point('outrigger', 'console_scripts', 'outrigger')()
  File "/home/obotvinnik/workspace-git/outrigger/outrigger/commandline.py", line 1033, in main
    cl = CommandLine(sys.argv[1:])
  File "/home/obotvinnik/workspace-git/outrigger/outrigger/commandline.py", line 334, in __init__
    self.args.func()
  File "/home/obotvinnik/workspace-git/outrigger/outrigger/commandline.py", line 346, in psi
    psi.execute()
  File "/home/obotvinnik/workspace-git/outrigger/outrigger/commandline.py", line 948, in execute
    junction_reads = self.csv()
  File "/home/obotvinnik/workspace-git/outrigger/outrigger/commandline.py", line 471, in csv
    low_memory=self.low_memory)
  File "/home/obotvinnik/anaconda/lib/python2.7/site-packages/pandas/io/parsers.py", line 645, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/obotvinnik/anaconda/lib/python2.7/site-packages/pandas/io/parsers.py", line 400, in _read
    data = parser.read()
  File "/home/obotvinnik/anaconda/lib/python2.7/site-packages/pandas/io/parsers.py", line 938, in read
    ret = self._engine.read(nrows)
  File "/home/obotvinnik/anaconda/lib/python2.7/site-packages/pandas/io/parsers.py", line 1507, in read
    data = self._reader.read(nrows)
  File "pandas/parser.pyx", line 849, in pandas.parser.TextReader.read (pandas/parser.c:10387)
  File "pandas/parser.pyx", line 937, in pandas.parser.TextReader._read_rows (pandas/parser.c:11556)
  File "pandas/parser.pyx", line 2018, in pandas.parser.raise_parser_error (pandas/parser.c:26979)
pandas.io.common.CParserError: Error tokenizing data. C error: out of memory
/var/spool/torque/mom_priv/jobs/7217194.tscc-mgr.local.SC: line 15: 17589 Segmentation fault      outrigger psi

real    0m41.556s
user    0m9.573s
sys     0m2.458s

Steps to Reproduce

  1. Use a deeply sequenced dataset of ~300 samples.
  2. outrigger index runs fine.
  3. outrigger psi dies while reading the junction reads.csv produced by outrigger index.

Expected behavior: No segfault; outrigger psi should read the junction reads without running out of memory.

Actual behavior: Ran out of memory and segfaulted on a supercomputer node with 64 GB of RAM.
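For context, the traceback shows pandas' C parser failing inside read_csv, i.e. the whole file is parsed in one shot. A minimal sketch of a chunked alternative (this is not outrigger's code; the file path and the sample_id/reads column names are hypothetical) that keeps only one slice of the CSV in memory at a time:

import pandas as pd

# Hypothetical file and column names, for illustration only.
CSV = 'reads.csv'

# chunksize makes read_csv return an iterator of DataFrames, so the
# C parser only ever materializes one million rows at a time.
totals = None
for chunk in pd.read_csv(CSV, chunksize=10 ** 6):
    # Aggregate each chunk, then fold the partial sums together.
    counts = chunk.groupby('sample_id')['reads'].sum()
    totals = counts if totals is None else totals.add(counts, fill_value=0)

print(totals)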

Versions

Linux x64, outrigger version 1.0.0rc1

olgabot commented 7 years ago

I believe this can be fixed with the --low-memory flag, which is smarter about memory usage. I'm not sure how to handle the general case: the --low-memory flag is much slower, but it's not obvious to users that they would need it. One option is to make --low-memory default to True, so that only users who find it too slow need to add the --high-memory flag. A sketch of that default is below.
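To make that concrete, here is a minimal argparse sketch of the proposed default (this is not outrigger's actual CLI code, just an illustration of the idea): low-memory mode is on unless the user explicitly opts into --high-memory.

import argparse

# Sketch of the proposed default: safe (chunked) reading unless the
# user explicitly asks for the faster, memory-hungry path.
parser = argparse.ArgumentParser(prog='outrigger psi')
group = parser.add_mutually_exclusive_group()
group.add_argument('--low-memory', dest='low_memory', action='store_true',
                   help='read junction reads in chunks (slower, default)')
group.add_argument('--high-memory', dest='low_memory', action='store_false',
                   help='read everything at once (faster, may exhaust memory)')
parser.set_defaults(low_memory=True)

print(parser.parse_args([]).low_memory)                  # True (safe default)
print(parser.parse_args(['--high-memory']).low_memory)   # False (opt-in)

This keeps the safe behavior as the default while leaving the faster path one flag away for users who know their dataset fits in memory.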