sequencing / isaac_aligner

Isaac Genome Alignment Software

isaac-align is stuck at "Determining memory capacity for Fastq data" #13

Closed sklages closed 10 years ago

sklages commented 10 years ago

Hi,

isaac-align is stuck at Determining memory capacity for Fastq data :

command:

isaac-align --base-calls /path/to/install/Isaac/data_files/PhiX/isaac_test --base-calls-format fastq-gz --default-adapters Standard --memory-limit 12 --reference-genome /path/to/install/Isaac/data_files/PhiX/iSAACIndex.PhiX.20140901/sorted-reference.xml
2014-09-01 16:20:10     [7f89350aa780]  Forcing LC_ALL to C
2014-09-01 16:20:10     [7f89350aa780]  Version: iSAAC-01.14.07.17
2014-09-01 16:20:10     [7f89350aa780]  argc: 11 argv: isaac-align --base-calls /path/to/install/Isaac/data_files/PhiX/isaac_test --base-calls-format fastq-gz --default-adapters Standard --memory-limit 12 --reference-genome /path/to/install/Isaac/data_files/PhiX/iSAACIndex.PhiX.20140901/sorted-reference.xml
2014-09-01 16:20:10     [7f89350aa780]  Opened fastq stream on /path/to/install/Isaac/data_files/PhiX/isaac_test/lane3_read1.fastq.gz
2014-09-01 16:20:10     [7f89350aa780]  Opened fastq stream on /path/to/install/Isaac/data_files/PhiX/isaac_test/lane3_read2.fastq.gz
2014-09-01 16:20:10     [7f89350aa780]  FastqFlowcellInfo(H0P66AGXX,151:151,[3 ])
2014-09-01 16:20:10     [7f89350aa780]  use bases mask: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyn,yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyn
2014-09-01 16:20:10     [7f89350aa780]  reads parsed: 2
2014-09-01 16:20:10     [7f89350aa780]  Discovered data read: ReadMetadata(1, 150 [1, 150], 0id, 0off,1frc)
2014-09-01 16:20:10     [7f89350aa780]  Discovered data read: ReadMetadata(2, 150 [152, 301], 1id, 150off,152frc)
2014-09-01 16:20:10     [7f89350aa780]  constructed extremity seed SeedMetadata(0, 32, 0, 0)
2014-09-01 16:20:10     [7f89350aa780]  constructed extremity seed SeedMetadata(118, 32, 0, 1)
2014-09-01 16:20:10     [7f89350aa780]  constructed SeedMetadata(32, 32, 0, 2)
2014-09-01 16:20:10     [7f89350aa780]  constructed SeedMetadata(64, 32, 0, 3)
2014-09-01 16:20:10     [7f89350aa780]  constructed extremity seed SeedMetadata(0, 32, 1, 4)
2014-09-01 16:20:10     [7f89350aa780]  constructed extremity seed SeedMetadata(118, 32, 1, 5)
2014-09-01 16:20:10     [7f89350aa780]  constructed SeedMetadata(32, 32, 1, 6)
2014-09-01 16:20:10     [7f89350aa780]  constructed SeedMetadata(64, 32, 1, 7)
2014-09-01 16:20:10     [7f89350aa780]  default-adapters: SequencingAdapterMetadata(AGATCGGAAGAGC,0,0),SequencingAdapterMetadata(GCTCTTCCGATCT,1,0),
2014-09-01 16:20:10     [7f89350aa780]  Generated 'none' barcode: BarcodeMetadata(H0P66AGXX,3,default,none,(0), 4294967295)
2014-09-01 16:20:10     [7f89350aa780]  align: Setting memory limit to 12884901888 bytes.
2014-09-01 16:20:10     [7f89350aa780]  Aligner: adding base-calls path "/path/to/install/Isaac/data_files/PhiX/isaac_test"
2014-09-01 16:20:10     [7f89350aa780]  Determining memory capacity for Fastq data

Here it does not continue. The process is sleeping. strace says:

[...]
write(2, "Determining memory capacity for "..., 42) = 42
write(2, "\n", 1)                       = 1
mmap(NULL, 12884901888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x300cbc000)                        = 0xcb7000
mmap(NULL, 12885037056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
futex(0x7f89329ea620, FUTEX_WAIT_PRIVATE, 2, NULL) = ? ERESTARTSYS (To be restarted)

This is a 64-bit Linux system with 256 GB RAM. I started with --memory-limit 128 and finally tried --memory-limit 12, without any effect. ulimit -v is unlimited.

Any idea what is going wrong?

best, Sven

rpetrovski commented 10 years ago

I haven't seen anything like this in over 3 years of running iSAAC on a variety of platforms and hardware, including Windows. It is a really straightforward loop that tries to reserve progressively smaller amounts of RAM until it succeeds.

You can have a look at it in FastqDataSource.cpp; it is the very top function.

Can you please try it on a different box?

ulimit -v is not required as iSAAC uses -m to setrlimit(RLIMIT_AS) but in this case it wouldn't hurt to set it to see if anything changes.
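For readers unfamiliar with setrlimit(RLIMIT_AS), here is a minimal sketch of what capping a process's address space looks like. The helper names (currentAddressSpaceLimit, setAddressSpaceLimit) are hypothetical and are not taken from the iSAAC sources; only getrlimit/setrlimit and RLIMIT_AS are real POSIX calls.

```cpp
#include <sys/resource.h>

// Hypothetical helper (not iSAAC code): returns the current soft
// address-space limit, or RLIM_INFINITY if it cannot be queried.
rlim_t currentAddressSpaceLimit()
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_AS, &lim) != 0)
    {
        return RLIM_INFINITY;
    }
    return lim.rlim_cur;
}

// Hypothetical helper (not iSAAC code): sets the soft address-space limit
// to limitBytes and leaves the hard limit unchanged. Returns false if the
// call fails, for example when limitBytes exceeds the hard limit.
bool setAddressSpaceLimit(rlim_t limitBytes)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_AS, &lim) != 0)
    {
        return false;
    }
    lim.rlim_cur = limitBytes;
    return setrlimit(RLIMIT_AS, &lim) == 0;
}
```

Once such a cap is in place, any mmap or malloc that would push the process past it fails with ENOMEM. That would be consistent with the strace output above, where a single 12884901888-byte mmap is refused under a 12884901888-byte limit, since the process already occupies some address space.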

Can you please post the output of cat /proc/version?

Roman.

sklages commented 10 years ago

I just tried it on a different machine (64-bit Linux, 512 GB RAM). It has been at "Determining memory capacity for Fastq data" for ~1.5 h with no progress.

cat /proc/version
Linux version 3.10.29.mx64.54 (root@macheteinfach.molgen.mpg.de) (gcc version 4.7.3 (GCC) ) #1 SMP Tue Feb 11 11:32:49 CET 2014

is the same on both machines.

Is anything wrong with the command line? Are you relying on the system Boost libs? Maybe there's something wrong with Boost?

best, Sven

rpetrovski commented 10 years ago

Your strace output suggests that it gets stuck after the second attempt to allocate memory. In this part, the code uses nothing but std::vector::reserve and try/catch for std::bad_alloc. You should be able to pull the loop out into a simple test app to see if the issue reproduces without the rest of the iSAAC code.
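A minimal sketch of such a stand-alone loop is shown below. It is not the code from FastqDataSource.cpp: the function name findReservableBytes, the halving step, and the minBytes cutoff are all assumptions made for illustration. It only mirrors the described technique of calling std::vector::reserve and catching std::bad_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the loop described above (not iSAAC code).
// Tries to reserve startBytes; on failure it halves the request and
// retries until a reservation succeeds or the request drops below
// minBytes. Returns the number of bytes reserved, or 0 if none fit.
std::size_t findReservableBytes(std::size_t startBytes, std::size_t minBytes)
{
    assert(minBytes > 0);  // guarantees the loop ends once bytes reaches 0
    for (std::size_t bytes = startBytes; bytes >= minBytes; bytes /= 2)
    {
        try
        {
            std::vector<char> probe;
            probe.reserve(bytes);  // throws if the allocation cannot be made
            return bytes;          // probe is released when it leaves scope
        }
        catch (const std::bad_alloc &)
        {
            // Allocation refused: fall through and retry with a smaller size.
        }
        catch (const std::length_error &)
        {
            // Request exceeded vector::max_size(): also retry smaller.
        }
    }
    return 0;
}
```

Calling findReservableBytes(12884901888, 1) from a small main() under the same --memory-limit value would show whether the hang reproduces outside iSAAC. If this loop returns promptly on the affected machines, the problem probably lies elsewhere in the process.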

It looks like you are using a custom-compiled kernel. Do you have a box with a standard kernel? It does not have to be a beefy box to run PhiX; a laptop should do. I will not be able to reproduce the problem unless you use a standard distro and the kernel that comes with it.

sklages commented 10 years ago

Thanks for your hints. We are running a custom Linux in our institute on a few hundred machines. I will try at home on a 64-bit Ubuntu notebook with PhiX and report back. best, Sven

sklages commented 10 years ago

System: Ubuntu 14.04, 64bit, 8G RAM.

cat /proc/version
Linux version 3.13.0-35-generic (buildd@panlong) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014
isaac-align --version
2014-09-09 11:33:06     [7f595d6d57c0]  Forcing LC_ALL to C
2014-09-09 11:33:06     [7f595d6d57c0]  Version: iSAAC-01.14.08.28
2014-09-09 11:33:06     [7f595d6d57c0]  argc: 2 argv: isaac-align --version
iSAAC-01.14.08.28

Running:

isaac-align --base-calls /home/sven/Downloads/PhiX/isaac_test --base-calls-format fastq-gz --default-adapters Standard --description 'Erster Isaac Test Run (PhiX)' --memory-limit 4 --output-directory PhiX-Test_Aligned_PhiX_NC_001422 --realign-dodgy 1 --realign-gaps all --reference-genome /home/sven/Downloads/PhiX/iSAACIndex.PhiX.20140903/sorted-reference.xml --scatter-repeats 1 --single-library-samples 1 --temp-directory /home/sven/Downloads/PhiX/isaac_test/temp --verbosity 3

always results in:

[..]
2014-09-09 11:11:07     [7f9c323f87c0]  Generated 'none' barcode: BarcodeMetadata(H0P66AGXX,3,default,none,(0), 4294967295)
2014-09-09 11:11:07     [7f9c323f87c0]  align: Setting memory limit to 4294967296 bytes.
2014-09-09 11:11:07     [7f9c323f87c0]  Aligner: adding base-calls path "/home/sven/Downloads/PhiX/isaac_test"
2014-09-09 11:11:07     [7f9c323f87c0]  Determining memory capacity for Fastq data
2014-09-09 11:11:07     [7f9c323f87c0]  Determining memory capacity for Fastq data done. 9316557 clusters of length 300 will fit
2014-09-09 11:11:07     [7f9c323f87c0]  Opened fastq stream on /home/sven/Downloads/PhiX/isaac_test/lane3_read1.fastq.gz
2014-09-09 11:11:07     [7f9c323f87c0]  Opened fastq stream on /home/sven/Downloads/PhiX/isaac_test/lane3_read2.fastq.gz
[..]
2014-09-09 11:11:12     [7f9c323f87c0]  Selecting matches using 3603160 matches per bin limit
2014-09-09 11:11:12     [7f9c323f87c0]  STAT: AlignWorkflow::selectMatches fragmentStorage 4002705408vm 5354res
isaac-align: /usr/include/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:26: boost::pthread::pthread_mutex_scoped_lock::pthread_mutex_scoped_lock(pthread_mutex_t*): Assertion `!pthread_mutex_lock(m)' failed.
Abgebrochen

and strace tells me:

[..]
clone(child_stack=0x7f065a639f30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f065a63a9d0, tls=0x7f065a63a700, child_tidptr=0x7f065a63a9d0) = 22925
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = -1 ENOMEM (Cannot allocate memory)
futex(0x7fff0e67a310, FUTEX_WAKE_PRIVATE, 1) = 1
write(2, "isaac-align: /usr/include/boost/"..., 153) = 153
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f07556ca000
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
tgkill(22760, 22760, SIGABRT)           = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=22760, si_uid=1000} ---
+++ killed by SIGABRT +++

As I am not familiar with C++, it is hard to tell why iSAAC refuses to work in my hands :-)

So if the command line itself is OK, I'd be interested in what is going wrong.

best, Sven

rpetrovski commented 10 years ago

Sven, I've just installed the latest Ubuntu 14.04 (which should be exactly like yours) on my laptop and built iSAAC from source. It worked on a small PhiX dataset.

Can you please rebuild iSAAC clean from source and post the captured output of the configure script?

Roman.

Here are details of the Ubuntu on which I've tried:

rpetrovski@rooman-laptop3:~/workspace/data/SAAC00629$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.1 LTS
Release: 14.04
Codename: trusty

rpetrovski@rooman-laptop3:~/workspace/data/SAAC00629$ cat /proc/version
Linux version 3.13.0-35-generic (buildd@panlong) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014

rpetrovski commented 10 years ago

Sven, I have also published the binaries that I've built here: https://illumina.box.com/s/01iyv1k4ih2z1idiyysb

Please tell me if the failure is reproducible with my binaries. This will allow us to rule out differences in the build pipeline.

sklages commented 10 years ago

Hi Roman,

thanks for your help. The problem has been found and solved (for now). We were using glibc 2.15 on our systems. With this version of glibc, memory allocation seems to fail (for unknown reasons). With glibc 2.19, isaac_aligner works just fine (at least with a small PhiX dataset).
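When comparing machines like this, it can help to confirm which glibc a binary was built against and which one it actually runs with. The sketch below is an illustration only, with hypothetical function names; it relies on the glibc-specific gnu_get_libc_version() call and the __GLIBC__ / __GLIBC_MINOR__ macros, so it will not compile against other C libraries.

```cpp
#include <gnu/libc-version.h>
#include <string>

// Hypothetical helper: glibc version loaded at run time, e.g. "2.19".
std::string runtimeGlibcVersion()
{
    return gnu_get_libc_version();
}

// Hypothetical helper: glibc version of the headers used at compile time.
std::string buildGlibcVersion()
{
    return std::to_string(__GLIBC__) + "." + std::to_string(__GLIBC_MINOR__);
}
```

Printing both values from a small main() on each machine shows whether the build-time and run-time versions differ.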

Took some time to find out ...

Now I will go on testing/evaluating ... ;-)

best, Sven

rpetrovski commented 10 years ago

Sven, I've tried to compile and run iSAAC on Ubuntu 12.04 (which comes with glibc 2.15). I have also upgraded the kernel to 3.10.29 (as in your cluster). I could not get iSAAC to hang or crash in the ways you describe. Since your issues seem to be resolved, I am closing the ticket.

Roman.

Below is my configuration:

vagrant@precise64:~$ cat /proc/version
Linux version 3.10.29-031029-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201402061535 SMP Thu Feb 6 20:36:31 UTC 2014

rpetrovski@precise64:~$ ldd --version
ldd (Ubuntu EGLIBC 2.15-0ubuntu10.7) 2.15 ...

sklages commented 10 years ago

Hi Roman, that's strange :-( Maybe it is gcc 4.6.3 vs. 4.7.3? But anyway, thanks for your kind help. At least it works now with the updated glibc here on our servers. best, Sven