Closed by svm-zhang 7 years ago
Hi Simo, Thanks for the report! How much RAM does your machine have? And is the file you're trying to index the size of the entire hg (~3.1Gbp)? Best regards, Ivan.
Hello Ivan,
I was running it on an EC2 instance which has 16 CPUs and 30GB RAM. The segfault showed up when only 40% of the memory was used (according to htop).
The simulated reference contains only the autosomal part of the GRCh38 genome (sex chromosomes, MT, and decoy sequences were not included).
Thanks for the help!
Simo
Hi Simo, could you by any chance re-run your test on the newest version (v0.4.1) and report whether it works now? It's hard to debug without the concrete dataset :-) If it still does not work, could you by any chance upload your reference somewhere so I could take a look myself?
Best regards, Ivan.
Hello Ivan,
Thanks for looking into this. I pulled the patch and the error persists.
What would be the best way to share my reference with you?
Thanks, Simo
Hello @isovic,
I just sent a google drive link to ivan.sovic@irb.hr. Please look out for that email and let me know if you get it.
Thanks, Simo
Hi Simo, thanks for the link! I got the email and will inspect today! Best regards, Ivan.
For some reason I don't have access to the file - I've requested access via Google Drive. Will take a look as soon as you approve it!
Thank you! Ivan
Hello @isovic,
I just approved the access. Please check and let me know if it works.
Thanks, Simo
Ok, got it now, thanks! Will have a look. Ivan
The segfault also occurs when running graphmap align when it needs to build an index:
[16:03:06 Index] Running in normal (parsimonious) mode. Only one index will be used.
[16:03:06 Index] Index is not prebuilt. Generating index.
[16:03:06 LoadOrGenerate] Started generating new index from file 'ref/hg38.fa'...
Segmentation fault
I just ran:
make debug
gdb --args ./bin/graphmap-debug align -r ref/hg38.fa -d all_2d.fastq -o aligned/all_2d.sam
and got the segfault here:
Using host libthread_db library "/lib64/libthread_db.so.1".
[16:25:28 Index] Running in normal (parsimonious) mode. Only one index will be used.
[16:25:28 Index] Index is not prebuilt. Generating index.
[16:25:28 LoadOrGenerate] Started generating new index from file 'ref/hg38.fa'...
Program received signal SIGSEGV, Segmentation fault.
0x00000000004d2429 in IndexSpacedHashFast::CreateIndex_ (this=0x890940, data=0x7ffdbdb02010 'N' <repeats 200 times>..., data_length=6176539712) at src/index/index_spaced_hash_fast.cc:520
520 kmer_hash_array_[hash_key][kmer_countdown[hash_key]] = coded_position;
Missing separate debuginfos, use: zypper install libgomp1-debuginfo-6.2.1+r239768-2.4.x86_64 libz1-debuginfo-1.2.8-6.3.1.x86_64
(gdb) where
#0 0x00000000004d2429 in IndexSpacedHashFast::CreateIndex_ (this=0x890940, data=0x7ffdbdb02010 'N' <repeats 200 times>..., data_length=6176539712) at src/index/index_spaced_hash_fast.cc:520
#1 0x00000000004ecb7c in Index::GenerateFromSequenceFile (this=0x890940, sequence_file=...) at src/index/index.cc:81
#2 0x00000000004ec73f in Index::GenerateFromFile (this=0x890940, sequence_file_path=...) at src/index/index.cc:47
#3 0x00000000004d5887 in IndexSpacedHashFast::LoadOrGenerate (this=0x890940, reference_path=..., out_index_path=..., verbose=true) at src/index/index_spaced_hash_fast.cc:1086
#4 0x0000000000540b1b in GraphMap::BuildIndex (this=0x7fffffffbf00, parameters=...) at src/graphmap/graphmap.cc:204
#5 0x000000000053e023 in GraphMap::Run (this=0x7fffffffbf00, parameters=...) at src/graphmap/graphmap.cc:39
#6 0x0000000000578e6b in main (argc=8, argv=0x7fffffffc138) at src/main.cc:70
Hope that helps :)
So it looks like you don't like Ns, right?
Markus
It helps a lot actually, thanks for the traceback! I was just running gdb on Simo's reference to get the same.
So it looks like you don't like Ns, right?
Haha no I don't :-) I'm skipping those. But before I never had trouble on hg, curious what's going on now. Need to refactor this one, as well as some other pieces of code.
Best regards, Ivan.
I am also getting the same exact problem with hg19, any updates would be greatly appreciated!
Hi all, thank you for reporting this and for your patience! I've re-implemented the entire index, so this issue should no longer occur. The fix will be included in the next release, which is coming soon (within a week, hopefully), together with some new goodies such as speed improvements on larger references. Best regards, Ivan.
Hello Ivan,
This is great news! Looking forward to this new release.
cheers, Simo
As excited as Simo for the update. Thank you, Ivan!
Hi everyone,
there have been many updates and changes, and in the latest version I (hopefully) addressed all of the above issues. Would you mind giving it a spin to verify if everything is well now?
Best regards, Ivan.
Hello Ivan,
I am testing it now and will report shortly. Thanks very much for all the updates!
Simo
The indexing went very smoothly. All problems solved :)
A side note: I originally ran the indexing on the EC2 instance, which gave the original error. But yesterday I ran it on Google Cloud (n1-standard-32) using v0.4.1, and no segfault occurred. Weird!
Any ideas?
Simo
Hi Simo, Thanks for the report! It was hard to pinpoint exactly what was going on, and it manifested mostly on larger references. This made it difficult to debug, and after a while, I simply decided to implement a new index with all these great new features. I would advise using the newest version of GraphMap instead of 0.4.x.
Best regards, Ivan.
I'd say great decision on implementing the new index! Thanks!
Simo
Hello @isovic, I am closing this thread. In case @windybasket finds a new problem, they can reopen this.
Thanks!
Hello Ivan,
I was trying to build the index for my simulated GRCh38 genome using the command line:
graphmap align -I -r grch38.simu.fasta
The indexing ran for about 7 minutes before a segfault occurred (see the file attached below).
graphmap.index.err.txt
I also tried using .fa as the reference extension, and running with and without a reads file; the error persisted in all cases.
I am using graphmap v0.3.2.
Any insights on how I could fix this?
Thanks, Simo