Open glennhickey opened 5 years ago
It should be more than enough. I wonder if you ran out of disk space though? This is the test case you sent?
On Mon, Jul 8, 2019, 15:23 Glenn Hickey notifications@github.com wrote:
This ran through fine on one sample (HG00514), but when I scaled up to 3 it crashed. The input sequences can be found here:
https://transfer.sh/SZ5pU/hgsvc-chr21-seqs.tar.gz
runs in 40min
./pan-minimap2 hg38_chr21.fa HG00514_chr21_0.fa HG00514_chr21_1.fa HG00733_chr21_0.fa HG00733_chr21_1.fa NA19240_chr21_0.fa NA19240_chr21_1.fa | fpa drop -l 1000 > hgsvc_seqwish_fpa10000.paf
(hgsvc_chr21.fa is the above sequences catted together with hg38 first)
seqwish -s hgsvc_chr21.fa -p hgsvc_seqwish_fpa10000.paf -t 16 -b work/x -g hgsvc_seqwish_fpa10000.gfa
crashes after 7.5 hours
seqwish: /ebs1/seqwish/src/links.cpp:23: void seqwish::derive_links(seqwish::seqindex_t&, size_t, mmmulti::map<long unsigned int, long unsigned int>&, mmmulti::map<long unsigned int, long unsigned int>&, mmmulti::map<long unsigned int, long unsigned int>&): Assertion `v1.size() == v2.size() == 1' failed.
Command terminated by signal 6
Is it possible that 126G of RAM is not enough?
You did the name prefixing awk thing to make sure the sequences are all uniquely named?
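The awk command referred to here is not shown in this thread. As a rough sketch of the same idea, assuming the goal is simply to prepend a per-file prefix to every FASTA header so that names cannot collide across samples, something like the following Python could be used. The function name and the `#` separator are illustrative choices, not part of seqwish or pan-minimap2.

```python
import io


def prefix_fasta_names(lines, prefix, sep="#"):
    """Yield FASTA lines with `prefix + sep` prepended to each record name.

    Hypothetical helper: header lines start with '>'; all other lines
    (sequence data) are passed through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            yield ">" + prefix + sep + line[1:]
        else:
            yield line


# Small in-memory example; a real run would iterate over an open file.
example = io.StringIO(">chr21\nACGT\n")
renamed = list(prefix_fasta_names(example, "hg38"))
```

Applied to one file per sample, with the sample name as the prefix, this gives every record a distinct name even when two files both contain a header such as `chr21`.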
The test case I sent the other day was just one sample (hg38 + 2 sequences). This one (I put a new link to the data above) contains those, plus another 4 sequences. I'm working on a disk with 1.6T free space.
I don't do any particular awking, but my sequences have unique names:
grep '>' *.fa
HG00514_chr21_0.fa:>HG00514_chr21_0_0
HG00514_chr21_0.fa:>HG00514_chr21_0_1
HG00514_chr21_0.fa:>HG00514_chr21_0_2
HG00514_chr21_1.fa:>HG00514_chr21_1_0
HG00514_chr21_1.fa:>HG00514_chr21_1_1
HG00733_chr21_0.fa:>HG00733_chr21_0_0
HG00733_chr21_1.fa:>HG00733_chr21_1_0
hg38_chr21.fa:>chr21
NA19240_chr21_0.fa:>NA19240_chr21_0_0
NA19240_chr21_1.fa:>NA19240_chr21_1_0
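The same uniqueness check can be scripted rather than read off the grep output by eye. A minimal sketch, assuming plain uncompressed FASTA and taking the record name to be the header text up to the first whitespace (both helper names are hypothetical):

```python
from collections import Counter


def fasta_names(lines):
    """Return record names in order: header text up to the first whitespace."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            names.append(fields[0] if fields else "")
    return names


def duplicated_names(names):
    """Return a sorted list of the names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

Feeding the lines of the concatenated input (here hgsvc_chr21.fa) through `fasta_names` and then `duplicated_names` returns an empty list when every record name is unique.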
@glennhickey I'm not sure that the fasta reader is going to be OK with the sequences named that way. But I can't be sure that this is the problem. I'll see if I can reproduce with a simpler test.