Closed: egoltsman closed this issue 4 years ago
What seqwish command did you use? Could you also share the input sequences? Usually this results from a problem with the naming in the FASTA file.
On Mon, Aug 24, 2020, 20:13 Eugene Goltsman notifications@github.com wrote:
Hi, Moving this ticket here from vgteam/vg#2957 https://github.com/vgteam/vg/issues/2957 To reiterate, I was able to run edyeet on a set of four assemblies, all of chromosome 1 from four individuals of the same species. It produced what seems like a reasonable set of alignments, although the tags in the paf file are different from what I get with minimap2. When I feed this file to seqwish it produces a GFA file that contains segments for each input sequence, but there are no edges. I used the edyeet options you mentioned in the thread: edyeet -t 32 -s 10000 -p 85 -a 90 -n 10 -X foo.fa foo.fa > foo.paf. The paf file is attached below.
Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.zip https://github.com/ekg/seqwish/files/5119521/Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.zip
I mean that a mismatch between the sequence names in the PAF and those in the FASTA could cause this. Another possibility is that the -k setting filters out all the matches in the alignments. Otherwise, it should not be possible, so I'd appreciate the full test case for debugging.
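A quick way to test the naming hypothesis is to compare the names used in the PAF with the FASTA headers. The following is a minimal sketch, not part of seqwish. It assumes a sequence name is the header text up to the first whitespace, and that PAF query and target names sit in columns 1 and 6; the helper names are hypothetical.

```python
from typing import Iterable, Set


def fasta_names(fasta_lines: Iterable[str]) -> Set[str]:
    """Collect sequence names: header text after '>' up to the first whitespace."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def paf_names(paf_lines: Iterable[str]) -> Set[str]:
    """Collect query (column 1) and target (column 6) names from PAF records."""
    names = set()
    for line in paf_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 12:  # a PAF record has 12 mandatory columns
            names.add(cols[0])
            names.add(cols[5])
    return names


def missing_from_fasta(fasta_lines: Iterable[str], paf_lines: Iterable[str]) -> Set[str]:
    """Names that appear in the PAF but have no matching FASTA header."""
    return paf_names(paf_lines) - fasta_names(fasta_lines)
```

With real files this could be called as `missing_from_fasta(open("foo.fa"), open("foo.paf"))`; an empty set means every name in the PAF has a FASTA header.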
There is another possibility: your input sequences may be lowercase. I thought we had added a warning for this case, though.
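The lowercase hypothesis can be checked without running seqwish. This is a small sketch under the same assumption about header parsing as before; the function name is hypothetical.

```python
from typing import Dict, Iterable


def lowercase_counts(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Count lowercase characters per sequence in a FASTA stream."""
    counts: Dict[str, int] = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            counts[name] = 0
        elif name is not None:
            # Only sequence lines that follow a header are counted.
            counts[name] += sum(1 for c in line if c.islower())
    return counts
```

Any sequence with a non-zero count contains soft-masked or otherwise lowercase bases.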
This was the seqwish command:
~/utils/seqwish/bin/seqwish -t 12 --seqs=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa --paf-alns=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.paf --gfa=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.edyeet.seqwish.take2.gfa
I verified that the fasta headers match the names in the paf file and that the sequences are uppercase.
The full data set is too big to attach here - is there a place where I could upload it?
I realized my seqwish install was rather old so I'm rebuilding from the latest commit now. I'll let you know if the same issue persists.
Hmm.. Having trouble building the latest version. Cmake is complaining about the ips4o library.
I'm running:
gcc version 8.3.0 20190222 (Cray Inc.) (GCC)
cmake version 3.10.2
[ 88%] Building CXX object CMakeFiles/seqwish.dir/src/sxs.cpp.o
In file included from /global/homes/e/eugeneg/utils/seqwish/src/main.cpp:7:
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmmultimap.hpp: In member function 'void mmmulti::map<Key, Value>::sort(int)':
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmmultimap.hpp:327:16: error: 'ips4o::parallel' has not been declared
ips4o::parallel::sort((std::pair<Key, Value>*)buffer.data,
^~~~~~~~
In file included from /global/homes/e/eugeneg/utils/seqwish/src/main.cpp:8:
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmiitree.hpp: In member function 'void mmmulti::iitree<S, T>::sort(int)':
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmiitree.hpp:334:16: error: 'ips4o::parallel' has not been declared
ips4o::parallel::sort((Interval*)buffer.data,
^~~~~~~~
Looks like a problem with the submodules. Try a fresh recursive clone.
I actually didn't realize that you can write = in the command line flags and have it still work. I usually use -p, -s, and -g to indicate the inputs and outputs.
But, I think this isn't the problem. Did you run minimap2 with -c? Are you sure that the alignments have CIGAR strings?
There should definitely be a warning in this case to avoid future confusion.
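Whether the alignments carry CIGAR strings can be checked by counting PAF records with a `cg:Z:` tag among the optional fields (column 13 onward). A minimal sketch with a hypothetical function name:

```python
from typing import Iterable, Tuple


def count_cigar_records(paf_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (records with a cg:Z: tag, records without one)."""
    with_cg = 0
    without_cg = 0
    for line in paf_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # skip blank or truncated lines
        if any(tag.startswith("cg:Z:") for tag in cols[12:]):
            with_cg += 1
        else:
            without_cg += 1
    return with_cg, without_cg
```

If the second number is non-zero, some records carry no base-level alignment.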
Ahh, sorry, this alignment was based on edyeet.
Here I suspect that the problem is that older seqwish didn't handle the extended CIGAR that I'm writing from edyeet. Let's get the seqwish repo and build updated, then test again.
Same build errors after doing a fresh clone. Looks like a namespace issue wrt ips4o, but that's an uneducated guess. Just to be on the same page, what cmake and gcc versions are you using in your environment?
I've used anything above gcc v7.5 without issue. Both Debian and Ubuntu, and also some guix builds have worked fine. What versions do you have?
I've tried 8.3 and 9.2 and different cmake versions - same error. It wouldn't be possible to get a static binary for this package, would it? Unfortunately, we can't run docker on our system due to security restrictions.
I would really like to figure out what's going on rather than attempting to provide a static binary. There could be other incompatibilities there.
I suspect your system is missing dependencies that would enable these flags in ips4o:
Usually, the -pthread flag to gcc should define _REENTRANT. But, it may be that this is not the correct flag on your system. You could also try adding -fopenmp to the same line in the CMakeLists.txt. Perhaps that would enable the namespace definition and resolve the problem.
This seems relevant:
https://lists.debian.org/debian-devel/2003/10/msg01538.html
It looks like I have _REENTRANT defined if -pthread is passed:
gcc -x c /dev/null -E -dM -pthread | grep REENT
...but I don't see this flag being included anywhere in the CMakeLists. Or do you mean that I should add it to CMAKE_CXX_FLAGS?
Ok, I did just that and it worked! I will now try building the graph again. Will keep you posted.
Now I'm getting a segmentation fault with this data. I've verified that the fasta is properly formatted, but perhaps you could try it on your end. I posted both the fasta and the .paf file I get from edyeet to our portal: https://portal.nersc.gov/dna/plant/assembly/Bdistachyon/
Again, the commands were: edyeet -t 32 -s 10000 -p 85 -a 90 -n 10 -X Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa > Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.paf
seqwish -t 32 --seqs=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa --paf-alns=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.paf --gfa=graph.gfa
Thanks!
Hi, I am still suspecting my edyeet output is not formatted properly (although I'm using the latest build). The CIGAR string doesn't look right. At first I suspected some gsl version issue, but it doesn't look like this is gsl's domain. Is this alignment entry correctly formatted?
Bd1_1_CHR1:9968681-11301495 1332815 100000 150000 + Bd21_3_CHR1:9803073-11041285 1238213 100672 150471 49230 50000 19 id:f:0.986613 ma:i:49230 mm:i:423 ni:i:344 nd:i:146 ns:i:3 ed:i:916 al:i:50146 se:f:0.0182667 cg:Z:3I3=1I1=5I1=2I1=1I1=3I1=2I1=7I1=1
I36=4I6=1X6=1I54=5X1D65=1X44=1X94=1X113=1X63=1X317=1X1=1X213=1X283=2D81=1X58=1X1=1X27=1X80=1X8=1I1=4I1=1I1=9I1=9I1=3I2=1I4=1I2=2I1=2I6=2I1=2I1=1I1=11I2=6I1=1I2=2I2=8I2=3I1=3I23=2X28=1X87=1X78=1X20=2D42=1I19=1X1=1X31=1X24=10I3=3I1=9I1=1I1=2I1=4I2=2I2=1I1=2I1=1I26=1X25=1X319=1X243=2I92=1X49=1X2=1X10=1X29=1X52=1X24=1X11
=1X3=8D12=1D1=1D21=3D6=1X17=1X56=1X23=1X37=2X105=1X10=1X172=1X88=1X23=1X1=1X27=1X16=1X13=1X6=1X8=1X4=1X7=1X15=1I11=1X7=1X70=1X12=1X33=1X65=8D1154=1X44=1I15=1I16=1X202=1X187=4I1=1I1=10I1=7I1=2I1=7I3=1I1=5I1=1I1=1I1=1I1=4I2=5I1=5I4=1I2=2I2=2I1=2I1=1I1=2I1=2I1=1I3=5I4=7I5=7I2=5I1=2I1=2I1=6I1=1I790=1D51=1D74=1X426=1X169=
1X37=1X82=1X41=1X3=1X54=1X8=1X54=1X1I46=1X41=1X26=1X28=1X18=1X14=1X4=1X17=1X40=1X12=1X112=1X22=1X70=1X57=1D279=1X70=1X15=1X24=1X33=3D35=2D7=1D17=1X9=1X2=2D1=1D1=3D6=1X12=1X88=1X28=1X108=1X65=1X61=1X22=1X8=1X94=1X154=1X28=1X103=1X5=1X103=1X1I7=1D143=1X39=1X47=1X33=2X130=1X55=1X122=1I1=1I27=1X38=1X217=1X7=1X1=1X170=1X2
98=1D132=6D231=1X6=2X250=1X94=1X112=1X474=1X331=1X19=1X111=1X54=1X350=1X372=1X49=1X386=1X275=1X263=1X259=1X2=1X9=1D98=1X117=1X11=1X93=1X25=1X26=1X1=1X8=1X1I5=1I21=1X24=1X98=1X13=1X16D41=1I376=1X200=1X10=1X179=1X10=1X37=1X102=1D1=1D1=1X1=1X19=1X148=1X22=1X69=1X12=1X43=1X31=1X8=1X1=1X7=1X6=1X42=1X61=1I21=1X8=1X84=1X41=
1X38=1X120=2X28=1X28=1X403=1I8=1X21=1X29=1X34=1X33=1X21=2I76=1X126=1X19=1X28=1X45=11I50=1X17=1X9=1X38=1X81=1X11=1X36=1X94=2X18=1X56=1X42=1X49=1X107=1X9=1X14=1X31=1X4=1X6=1X11=1X9=1X27=2D3=1D1=10D1=6D1=3D1=1D1=1D1=1D3=1D1=1D1=5D2=3D1=2D1=1D2=1D1=1D2=4D1=2D1=3D1=1D152=1X173=1X258=1X41=1X141=1X139=1X73=1X3=2I40=1X173=1X
242=1I263=1X1164=1X282=1I155=1X752=1X141=1X59=1X42=1X251=1X60=1X329=1X16=1X110=1X103=1X131=5X3=1X231=1X368=1X98=1X423=1X680=1X198=1X431=1I1=2I201=1X657=1D15=1X530=1X663=1X4=1X20=1X71=1X13=4D1=1D1=9D216=1X130=1X400=2X50=1X141=1X255=1X721=1X535=1X560=1X105=1I1=7I1=4I2=1I1=4I1=8I2=2I1=1I190=1X6=1X187=1X200=1X802=1X14=1X
49=1X125=1X14=1X73=1X16=1X70=1X141=1X13=1X30=1X39=1X145=1X67=1D143=1X25=1X19=2D2=3D2=1D20=1X196=1X15=1X24=1X101=1X144=1X36=1X21=1X1=1X49=1I166=1X54=1X9=1X22=1X67=1X69=1X7=1X82=1X135=1X157=1X54=2X31=1X281=1X161=1X3=1X107=1X4=1X71=1X34=1X31=1X56=1X87=1X2=2I1=4I3=1X7=1X21=1X17=1X115=1X11=1X36=1X308=1X95=1X92=1X47=1X153=
1X1=1X12=1X90=1X10=1X137=2X57=1X46=1X49=1X4=1X43=1X181=1X88=1X350=1X2=1X28=1X108=1X180=3I1=1I2=5I41=1X82=1X3=1X8=1X13=1X6=1X5=1X15=1I2=1X1=1D1=1X7=1X7=1I2=1D15=1X30=1X12=1X3=1X369=1X308=1X363=1X137=1X9=1D6=1X13=1X68=1X16=1X51=1X16=1X250=1X45=1X144=1X69=1X105=1X13=1X185=1X54=1X225=2X386=1X221=1D4=1X2=1X115=1X147=1X109
=1X53=1X14=1X6=1X67=1X17=1X114=1X15=1X16=1X76=1X14=1X14=1X28=1X113=1X12=1X59=1D2=1I16=1X7=1X11=1X17=1X52=1X3=1X27=1X712=1X24=1I1=11I3=1I1=3I1=2I1=2I2=1I1=4I52=1X4=1X36=1X79=1X414=1D49=1X493=1X57=1X93=1X414=1X69=1X362=1X56=1I10=1X1387=1X493=1X386=1X130=1X300=1X645=1X445=
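One way to judge whether such a record is well formed is to check that the bases consumed by the CIGAR agree with the coordinate columns. The summary tags of the record above are at least consistent on the target side: 150471 - 100672 = 49799 = 49230 + 423 + 146 (ma + mm + nd), assuming nd counts deleted bases. The sketch below applies the same check to the CIGAR itself. It assumes the cg tag uses only the operations M, I, D, N, = and X, and it treats the seqwish -k setting as a minimum exact-match length, as suggested earlier in the thread. The function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string into (length, op) pairs, rejecting malformed input."""
    if not re.fullmatch(r"(?:\d+[MIDN=X])+", cigar):
        raise ValueError(f"malformed CIGAR: {cigar[:40]!r}")
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDN=X])", cigar)]


def check_paf_record(line: str, min_match_len: int = 0) -> Dict[str, object]:
    """Compare CIGAR-consumed lengths with the PAF start/end columns."""
    cols = line.rstrip("\n").split("\t")
    q_span = int(cols[3]) - int(cols[2])  # query end - query start
    t_span = int(cols[8]) - int(cols[7])  # target end - target start
    cigar = next((t[5:] for t in cols[12:] if t.startswith("cg:Z:")), None)
    if cigar is None:
        raise ValueError("record has no cg:Z: tag")
    ops = parse_cigar(cigar)
    q_len = sum(n for n, op in ops if op in "MI=X")   # ops that consume query
    t_len = sum(n for n, op in ops if op in "MDN=X")  # ops that consume target
    # Bases in exact-match runs that survive a minimum match length filter.
    kept = sum(n for n, op in ops if op == "=" and n >= min_match_len)
    return {
        "query_ok": q_len == q_span,
        "target_ok": t_len == t_span,
        "exact_bases_kept": kept,
    }
```

The record must be tab-separated and on a single line, so a terminal-wrapped copy like the one above would need to be rejoined first. If `exact_bases_kept` is 0 for every record at the chosen minimum length, no matches would remain to induce edges.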
Well, I went back to a paf file I had gotten from minimap2, and seqwish throws a seg fault on that one too. Here is the command and the stdout.
~/utils/seqwish/bin/seqwish -s Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.fa -t 12 -k 16 -p Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.minimap2.asm5.10kb+.paf -g Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.minimap2.asm5.10kb+.seqwish.GFA -P -V
[seqwish::seqidx] 0.000 indexing sequences
[seqwish::seqidx] 0.048 index built
[seqwish::alignments] 0.048 processing alignments
Segmentation fault
The thing is, I used to be able to run this to completion using an older version, but I didn't save it or the commit id. This one is from commit 39d3aa322 - the latest one.
How much disk space do you have available? This could be a problem with error reporting when you run out.
There is plenty of disk space - terabytes. I've even tried it with a tiny 50kb paf file from minimap2, so it's probably not the space. I just put this small data set on the same portal site: https://portal.nersc.gov/dna/plant/assembly/Bdistachyon/ Could you confirm that you can run this commit fine on it? I'm including the end of strace below, if that helps. Thank you!
openat(AT_FDCWD, ".//sa_56278_0.sdsl", O_RDWR) = 3
openat(AT_FDCWD, ".//sa_56278_0.sdsl", O_RDONLY) = 4
read(4, "3\3\0\0\0\0\0\0\7\3649G\207U?c\23\340\357\302\242\271\246VvT2[Ce\24["..., 8191) = 113
lseek(4, 9, SEEK_SET) = 9
read(4, "\3649G\207U?c\23\340\357\302\242\271\246VvT2[Ce\24[\5c\f\243\24\203P\260\340"..., 1048579) = 104
read(4, "", 1048475) = 0
lseek(3, 0, SEEK_SET) = 0
write(3, "3\3\0\0\0\0\0\0\7", 9) = 9
lseek(3, 112, SEEK_SET) = 112
close(4) = 0
write(3, "\0", 1) = 1
close(3) = 0
unlink(".//bwt_56278_0.sdsl") = 0
unlink(".//sa_56278_0.sdsl") = 0
unlink(".//text_56278_0.sdsl") = 0
openat(AT_FDCWD, "foo.gfa.sqi.seqnames.tmp", O_RDONLY) = 3
read(3, ">Bd1_1_CHR1:9968681-11301495 >Bd"..., 8191) = 116
read(3, "", 8191) = 0
close(3) = 0
unlink("foo.gfa.sqi.seqnames.tmp") = 0
openat(AT_FDCWD, "foo.gfa.sqi", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
writev(3, [{iov_base="seqidx\1\0\0\0\4\0\0\0\0\0\0\0u\0\0\0\0\0\0\0\27\0\0\0\0\0"..., iov_len=1884}, {iov_base="\10\0\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=2048}], 2) = 3932
write(3, "\34\0\0\0\0\0\0\0\7\364\1\365\0\0\0\0\0\16\0\0\0\0\0\0\0\7\326\4\0\0\0\0"..., 920) = 920
close(3) = 0
openat(AT_FDCWD, "foo.gfa.sqq", O_RDWR) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=5085920, ...}) = 0
mmap(NULL, 5085920, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2aaaab084000
madvise(0x2aaaab084000, 5085920, MADV_WILLNEED) = 0
unlink("foo.gfa.sqa") = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
mmap(0x10000000000, 6291456, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 5378048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaab55e000
mmap(NULL, 2101248, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x2aaaaba7f000
mprotect(0x2aaaaba80000, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
clone(child_stack=0x2aaaabc7e9f0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2aaaabc7f9d0, tls=0x2aaaabc7f700, child_tidptr=0x2aaaabc7f9d0) = 56279
openat(AT_FDCWD, "Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.minimap2.asm5.10kb+.paf", O_RDONLY) = 5
lseek(5, 0, SEEK_CUR) = 0
mmap(NULL, 2101248, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x2aaaabc80000
mprotect(0x2aaaabc81000, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
clone(child_stack=0x2aaaabe7f9f0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2aaaabe809d0, tls=0x2aaaabe80700, child_tidptr=0x2aaaabe809d0) = 56280
futex(0x2aaaabe809d0, FUTEX_WAIT, 56280, NULL) = 0
close(5) = 0
nanosleep({tv_sec=0, tv_nsec=1000000}, 0x7fffffff1788) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
mmap(NULL, 3191168, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x2aaab0000000
madvise(0x7fffffff16b8, 3191168, MADV_WILLNEED) = -1 EINVAL (Invalid argument)
munmap(0x2aaab0000000, 3191168) = 0
close(4) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
mmap(NULL, 3191168, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x2aaab0000000
madvise(0x2aaab0000000, 3191168, MADV_WILLNEED) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 6
fstat(6, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
munmap(0x2aaab0000000, 99724) = 0
close(4) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
mmap(NULL, 3191168, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x2aaab030c000
madvise(0x2aaab030c000, 3191168, MADV_WILLNEED) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x2aaab068bff0} ---
+++ killed by SIGSEGV +++
Segmentation fault
OK! So I was able to run successfully after rebuilding my environment and reinstalling. I have no idea what had changed. Sorry for the runaround!
OK, thanks so much for letting me know. Phew!