ekg / seqwish

alignment to variation graph inducer
MIT License

Using alignments from edyeet results in an edge-less graph #62

Closed egoltsman closed 4 years ago

egoltsman commented 4 years ago

Hi, moving this ticket here from https://github.com/vgteam/vg/issues/2957. To reiterate, I was able to run edyeet on a set of four assemblies, all of chromosome 1, from four individuals of the same species. It produced what seems like a reasonable set of alignments, although the tags in the PAF file are different from what I get with minimap2. When I feed this file to seqwish, it produces a GFA file that contains segments for each input sequence, but there are no edges. I used the edyeet options you mentioned in the thread: edyeet -t 32 -s 10000 -p 85 -a 90 -n 10 -X foo.fa foo.fa > foo.paf. The PAF file is attached below.

Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.zip

ekg commented 4 years ago

What seqwish command did you use? Could you also share the input sequences? Usually this results from a problem with the naming in the FASTA file.

ekg commented 4 years ago

I mean that a mismatch between the sequence names in the PAF and those in the FASTA could cause this. Another possibility is that the -k setting filters out all the matches in the alignments. Otherwise, this should not be possible, so I'd appreciate the full test case for debugging.
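
A quick way to test the naming hypothesis is to compare the names used in the PAF with the FASTA headers. The following is a minimal sketch, not part of seqwish: the function names are hypothetical, and it assumes that a sequence name is the header text up to the first whitespace and that PAF columns 1 and 6 hold the query and target names.

```python
def fasta_names(lines):
    """Collect sequence names: header text up to the first whitespace (assumption)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def paf_names(lines):
    """Collect query (column 1) and target (column 6) names from PAF records."""
    names = set()
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 12:  # a PAF record has at least 12 mandatory columns
            names.add(cols[0])
            names.add(cols[5])
    return names


def missing_from_fasta(fasta_lines, paf_lines):
    """Return the sorted names that appear in the PAF but have no FASTA header."""
    return sorted(paf_names(paf_lines) - fasta_names(fasta_lines))
```

Both arguments can be any iterable of lines, so open file handles for the FASTA and the PAF work directly. An empty result means every name in the PAF was found among the FASTA headers.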

ekg commented 4 years ago

There is another possibility: your input sequences may be lower case. I thought we had added a warning for this case, though.
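
The lower-case hypothesis can be checked by counting lower-case characters per sequence. This is a minimal sketch with a hypothetical function name; it makes no claim about how seqwish itself treats soft-masked bases.

```python
def count_lowercase_bases(lines):
    """Return {sequence name: number of lower-case characters in its sequence}."""
    counts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            counts[name] = 0
        elif name is not None:
            # Count every lower-case letter on this sequence line.
            counts[name] += sum(1 for ch in line if ch.islower())
    return counts
```

Any non-zero count points at soft-masked or otherwise lower-case input that may be worth converting to upper case before building the graph.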

egoltsman commented 4 years ago

This was the seqwish command:

~/utils/seqwish/bin/seqwish -t 12 --seqs=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa --paf-alns=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.paf --gfa=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.edyeet.seqwish.take2.gfa

I verified that the FASTA headers match the names in the PAF file and that the sequences are uppercase. The full data set is too big to attach here. Is there a place where I could upload it?

egoltsman commented 4 years ago

I realized my seqwish install was rather old so I'm rebuilding from the latest commit now. I'll let you know if the same issue persists.

egoltsman commented 4 years ago

Hmm, I'm having trouble building the latest version. The build fails with errors about the ips4o library. I'm running gcc version 8.3.0 20190222 (Cray Inc.) and cmake version 3.10.2.

[ 88%] Building CXX object CMakeFiles/seqwish.dir/src/sxs.cpp.o
In file included from /global/homes/e/eugeneg/utils/seqwish/src/main.cpp:7:
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmmultimap.hpp: In member function 'void mmmulti::map<Key, Value>::sort(int)':
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmmultimap.hpp:327:16: error: 'ips4o::parallel' has not been declared
         ips4o::parallel::sort((std::pair<Key, Value>*)buffer.data,
                ^~~~~~~~
In file included from /global/homes/e/eugeneg/utils/seqwish/src/main.cpp:8:
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmiitree.hpp: In member function 'void mmmulti::iitree<S, T>::sort(int)':
/global/homes/e/eugeneg/utils/seqwish/deps/mmmulti/src/mmiitree.hpp:334:16: error: 'ips4o::parallel' has not been declared
         ips4o::parallel::sort((Interval*)buffer.data,
                ^~~~~~~~

ekg commented 4 years ago

Looks like a problem with the submodules. Try a fresh recursive clone.

I actually didn't realize that you can write = in the command line flags and have it still work. I usually use -p, -s, and -g to indicate the inputs and outputs.

But, I think this isn't the problem. Did you run minimap2 with -c? Are you sure that the alignments have CIGAR strings?

There should definitely be a warning in this case to avoid future confusion.
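
Whether the alignments carry CIGAR strings can be checked by looking for the cg:Z: tag among the optional PAF fields. A minimal sketch, assuming standard tab-separated PAF with optional tags starting at column 13; the function name is hypothetical.

```python
def cigar_tag_summary(paf_lines):
    """Return (records_with_cigar, total_records) for an iterable of PAF lines."""
    with_cigar = 0
    total = 0
    for line in paf_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # skip anything that is not a full PAF record
        total += 1
        # Optional tags follow the 12 mandatory columns; cg:Z: carries the CIGAR.
        if any(tag.startswith("cg:Z:") and len(tag) > 5 for tag in cols[12:]):
            with_cigar += 1
    return with_cigar, total
```

If the first number is zero, the aligner was most likely run without base-level alignment output, and there would be no matches from which to induce edges.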

ekg commented 4 years ago

Ahh, sorry, this alignment was based on edyeet.

Here I suspect that the problem is that older seqwish didn't handle the extended CIGAR that I'm writing from edyeet. Let's try to get the seqwish repo and build updated, then test again.

egoltsman commented 4 years ago

Same build errors after doing a fresh clone. Looks like a namespace issue wrt ips4o, but that's an uneducated guess. Just to be on the same page, what cmake and gcc versions are you using in your environment?

ekg commented 4 years ago

I've used anything above gcc v7.5 without issue. Both Debian and Ubuntu, and also some guix builds have worked fine. What versions do you have?

egoltsman commented 4 years ago

I've tried 8.3 and 9.2 and different cmake versions - same error. It wouldn't be possible to get a static binary for this package, would it? Unfortunately, we can't run docker on our system due to security restrictions.

ekg commented 4 years ago

I would really like to figure out what's going on rather than attempting to provide a static binary. There could be other incompatibilities there.

I suspect your system is missing dependencies that would enable these flags in ips4o:

#if defined(_REENTRANT) || defined(_OPENMP)

Usually, the -pthread flag to gcc should define _REENTRANT. But, it may be that this is not the correct flag on your system. You could also try adding -fopenmp to the same line in the CMakeLists.txt. Perhaps that would enable the namespace definition and resolve the problem.

This seems relevant:

https://lists.debian.org/debian-devel/2003/10/msg01538.html

egoltsman commented 4 years ago

It looks like I have _REENTRANT defined if -pthread is passed:

gcc -x c /dev/null -E -dM -pthread | grep REENT

#define _REENTRANT 1

..but I don't see this flag being included anywhere in the CMakeLists. Or do you mean that I should add it to CMAKE_CXX_FLAGS ?

egoltsman commented 4 years ago

Ok, I did just that and it worked! I will now try building the graph again. Will keep you posted.

egoltsman commented 4 years ago

Now I'm getting a segfault with this data. I've verified that the FASTA is properly formatted, but perhaps you could try it on your end. I posted both the FASTA and the .paf file I get from edyeet to our portal: https://portal.nersc.gov/dna/plant/assembly/Bdistachyon/

Again, the commands were:

edyeet -t 32 -s 10000 -p 85 -a 90 -n 10 -X Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa > Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.paf

seqwish -t 32 --seqs=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.fa --paf-alns=Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.paf --gfa=graph.gfa

Thanks!

egoltsman commented 4 years ago

Hi, I still suspect that my edyeet output is not formatted properly (although I'm using the latest build). The CIGAR string doesn't look right. At first I suspected a gsl version issue, but this doesn't look like gsl's domain. Is this alignment entry correctly formatted?

Bd1_1_CHR1:9968681-11301495     1332815 100000  150000  +       Bd21_3_CHR1:9803073-11041285    1238213 100672  150471  49230   50000   19      id:f:0.986613   ma:i:49230      mm:i:423        ni:i:344        nd:i:146        ns:i:3  ed:i:916        al:i:50146      se:f:0.0182667  cg:Z:3I3=1I1=5I1=2I1=1I1=3I1=2I1=7I1=1
I36=4I6=1X6=1I54=5X1D65=1X44=1X94=1X113=1X63=1X317=1X1=1X213=1X283=2D81=1X58=1X1=1X27=1X80=1X8=1I1=4I1=1I1=9I1=9I1=3I2=1I4=1I2=2I1=2I6=2I1=2I1=1I1=11I2=6I1=1I2=2I2=8I2=3I1=3I23=2X28=1X87=1X78=1X20=2D42=1I19=1X1=1X31=1X24=10I3=3I1=9I1=1I1=2I1=4I2=2I2=1I1=2I1=1I26=1X25=1X319=1X243=2I92=1X49=1X2=1X10=1X29=1X52=1X24=1X11
=1X3=8D12=1D1=1D21=3D6=1X17=1X56=1X23=1X37=2X105=1X10=1X172=1X88=1X23=1X1=1X27=1X16=1X13=1X6=1X8=1X4=1X7=1X15=1I11=1X7=1X70=1X12=1X33=1X65=8D1154=1X44=1I15=1I16=1X202=1X187=4I1=1I1=10I1=7I1=2I1=7I3=1I1=5I1=1I1=1I1=1I1=4I2=5I1=5I4=1I2=2I2=2I1=2I1=1I1=2I1=2I1=1I3=5I4=7I5=7I2=5I1=2I1=2I1=6I1=1I790=1D51=1D74=1X426=1X169=
1X37=1X82=1X41=1X3=1X54=1X8=1X54=1X1I46=1X41=1X26=1X28=1X18=1X14=1X4=1X17=1X40=1X12=1X112=1X22=1X70=1X57=1D279=1X70=1X15=1X24=1X33=3D35=2D7=1D17=1X9=1X2=2D1=1D1=3D6=1X12=1X88=1X28=1X108=1X65=1X61=1X22=1X8=1X94=1X154=1X28=1X103=1X5=1X103=1X1I7=1D143=1X39=1X47=1X33=2X130=1X55=1X122=1I1=1I27=1X38=1X217=1X7=1X1=1X170=1X2
98=1D132=6D231=1X6=2X250=1X94=1X112=1X474=1X331=1X19=1X111=1X54=1X350=1X372=1X49=1X386=1X275=1X263=1X259=1X2=1X9=1D98=1X117=1X11=1X93=1X25=1X26=1X1=1X8=1X1I5=1I21=1X24=1X98=1X13=1X16D41=1I376=1X200=1X10=1X179=1X10=1X37=1X102=1D1=1D1=1X1=1X19=1X148=1X22=1X69=1X12=1X43=1X31=1X8=1X1=1X7=1X6=1X42=1X61=1I21=1X8=1X84=1X41=
1X38=1X120=2X28=1X28=1X403=1I8=1X21=1X29=1X34=1X33=1X21=2I76=1X126=1X19=1X28=1X45=11I50=1X17=1X9=1X38=1X81=1X11=1X36=1X94=2X18=1X56=1X42=1X49=1X107=1X9=1X14=1X31=1X4=1X6=1X11=1X9=1X27=2D3=1D1=10D1=6D1=3D1=1D1=1D1=1D3=1D1=1D1=5D2=3D1=2D1=1D2=1D1=1D2=4D1=2D1=3D1=1D152=1X173=1X258=1X41=1X141=1X139=1X73=1X3=2I40=1X173=1X
242=1I263=1X1164=1X282=1I155=1X752=1X141=1X59=1X42=1X251=1X60=1X329=1X16=1X110=1X103=1X131=5X3=1X231=1X368=1X98=1X423=1X680=1X198=1X431=1I1=2I201=1X657=1D15=1X530=1X663=1X4=1X20=1X71=1X13=4D1=1D1=9D216=1X130=1X400=2X50=1X141=1X255=1X721=1X535=1X560=1X105=1I1=7I1=4I2=1I1=4I1=8I2=2I1=1I190=1X6=1X187=1X200=1X802=1X14=1X
49=1X125=1X14=1X73=1X16=1X70=1X141=1X13=1X30=1X39=1X145=1X67=1D143=1X25=1X19=2D2=3D2=1D20=1X196=1X15=1X24=1X101=1X144=1X36=1X21=1X1=1X49=1I166=1X54=1X9=1X22=1X67=1X69=1X7=1X82=1X135=1X157=1X54=2X31=1X281=1X161=1X3=1X107=1X4=1X71=1X34=1X31=1X56=1X87=1X2=2I1=4I3=1X7=1X21=1X17=1X115=1X11=1X36=1X308=1X95=1X92=1X47=1X153=
1X1=1X12=1X90=1X10=1X137=2X57=1X46=1X49=1X4=1X43=1X181=1X88=1X350=1X2=1X28=1X108=1X180=3I1=1I2=5I41=1X82=1X3=1X8=1X13=1X6=1X5=1X15=1I2=1X1=1D1=1X7=1X7=1I2=1D15=1X30=1X12=1X3=1X369=1X308=1X363=1X137=1X9=1D6=1X13=1X68=1X16=1X51=1X16=1X250=1X45=1X144=1X69=1X105=1X13=1X185=1X54=1X225=2X386=1X221=1D4=1X2=1X115=1X147=1X109
=1X53=1X14=1X6=1X67=1X17=1X114=1X15=1X16=1X76=1X14=1X14=1X28=1X113=1X12=1X59=1D2=1I16=1X7=1X11=1X17=1X52=1X3=1X27=1X712=1X24=1I1=11I3=1I1=3I1=2I1=2I2=1I1=4I52=1X4=1X36=1X79=1X414=1D49=1X493=1X57=1X93=1X414=1X69=1X362=1X56=1I10=1X1387=1X493=1X386=1X130=1X300=1X645=1X445=
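
One way to sanity-check a record like this is to parse the cg:Z: CIGAR and compare the lengths it consumes with the query and target spans in the PAF columns. The sketch below is an illustration under assumptions, not seqwish or edyeet code: the function names are hypothetical, clipping operations are not counted toward either span, and a record that was wrapped across lines when pasted has to be rejoined into a single line first. How an aligner accounts for any extra tags is not modelled here.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MI=X")    # operations that advance along the query
TARGET_OPS = set("MD=XN")  # operations that advance along the target


def cigar_lengths(cigar):
    """Return (query_len, target_len) consumed by a CIGAR string.

    Raises ValueError unless the string is a clean run of <count><op> pairs.
    """
    pos = 0
    q_len = t_len = 0
    for m in CIGAR_OP.finditer(cigar):
        if m.start() != pos:  # unparsable characters between operations
            raise ValueError("malformed CIGAR near offset %d" % pos)
        pos = m.end()
        n, op = int(m.group(1)), m.group(2)
        if op in QUERY_OPS:
            q_len += n
        if op in TARGET_OPS:
            t_len += n
    if not cigar or pos != len(cigar):
        raise ValueError("malformed CIGAR near offset %d" % pos)
    return q_len, t_len


def check_paf_record(line):
    """Compare CIGAR-consumed lengths with the PAF spans; None if no cg:Z: tag."""
    cols = line.rstrip("\n").split("\t")
    cigar = next((t[5:] for t in cols[12:] if t.startswith("cg:Z:")), None)
    if cigar is None:
        return None
    q_len, t_len = cigar_lengths(cigar)
    return {
        "query_span": int(cols[3]) - int(cols[2]),
        "query_cigar": q_len,
        "target_span": int(cols[8]) - int(cols[7]),
        "target_cigar": t_len,
    }
```

A ValueError indicates a CIGAR that does not parse at all, for example one that was truncated or split mid-token. A mismatch between a span and the corresponding CIGAR length is worth reporting to the aligner's author rather than treating as proof of a bug.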

egoltsman commented 4 years ago

Well, I went back to a paf file I had gotten from minimap2, and seqwish throws a seg fault on that one too. Here is the command and the stdout.

~/utils/seqwish/bin/seqwish -s Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.fa -t 12 -k 16 -p Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.minimap2.asm5.10kb+.paf -g Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.minimap2.asm5.10kb+.seqwish.GFA -P -V 
[seqwish::seqidx] 0.000 indexing sequences
[seqwish::seqidx] 0.048 index built
[seqwish::alignments] 0.048 processing alignments
Segmentation fault

The thing is, I used to be able to run this to completion using an older version, but I didn't save it or the commit id. This one is from commit 39d3aa322 (https://github.com/ekg/seqwish/commit/39d3aa32252c4739e2a5c5abb19cc581223eab8f), the latest one.

ekg commented 4 years ago

How much disk space do you have available? This could be a problem with error reporting when you run out.
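
Free space can be confirmed from Python's standard library. A minimal sketch with a hypothetical helper name; which directory matters is an assumption, so pass whichever one the run writes its output and temporary files to.

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 2**30
```

Calling free_gib() from the working directory of the seqwish run gives a quick number to compare against the size of the inputs.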

egoltsman commented 4 years ago

There is plenty of disk space: terabytes. I've even tried it with a tiny 50kb PAF file from minimap2, so it's probably not the space. I just put this small data set on the same portal site: https://portal.nersc.gov/dna/plant/assembly/Bdistachyon/ Could you confirm that you can run this commit fine on it? I'm including the end of the strace output below, if that helps. Thank you!


openat(AT_FDCWD, ".//sa_56278_0.sdsl", O_RDWR) = 3
openat(AT_FDCWD, ".//sa_56278_0.sdsl", O_RDONLY) = 4
read(4, "3\3\0\0\0\0\0\0\7\3649G\207U?c\23\340\357\302\242\271\246VvT2[Ce\24["..., 8191) = 113
lseek(4, 9, SEEK_SET)                   = 9
read(4, "\3649G\207U?c\23\340\357\302\242\271\246VvT2[Ce\24[\5c\f\243\24\203P\260\340"..., 1048579) = 104
read(4, "", 1048475)                    = 0
lseek(3, 0, SEEK_SET)                   = 0
write(3, "3\3\0\0\0\0\0\0\7", 9)        = 9
lseek(3, 112, SEEK_SET)                 = 112
close(4)                                = 0
write(3, "\0", 1)                       = 1
close(3)                                = 0
unlink(".//bwt_56278_0.sdsl")           = 0
unlink(".//sa_56278_0.sdsl")            = 0
unlink(".//text_56278_0.sdsl")          = 0
openat(AT_FDCWD, "foo.gfa.sqi.seqnames.tmp", O_RDONLY) = 3
read(3, ">Bd1_1_CHR1:9968681-11301495 >Bd"..., 8191) = 116
read(3, "", 8191)                       = 0
close(3)                                = 0
unlink("foo.gfa.sqi.seqnames.tmp")      = 0
openat(AT_FDCWD, "foo.gfa.sqi", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
writev(3, [{iov_base="seqidx\1\0\0\0\4\0\0\0\0\0\0\0u\0\0\0\0\0\0\0\27\0\0\0\0\0"..., iov_len=1884}, {iov_base="\10\0\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=2048}], 2) = 3932
write(3, "\34\0\0\0\0\0\0\0\7\364\1\365\0\0\0\0\0\16\0\0\0\0\0\0\0\7\326\4\0\0\0\0"..., 920) = 920
close(3)                                = 0
openat(AT_FDCWD, "foo.gfa.sqq", O_RDWR) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=5085920, ...}) = 0
mmap(NULL, 5085920, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2aaaab084000
madvise(0x2aaaab084000, 5085920, MADV_WILLNEED) = 0
unlink("foo.gfa.sqa")                   = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
mmap(0x10000000000, 6291456, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 5378048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaab55e000
mmap(NULL, 2101248, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x2aaaaba7f000
mprotect(0x2aaaaba80000, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
clone(child_stack=0x2aaaabc7e9f0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2aaaabc7f9d0, tls=0x2aaaabc7f700, child_tidptr=0x2aaaabc7f9d0) = 56279
openat(AT_FDCWD, "Ref.Bd1_1.Bd21_3.Bd30_1__CHR1.1mb_cutous.minimap2.asm5.10kb+.paf", O_RDONLY) = 5
lseek(5, 0, SEEK_CUR)                   = 0
mmap(NULL, 2101248, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x2aaaabc80000
mprotect(0x2aaaabc81000, 2097152, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
clone(child_stack=0x2aaaabe7f9f0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2aaaabe809d0, tls=0x2aaaabe80700, child_tidptr=0x2aaaabe809d0) = 56280
futex(0x2aaaabe809d0, FUTEX_WAIT, 56280, NULL) = 0
close(5)                                = 0
nanosleep({tv_sec=0, tv_nsec=1000000}, 0x7fffffff1788) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
mmap(NULL, 3191168, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x2aaab0000000
madvise(0x7fffffff16b8, 3191168, MADV_WILLNEED) = -1 EINVAL (Invalid argument)
munmap(0x2aaab0000000, 3191168)         = 0
close(4)                                = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
mmap(NULL, 3191168, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x2aaab0000000
madvise(0x2aaab0000000, 3191168, MADV_WILLNEED) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 6
fstat(6, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
munmap(0x2aaab0000000, 99724)           = 0
close(4)                                = 0
openat(AT_FDCWD, "foo.gfa.sqa", O_RDWR) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3191168, ...}) = 0
mmap(NULL, 3191168, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) = 0x2aaab030c000
madvise(0x2aaab030c000, 3191168, MADV_WILLNEED) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x2aaab068bff0} ---
+++ killed by SIGSEGV +++
Segmentation fault

egoltsman commented 4 years ago

OK! So I was able to run successfully after rebuilding my environment and reinstalling. I have no idea what had changed. Sorry for the runaround!

ekg commented 4 years ago

OK, thanks so much for letting me know. Phew!