vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

vg is no longer portable #1303

Closed (ekg closed 6 years ago)

ekg commented 6 years ago

As of 620fda3d I am no longer able to build a portable binary from vg.

Is there any recent dependency or change to the build which could have added SSE instructions from the 4.2 set? I'm trying to figure out which instruction it is.

ekg commented 6 years ago

The change occurred since 64a436dc.

ekg commented 6 years ago

Following these instructions: https://superuser.com/questions/885136/how-to-detect-binary-compatibility-with-the-sse4-instruction-set, I was able to dump out a list of the instruction families used in the static binary.

-> % python binary_families.py vg-620fda3d.1
These instruction families were used:
186_Base, 286_Base, 386_Base, 8086_Base, AMD_SSE5, ARM_THUMB, Base, KATMAI_Base, KATMAI_MMX, KATMAI_SSE, NEHALEM_Base, P6_Base, PENT_Base, PENT_MMX, PRESCOTT_SSE3, SANDYBRIDGE_AVX, SSE2, SSE41, SSE42, X64_Base, X64_MMX, X64_SSE, X64_SSE2, X64_SSE41
These instructions could not be categorized:
(bad), addr32, andn, cltd, cltq, cmova, cmovae, cmovb, cmovbe, cmove, cmovg, cmovge, cmovl, cmovle, cmovne, cmovns, cmovs, cqto, cvtsi2sdl, cvtsi2sdq, cvtsi2ssl, cvtsi2ssq, cwtl, data32, decb, decl, divl, divq, es, faddl, fadds, fcompl, fdivl, fdivrl, fdivrs, fildll, fistpll, fldl, flds, fldt, fmull, fmuls, fs, fstl, fstpl, fstpt, fsubl, fsubrl, fsubrs, fsubs, idivl, idivq, ja, jae, jb, jbe, je, jg, jge, jl, jle, jne, jno, jnp, jns, jo, jp, js, leaveq, ljmpq, lock, movabs, movsbl, movsbq, movsbw, movslq, movswl, movswq, movzbl, movzwl, mulb, mulq, negl, negq, nopl, nopw, notl, notq, outsl, pdep, rdrand, rep, repnz, repz, rex, rex.B, rex.R, rex.RX, rex.RXB, rex.W, rex.WB, rex.WR, rex.WRX, rex.WRXB, seta, setae, setb, setbe, sete, setg, setge, setl, setle, setne, setnp, setp, shlx, shrx, tzcnt, vcvtsi2sdq, vextracti128, vinserti128, vpbroadcastb, vpbroadcastd, vpbroadcastq, vperm2i128, vpmaskmovq

Compare this to the last working version from my system:

-> % python binary_families.py vg-64a436dc.1
These instruction families were used:
186_Base, 286_Base, 386_Base, 8086_Base, AMD_SSE5, ARM_THUMB, Base, KATMAI_Base, KATMAI_MMX, KATMAI_SSE, NEHALEM_Base, P6_Base, PENT_Base, PENT_MMX, PRESCOTT_SSE3, SANDYBRIDGE_AVX, SSE2, SSE41, SSE42, X64_Base, X64_MMX, X64_SSE, X64_SSE2, X64_SSE41
These instructions could not be categorized:
(bad), addr32, cltd, cltq, cmova, cmovae, cmovb, cmovbe, cmove, cmovg, cmovge, cmovl, cmovle, cmovne, cmovns, cmovs, cqto, cvtsi2sdl, cvtsi2sdq, cvtsi2ssl, cvtsi2ssq, cwtl, data32, decb, decl, divl, divq, es, faddl, fadds, fcompl, fdivl, fdivrl, fdivrs, fildll, fistpll, fldl, flds, fldt, fmull, fmuls, fs, fstl, fstpl, fstpt, fsubl, fsubrl, fsubrs, fsubs, idivl, idivq, ja, jae, jb, jbe, je, jg, jge, jl, jle, jne, jno, jnp, jns, jo, jp, js, leaveq, ljmpq, lock, movabs, movsbl, movsbq, movsbw, movslq, movswl, movswq, movzbl, movzwl, mulb, mulq, negl, negq, nopl, nopw, notl, notq, outsl, rdrand, rep, repnz, repz, rex, rex.B, rex.R, rex.RXB, rex.W, rex.WB, rex.WR, rex.WRX, rex.WRXB, seta, setae, setb, setbe, sete, setg, setge, setl, setle, setne, setnp, setp, tzcnt, vpbroadcastb

So it isn't necessarily the SSE4.2 set. It surprises me that SSE4.2 is there at all, because this version worked on systems that do not support SSE4.2. The code paths I was running must never have hit those instructions.

These seem to be new though:

vcvtsi2sdq, vextracti128, vinserti128, vpbroadcastd, vpbroadcastq, vperm2i128, vpmaskmovq
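
The comparison above can also be done mechanically rather than by eye. This is a minimal sketch, assuming the two "could not be categorized" lists are available as comma-separated strings; the function names and the shortened sample lists are illustrative and are not part of binary_families.py.

```python
def parse_mnemonics(listing):
    """Split a comma-separated mnemonic listing into a set of names."""
    return {name.strip() for name in listing.split(",") if name.strip()}


def new_mnemonics(old_listing, new_listing):
    """Return the mnemonics that appear in new_listing but not in old_listing."""
    return sorted(parse_mnemonics(new_listing) - parse_mnemonics(old_listing))


# Shortened excerpts of the two lists above (hypothetical sample, not the full output).
old = "outsl, rdrand, rep, setp, tzcnt, vpbroadcastb"
new = "outsl, pdep, rdrand, rep, setp, shlx, shrx, tzcnt, vpbroadcastb, vpbroadcastd"

print(new_mnemonics(old, new))
```

Applied to the two full lists above, the same difference also contains andn, pdep, shlx, shrx and the rex.RX prefix, in addition to the v-prefixed instructions listed here.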

ekg commented 6 years ago

I think they are coming from gcsa2. I checked the static libs using objdump -d:

-> % for s in $(ls lib/*a); do echo $s ; objdump -d $s | grep vpmaskmovq; done
lib/lib3edgeconnected.a
lib/libdivsufsort64.a
lib/libdivsufsort.a
lib/libfml.a
lib/libgbwt.a
lib/libgcsa2.a
 189:   c4 c2 fd 8c 1e          vpmaskmovq (%r14),%ymm0,%ymm3
 18e:   c4 e2 fd 8e 18          vpmaskmovq %ymm3,%ymm0,(%rax)
 1fb:   c4 42 8d 8c 3e          vpmaskmovq (%r14),%ymm14,%ymm15
 200:   c4 62 8d 8e 38          vpmaskmovq %ymm15,%ymm14,(%rax)
 3de:   c4 42 9d 8c 2e          vpmaskmovq (%r14),%ymm12,%ymm13
 3e3:   c4 62 9d 8e 2e          vpmaskmovq %ymm13,%ymm12,(%rsi)
 449:   c4 62 f5 8c 00          vpmaskmovq (%rax),%ymm1,%ymm8
 44e:   c4 62 f5 8e 06          vpmaskmovq %ymm8,%ymm1,(%rsi)
 bf0:   c4 c2 d5 8c 36          vpmaskmovq (%r14),%ymm5,%ymm6
 bf5:   c4 e2 d5 8e 30          vpmaskmovq %ymm6,%ymm5,(%rax)
 bff:   c4 62 cd 8c 10          vpmaskmovq (%rax),%ymm6,%ymm10
 c04:   c4 62 cd 8e 16          vpmaskmovq %ymm10,%ymm6,(%rsi)
...

Of course they're also ending up in libvg.a.
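
The same search can be run for several mnemonics at once. The sketch below assumes objdump is on the PATH and that its `-d` output has the form shown above; the helper names and the SUSPECT set are hypothetical and only illustrate the approach.

```python
import subprocess
from collections import Counter

# A few of the mnemonics that appeared only in the newer binary (illustrative choice).
SUSPECT = {"vextracti128", "vinserti128", "vperm2i128", "vpmaskmovq"}


def count_suspects(disassembly, suspects=SUSPECT):
    """Count how often each suspect mnemonic occurs in `objdump -d` text output."""
    counts = Counter()
    for line in disassembly.splitlines():
        # Hex bytes and operands never equal a mnemonic, so a token match is enough.
        for token in line.split():
            if token in suspects:
                counts[token] += 1
    return counts


def scan_archive(path, suspects=SUSPECT):
    """Disassemble one static library with objdump and count suspect mnemonics."""
    result = subprocess.run(
        ["objdump", "-d", path], capture_output=True, text=True, check=True
    )
    return count_suspects(result.stdout, suspects)
```

Calling `scan_archive("lib/libgcsa2.a")` on each archive in lib/ would give a per-library count instead of the raw grep output.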

JervenBolleman commented 6 years ago

vinserti128 etc. seem to be AVX, not SSE?

I think this is being overridden by the settings in the GCSA2 Makefile. Combined with differences in the CPU and compiler you build with, this means you might need to set -msse4.2 and -mno-avx explicitly.

ekg commented 6 years ago

@adamnovak why did we add -ldl in 3a4e1ba? Not sure this is the problem, I'm just trying to figure out what changes there have been to the Makefile since the last working version.

ekg commented 6 years ago

@JervenBolleman nothing seems to have changed in gcsa2 since the last version I was able to build in a portable way. However, this bit was changed in the vg Makefile in c1f4b85c4:

-       +. ./source_me.sh && cd $(GCSA2_DIR) && cat Makefile | grep -v VERBOSE_STATUS_INFO >Makefile.quiet && $(MAKE) -f Makefile.quiet libgcsa2.a $(FILTER) && mv libgcsa2.a $(CWD)/$(LIB_DIR) && cp -r include/gcsa $(CWD)/$(INC_DIR)/
+       +. ./source_me.sh && cd $(GCSA2_DIR) && cat Makefile | grep -v VERBOSE_STATUS_INFO >Makefile.quiet && AS_INTEGRATED_ASSEMBLER=1 $(MAKE) -f Makefile.quiet libgcsa2.a $(FILTER) && mv libgcsa2.a $(CWD)/$(LIB_DIR) && cp -r include/gcsa $(CWD)/$(INC_DIR)/

Now I'm trying to grok what AS_INTEGRATED_ASSEMBLER=1 does.

jltsiren commented 6 years ago

I guess this is coming from SDSL. They added -march=native to the default compile flags in the summer.

ekg commented 6 years ago

It looks like this is the problem (in sdsl/Make.helper, which gcsa2 is pulling in):

MY_CXX_FLAGS= -std=c++11 -march=native -Wall -Wextra -DNDEBUG $(CODE_COVER)
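
One possible workaround, in the spirit of the `grep -v VERBOSE_STATUS_INFO` filter already used in the vg Makefile, is to strip the flag from the helper file before building. This is only a sketch under the assumption that -march=native is the sole CPU-specific flag; it is not necessarily how the problem was actually fixed, and the function name is made up for illustration.

```python
import re

# Match the flag together with the whitespace in front of it, as a whole token only.
MARCH_NATIVE = re.compile(r"[ \t]+-march=native(?=\s|$)")


def strip_march_native(makefile_text):
    """Return makefile_text with every standalone -march=native flag removed."""
    return MARCH_NATIVE.sub("", makefile_text)


line = "MY_CXX_FLAGS= -std=c++11 -march=native -Wall -Wextra -DNDEBUG $(CODE_COVER)"
print(strip_march_native(line))
```

The other flags are left untouched, so the compiler falls back to its default target architecture.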

edit: Ah, I just caught @jltsiren's comment.

ekg commented 6 years ago

@jltsiren I believe I resolved this here: https://github.com/simongog/sdsl-lite/pull/387

But I didn't catch the Make.helper bit. Could that be the problem?

ekg commented 6 years ago

This makes it seem that my portable builds (on a remote VM) were a fluke due to the architecture of the host, and not something that had to do with the system libraries. So I hadn't really solved this problem.

adamnovak commented 6 years ago

The -ldl is for some functions for inspecting dynamically-linked libraries. I needed it for some stack-tracing code I had to add to debug segfaults on Mac Travis that I couldn't reproduce locally.

I think it's really only needed on OS X the way I have the #ifdef guards set up right now, but it's in the list for both platforms.