Closed freijon closed 1 year ago
Color me surprised to learn that Gentoo of all distros is relying on prebuilt upstream binaries for their packaging.
I expected a comment like this :D It's the solution for lazy people. The main package is built from source of course, but there is a binary package for people who don't want to compile and maintain 200+ Haskell packages just to run pandoc ;)
I'd like to be sure that this is coming from pandoc and not lualatex (which will be called given the command you've used). Can you reproduce this using a simpler command (not producing a PDF)? Also, could you try with this command, but with --verbose, which may give us a better indication of where this is occurring?
Possibly relevant: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/1306
I don't know much about this, but it could be that ghc determines dynamically whether the processor it's running on supports AVX, and then uses these instructions if it does. (I'm guessing our build machine does.) I'm not (yet) seeing any way to tell it not to do this.
I haven't seen this reported before: is that because only fairly old machines don't support AVX at this point?
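For anyone who wants to check whether their own CPU reports AVX, here is a minimal sketch in C of runtime detection and dispatch. It assumes GCC or Clang on an x86 target (it uses the `__builtin_cpu_supports` builtin); the names `cpu_has_avx` and `select_kernel` are made up for illustration and are not part of ghc or pandoc. It only shows the general pattern of choosing a code path at run time, not what ghc or LLVM actually do.

```c
#include <assert.h>

/* Returns 1 if the running CPU reports AVX support, 0 otherwise.
   Assumption: GCC or Clang on x86; on any other target we simply
   report "no AVX" because the builtin is not available there. */
int cpu_has_avx(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                       /* populate CPU feature data */
    return __builtin_cpu_supports("avx") ? 1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical dispatcher: picks the name of the code path to use.
   A real program would return a function pointer instead of a label. */
const char *select_kernel(int has_avx) {
    assert(has_avx == 0 || has_avx == 1);
    return has_avx ? "avx" : "baseline";
}
```

Calling `select_kernel(cpu_has_avx())` on a machine without AVX would pick the baseline path. A binary that was compiled with AVX enabled unconditionally has no such fallback and will die with an illegal instruction on that machine.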
Actually there is a flag for avx (from ghc 9.6 manual):
-mavx
(x86 only) These SIMD instructions are currently not supported by the native code generator. Enabling this flag has no effect and is only present for future extensions. The LLVM backend may use AVX if your processor supports it, but detects this automatically, so no flag is required.
My understanding is that ghc uses the native code generator by default.
I'd like to be sure that this is coming from pandoc and not lualatex (which will be called given the command you've used). Can you reproduce this using a simpler command (not producing a PDF)? Also, could you try with this command, but with --verbose, which may give us a better indication of where this is occurring?
Here are the results:
pandoc --verbose <old arguments>
--> no additional output that would help...
pandoc --verbose in_file.md -o out_file.html
--> same error "illegal hardware instruction"
EDIT: pandoc --help displays the help correctly
OK, that's helpful. Does it matter what is in in_file.md? Can it be just one word, for example?
I just tried it with only "test" in in_file.md - same result
@mpickering as a ghc dev I was hoping you might have insight into this?
As far as I'm aware, ghc's native backend can't emit this instruction. This means it was likely the result of misguided optimization either in a library, ghc's RTS, or through the llvm backend.
For any more insight we would need to know which ghc version/libraries were used to build this release. A likely culprit seems to be the text library, which recently started using SIMD via C bindings for some functionality.
@AndreasPK thanks for commenting here. I don't have the exact list for that build, but I triggered a new release build and made it emit a cabal freeze. These should be roughly the same versions of packages, as the last release was just last week. text is version 2.0.2. Another place to look is the whole new crypton ecosystem, I suppose, since that is new in the last pandoc release; if the problem lies there, it would explain why I haven't gotten other reports like this. (On the other hand, it could just be that people are using pandoc on relatively recent hardware.)
ghc version: ghc 9.6.2, from Docker image glcr.b-data.ch/ghc/ghc-musl:9.6.2
Wrote freeze file: /tmp/cirrus-ci-build/cabal.project.freeze
active-repositories: hackage.haskell.org:merge
constraints: any.Cabal ==3.10.1.0,
any.Cabal-syntax ==3.10.1.0,
any.Diff ==0.4.1,
any.Glob ==0.10.2,
any.HUnit ==1.6.2.0,
any.JuicyPixels ==3.3.8,
JuicyPixels -mmap,
any.OneTuple ==0.4.1.1,
any.Only ==0.1,
any.QuickCheck ==2.14.3,
QuickCheck -old-random +templatehaskell,
any.SHA ==1.6.4.4,
SHA -exe,
any.StateVar ==1.2.2,
any.aeson ==2.1.2.1,
aeson -cffi +ordered-keymap,
any.aeson-pretty ==0.8.10,
aeson-pretty -lib-only,
any.alex ==3.4.0.0,
any.ansi-terminal ==1.0,
ansi-terminal -example,
any.ansi-terminal-types ==0.11.5,
any.appar ==0.1.8,
any.array ==0.5.5.0,
any.asn1-encoding ==0.9.6,
any.asn1-parse ==0.9.5,
any.asn1-types ==0.3.4,
any.assoc ==1.1,
assoc +tagged,
any.async ==2.2.4,
async -bench,
any.attoparsec ==0.14.4,
attoparsec -developer,
any.attoparsec-aeson ==2.1.0.0,
any.attoparsec-iso8601 ==1.1.0.0,
any.auto-update ==0.1.6,
any.base ==4.18.0.0,
any.base-compat ==0.13.0,
any.base-compat-batteries ==0.13.0,
any.base-orphans ==0.9.0,
any.base-unicode-symbols ==0.2.4.2,
base-unicode-symbols +base-4-8 -old-base,
any.base16-bytestring ==1.0.2.0,
any.base64 ==0.4.2.4,
any.base64-bytestring ==1.2.1.0,
any.basement ==0.0.16,
any.bifunctors ==5.6.1,
bifunctors +tagged,
any.binary ==0.8.9.1,
any.bitvec ==1.1.4.0,
bitvec -libgmp,
any.blaze-builder ==0.4.2.2,
any.blaze-html ==0.9.1.2,
any.blaze-markup ==0.8.2.8,
any.boring ==0.2.1,
boring +tagged,
any.bsb-http-chunked ==0.0.0.4,
any.byteorder ==1.0.4,
any.bytestring ==0.11.4.0,
any.cabal-doctest ==1.0.9,
any.call-stack ==0.4.0,
any.case-insensitive ==1.2.1.0,
any.cassava ==0.5.3.0,
cassava -bytestring--lt-0_10_4,
any.cereal ==0.5.8.3,
cereal -bytestring-builder,
any.citeproc ==0.8.1,
citeproc -executable -icu,
any.cmdargs ==0.10.22,
cmdargs +quotation -testprog,
any.colour ==2.3.6,
any.commonmark ==0.2.3,
any.commonmark-extensions ==0.2.3.4,
any.commonmark-pandoc ==0.2.1.3,
any.comonad ==5.0.8,
comonad +containers +distributive +indexed-traversable,
any.conduit ==1.3.5,
any.conduit-extra ==1.3.6,
any.constraints ==0.13.4,
any.containers ==0.6.7,
any.contravariant ==1.5.5,
contravariant +semigroups +statevar +tagged,
any.cookie ==0.4.6,
any.crypton ==0.33,
crypton -check_alignment +integer-gmp -old_toolchain_inliner +support_aesni +support_deepseq +support_pclmuldq +support_rdrand -support_sse +use_target_attributes,
any.crypton-connection ==0.3.1,
any.crypton-x509 ==1.7.6,
any.crypton-x509-store ==1.6.9,
any.crypton-x509-system ==1.6.7,
any.crypton-x509-validation ==1.6.12,
any.cryptonite ==0.30,
cryptonite -check_alignment +integer-gmp -old_toolchain_inliner +support_aesni +support_deepseq -support_pclmuldq +support_rdrand -support_sse +use_target_attributes,
any.data-default ==0.7.1.1,
any.data-default-class ==0.1.2.0,
any.data-default-instances-containers ==0.0.1,
any.data-default-instances-dlist ==0.0.1,
any.data-default-instances-old-locale ==0.0.1,
any.data-fix ==0.3.2,
any.dec ==0.0.5,
any.deepseq ==1.4.8.1,
any.digest ==0.0.1.3,
digest -bytestring-in-base,
any.digits ==0.3.1,
any.directory ==1.3.8.1,
any.distributive ==0.6.2.1,
distributive +semigroups +tagged,
any.dlist ==1.0,
dlist -werror,
any.doclayout ==0.4.0.1,
any.doctemplates ==0.11,
any.easy-file ==0.2.5,
any.emojis ==0.1.2,
any.exceptions ==0.10.7,
any.fast-logger ==3.2.2,
any.file-embed ==0.0.15.0,
any.filepath ==1.4.100.1,
any.generically ==0.1.1,
any.ghc-bignum ==1.3,
any.ghc-boot-th ==9.6.2,
any.ghc-prim ==0.10.0,
any.gridtables ==0.1.0.0,
any.haddock-library ==1.11.0,
any.happy ==1.20.1.1,
any.hashable ==1.4.2.0,
hashable +integer-gmp -random-initial-seed,
any.haskell-lexer ==1.1.1,
any.hourglass ==0.2.12,
any.hsc2hs ==0.68.9,
hsc2hs -in-ghc-tree,
any.hslua ==2.3.0,
any.hslua-aeson ==2.3.0.1,
any.hslua-classes ==2.3.0,
any.hslua-cli ==1.4.1,
hslua-cli -executable,
any.hslua-core ==2.3.1,
any.hslua-list ==1.1.1,
any.hslua-marshalling ==2.3.0,
any.hslua-module-doclayout ==1.1.0,
any.hslua-module-path ==1.1.0,
any.hslua-module-system ==1.1.0.1,
any.hslua-module-text ==1.1.0.1,
any.hslua-module-version ==1.1.0,
any.hslua-module-zip ==1.1.0,
any.hslua-objectorientation ==2.3.0,
any.hslua-packaging ==2.3.0,
any.hslua-repl ==0.1.1,
hslua-repl -executable,
any.hslua-typing ==0.1.0,
any.http-api-data ==0.5.1,
http-api-data -use-text-show,
any.http-client ==0.7.13.1,
http-client +network-uri,
any.http-client-tls ==0.3.6.2,
any.http-date ==0.0.11,
any.http-media ==0.8.0.0,
any.http-types ==0.12.3,
any.http2 ==4.1.4,
http2 -devel -h2spec,
any.indexed-traversable ==0.1.2.1,
any.indexed-traversable-instances ==0.1.1.2,
any.integer-gmp ==1.1,
any.integer-logarithms ==1.0.3.1,
integer-logarithms -check-bounds +integer-gmp,
any.iproute ==1.7.12,
any.ipynb ==0.2,
any.isocline ==1.0.9,
any.jira-wiki-markup ==1.5.1,
any.libyaml ==0.1.2,
libyaml -no-unicode -system-libyaml,
any.lpeg ==1.0.4,
lpeg -rely-on-shared-lpeg-library,
any.lua ==2.3.1,
lua +allow-unsafe-gc -apicheck -cross-compile +export-dynamic -lua_32bits -pkg-config -system-lua,
any.lua-arbitrary ==1.0.1.1,
any.memory ==0.18.0,
memory +support_bytestring +support_deepseq,
any.mime-types ==0.1.1.0,
any.mmorph ==1.2.0,
any.monad-control ==1.0.3.1,
any.mono-traversable ==1.0.15.3,
any.mtl ==2.3.1,
any.network ==3.1.4.0,
network -devel,
any.network-byte-order ==0.1.6,
any.network-uri ==2.6.4.2,
any.old-locale ==1.0.0.7,
any.old-time ==1.1.0.3,
any.optparse-applicative ==0.18.1.0,
optparse-applicative +process,
any.ordered-containers ==0.2.3,
pandoc +embed_data_files,
pandoc-cli +lua -nightly +server,
any.pandoc-lua-marshal ==0.2.2,
any.pandoc-types ==1.23.0.1,
any.parsec ==3.1.16.1,
any.pem ==0.2.4,
any.pretty ==1.1.3.6,
any.pretty-show ==1.10,
any.prettyprinter ==1.7.1,
prettyprinter -buildreadme +text,
any.prettyprinter-ansi-terminal ==1.1.3,
any.primitive ==0.8.0.0,
any.process ==1.6.17.0,
any.psqueues ==0.2.7.3,
any.random ==1.2.1.1,
any.recv ==0.1.0,
any.regex-base ==0.94.0.2,
any.regex-tdfa ==1.3.2.1,
regex-tdfa -force-o2,
any.resourcet ==1.3.0,
any.rts ==1.0.2,
any.safe ==0.3.19,
any.safe-exceptions ==0.1.7.4,
any.scientific ==0.3.7.0,
scientific -bytestring-builder -integer-simple,
any.semialign ==1.3,
semialign +semigroupoids,
any.semigroupoids ==6.0.0.1,
semigroupoids +comonad +containers +contravariant +distributive +tagged +unordered-containers,
any.servant ==0.20,
any.servant-server ==0.20,
any.simple-sendfile ==0.2.32,
simple-sendfile +allow-bsd -fallback,
any.singleton-bool ==0.1.7,
any.skylighting ==0.13.4,
skylighting -executable,
any.skylighting-core ==0.13.4,
skylighting-core -executable,
any.skylighting-format-ansi ==0.1,
any.skylighting-format-blaze-html ==0.1.1,
any.skylighting-format-context ==0.1.0.2,
any.skylighting-format-latex ==0.1,
any.socks ==0.6.1,
any.some ==1.0.5,
some +newtype-unsafe,
any.sop-core ==0.5.0.2,
any.split ==0.2.3.5,
any.splitmix ==0.1.0.4,
splitmix -optimised-mixer,
any.stm ==2.5.1.0,
any.streaming-commons ==0.2.2.6,
streaming-commons -use-bytestring-builder,
any.strict ==0.5,
any.string-conversions ==0.4.0.1,
any.syb ==0.7.2.3,
any.tagged ==0.8.7,
tagged +deepseq +transformers,
any.tagsoup ==0.14.8,
any.tasty ==1.4.3,
tasty +unix,
any.tasty-bench ==0.3.4,
tasty-bench -debug +tasty,
any.tasty-golden ==2.3.5,
tasty-golden -build-example,
any.tasty-hunit ==0.10.0.3,
any.tasty-lua ==1.1.0,
any.tasty-quickcheck ==0.10.2,
any.template-haskell ==2.20.0.0,
any.temporary ==1.3,
any.texmath ==0.12.8,
texmath -executable -server,
any.text ==2.0.2,
any.text-conversions ==0.3.1.1,
any.text-short ==0.1.5,
text-short -asserts,
any.th-abstraction ==0.5.0.0,
any.th-compat ==0.1.4,
any.th-lift ==0.8.3,
any.th-lift-instances ==0.1.20,
any.these ==1.2,
any.time ==1.12.2,
any.time-compat ==1.9.6.1,
time-compat -old-locale,
any.time-manager ==0.0.0,
any.tls ==1.7.0,
tls +compat -hans +network,
any.toml-parser ==1.2.0.0,
any.transformers ==0.6.1.0,
any.transformers-base ==0.4.6,
transformers-base +orphaninstances,
any.transformers-compat ==0.7.2,
transformers-compat -five +five-three -four +generic-deriving +mtl -three -two,
any.type-equality ==1,
any.typed-process ==0.2.11.0,
any.typst ==0.3.0.0,
typst -executable,
any.typst-symbols ==0.1.2,
any.unicode-collation ==0.1.3.4,
unicode-collation -doctests -executable,
any.unicode-data ==0.4.0.1,
unicode-data -ucd2haskell,
any.unicode-transforms ==0.4.0.1,
unicode-transforms -bench-show -dev -has-icu -has-llvm -use-gauge,
any.uniplate ==1.6.13,
any.unix ==2.8.1.0,
any.unix-compat ==0.7,
unix-compat -old-time,
any.unix-time ==0.4.10,
any.unliftio ==0.2.25.0,
any.unliftio-core ==0.2.1.0,
any.unordered-containers ==0.2.19.1,
unordered-containers -debug,
any.utf8-string ==1.0.2,
any.uuid-types ==1.0.5,
any.vault ==0.3.1.5,
vault +useghc,
any.vector ==0.13.0.0,
vector +boundschecks -internalchecks -unsafechecks -wall,
any.vector-algorithms ==0.9.0.1,
vector-algorithms +bench +boundschecks -internalchecks -llvm +properties -unsafechecks,
any.vector-stream ==0.1.0.0,
any.wai ==3.2.3,
any.wai-app-static ==3.1.7.4,
wai-app-static +cryptonite -print,
any.wai-cors ==0.2.7,
any.wai-extra ==3.1.13.0,
wai-extra -build-example,
any.wai-logger ==2.4.0,
any.warp ==3.3.28,
warp +allow-sendfilefd -network-bytestring -warp-debug +x509,
any.witherable ==0.4.2,
any.word8 ==0.1.3,
any.xml ==1.3.14,
any.xml-conduit ==1.9.1.3,
any.xml-types ==0.3.8,
any.yaml ==0.11.11.2,
yaml +no-examples +no-exe,
any.zip-archive ==0.4.3,
zip-archive -executable,
any.zlib ==0.6.3.0,
Seems you depend on text >= 2.0, which comes with the new SIMD code.
One "easy" way to check if it's text would be to disable SIMD for text in a build using the simdutf cabal flag and see if the error still persists.
OK, I think I've built a version using the release build script with a constraint that forces text to use -simdutf.
@freijon could you try downloading the build artifact from here and see if you still get the error on your system?
https://cirrus-ci.com/task/4511237447352320
I gave it a try, but unfortunately I still get the same error. I also tried --version and noticed that pandoc outputs some text and then fails:
/tmp/pandoc/pandoc-3.1.5/bin/pandoc --version --verbose
pandoc 3.1.5
Features: +server +lua
[1] 3102 illegal hardware instruction /tmp/pandoc/pandoc-3.1.5/bin/pandoc --version --verbose
Thank you for your patience and your efforts so far, I appreciate it!
OK, that is helpful information. It suggests that the culprit is not +simdutf in text. @AndreasPK any other ideas?
Actually I think this is a good clue, that --version emits those lines then stops.
versionInfo :: IO ()
versionInfo = do
  progname <- getProgName
  defaultDatadir <- defaultUserDataDir
  scriptingEngine <- getEngine
  putStr $ unlines
    [ progname ++ " " ++ showVersion pandocVersion ++ versionSuffix
    , flagSettings
    , "Scripting engine: " ++ T.unpack (engineName scriptingEngine)
    , "User data directory: " ++ defaultDatadir
    , copyrightMessage
    ]
  exitSuccess
That suggests that the error occurs in the "Scripting engine" part (so, getEngine).
That may implicate the Lua subsystem, which obviously has pieces in C. Maybe the C is being compiled with these optimizations; we just need to figure out how to turn that off.
To test this hypothesis I'll try making a build without lua support, which you can try.
OK, the following build disables both the server and the lua flags (as well as simdutf for text):
https://cirrus-ci.com/task/4556227045228544
@freijon It will be interesting to see if the problem can be reproduced with this binary.
Thanks!
pandoc --version now works! I see the complete version info. Some progress!
Unfortunately, converting a .md to .html still fails with a SIGILL
Does your .md have YAML metadata? I ask because the yaml library embeds a C library. Do you still get the problem when converting a minimal md file (one word)?
My test .md file indeed had some special things like a bullet list and headings. I did another test with only one word inside. I still get a SIGILL.
Some notes:
We switched to ghc-musl 9.6.2 on June 26 (3.1.5 was built with this). And to ghc-musl 9.4.5 on April 20 (3.1.3 and 3.1.4 were built with this).
I'm pinging @benz0li who maintains the ghc-musl images and might know something else that could be relevant to this issue.
We switched to the crypton ecosystem for the 3.1.4 build (but this doesn't affect 3.1.3).
I'll note that both this and the related Windows issue point to ghc 9.4 as a possible culprit:
I guess there is an easy way to test this hypothesis. I can do a linux build using ghc 9.2, but otherwise the same as the last release.
Update: actually, it looks like ghc-musl-9.4.4 was used for release pandoc 3.1.2, and we switched to 9.4.5 for 3.1.3.
Actually there is a flag for avx (from ghc 9.6 manual):
-mavx
(x86 only) These SIMD instructions are currently not supported by the native code generator. Enabling this flag has no effect and is only present for future extensions. The LLVM backend may use AVX if your processor supports it, but detects this automatically, so no flag is required.
My understanding is that ghc uses the native code generator by default.
ℹ️ glcr.b-data.ch/ghc/ghc-musl uses the LLVM backend.
I haven't seen this reported before: is that because only fairly old machines don't support AVX at this point?
Yes. Advanced Vector Extensions (AVX) were introduced 12 years ago.
ghc 9.4.5 bumps text to 2.0.2 in core libraries.
ℹ️ glcr.b-data.ch/ghc/ghc-musl uses the LLVM backend.
Aha! That is something I didn't know. OK, so is there a way to prevent the llvm backend from using avx? And what is the reason for using the llvm backend in ghc-musl?
For testing purposes, here is a build of 3.1.5 that uses ghc-musl-9.4.4: https://cirrus-ci.com/task/5225158336577536
OK, so is there a way to prevent the llvm backend from using avx?
I don't know.
And what is the reason for using the llvm backend in ghc-musl?
I try to build GHC (almost) the same way as the official Alpine Linux package.
ℹ️ https://gitlab.haskell.org/ghc/ghc/-/issues/23482#note_503004
For testing purposes, here is a build of 3.1.5 that uses ghc-musl-9.4.4: https://cirrus-ci.com/task/5225158336577536
I tested this new binary and it behaves like the "normal" binary:
pandoc/pandoc-3.1.5/bin/pandoc --version
pandoc 3.1.5
Features: +server +lua
[1] 12804 illegal hardware instruction pandoc/pandoc-3.1.5/bin/pandoc --version
@benz0li
ℹ️ glcr.b-data.ch/ghc/ghc-musl uses the LLVM backend.
Do you mean that this version of ghc was compiled using the llvm backend? (That shouldn't affect its behavior when run, should it?) Or that, when this ghc is used, it defaults to using the llvm backend rather than the native code generator?
We switched from ghc 9.2 to 9.4 before the 3.0 release. It would tell us something, then, if 3.x versions had this problem but 2.x versions did not.
We started building the linux binaries on cirrus (instead of GH actions) before the 3.1.2 release. So if there were a difference between 3.1.1 and 3.1.2, that would also tell us something.
Do you mean that this version of ghc was compiled using the llvm backend?
Yes.
(That shouldn't affect its behavior when run, should it?) Or that, when this ghc is used, it defaults to using the llvm backend rather than the native code generator?
(No.) No, it is enabled via the -fllvm flag.
ℹ️ That is my understanding from reading the manual and the Opinion piece on GHC backends.
@AndreasPK Please confirm.
if there were a difference between 3.1.1 and 3.1.2, that would also tell us something.
No difference that I can notice between 3.1.1 and 3.1.2
if 3.x versions had this problem but 2.x versions did not.
Success! With the 2.19.2 binary everything works, even complex translations to PDF and LaTeX preamble!
I'm going to try to make a new version of 3.1.5 that uses ghc 9.2 and let's see if that changes anything.
I'm going to try to make a new version of 3.1.5 that uses ghc 9.2 and let's see if that changes anything.
@jgm If you are using tag 9.2, this will use GHC version 9.2.8 (source released: 2023-05-26; image built: 2023-05-27).
ℹ️ Pandoc v2.19.2 was built using image glcr.b-data.ch/ghc/ghc-musl:9.2.3 (source released: 2022-05-27, image re-built: 2022-07-29).
This one is built with ghc 9.2.5 (before I saw your message): https://cirrus-ci.com/task/4563975971536896
Note: in addition to using a different ghc version, this uses a different text version. The version of text that comes bundled with ghc is < 2 in ghc 9.2 and > 2 in ghc 9.4. So, if this version does not cause the problem, that could point to either something in ghc 9.4.4 or something in text 2. If this build is a success, I can try another build with ghc 9.2 and text > 2.
For completeness, here's a version built with ghc 9.2.5 and text 2.0.2: https://cirrus-ci.com/task/5450494399741952
I can confirm that the issue seems to be resolved with both binaries
@AndreasPK this might be of interest. We only get the problem when compiling with ghc >= 9.4. With 9.2, it goes away. Same compiled code, using mostly the same dependent library versions. Here are the differences I noted:
--- libs-925 2023-07-17 22:19:24.000000000 -0700
+++ libs-944 2023-07-17 22:19:14.000000000 -0700
@@ -60,7 +60,6 @@
- - data-array-byte-0.1.0.1 (lib) (requires download & build)
- - digest-0.0.1.7 (lib) (requires download & build)
+ - digest-0.0.1.3 (lib) (requires download & build)
- - toml-parser-1.3.0.0 (lib) (requires download & build)
+ - toml-parser-1.2.1.0 (lib) (requires download & build)
Neither the toml-parser nor digest would be used in the basic commands that cause the error.
I can confirm that the issue seems to be resolved with both binaries
@jgm Thus, to maintain compatibility with older machines, pandoc should be built with glcr.b-data.ch/ghc/ghc-musl:9.2.x.
Yes, I can make that change for now, til we diagnose this properly.
We only get the problem when compiling with ghc >= 9.4. With 9.2, it goes away.
I migrated from the make-based to the Hadrian build system with GHC v9.4.1.
ℹ️ I am still building GHC v9.2.x with the make-based build system, though.
@AndreasPK Do I somehow misconfigure the Hadrian build?
Cross reference: https://www.haskell.org/ghc/blog/20220805-make-to-hadrian.html
(I don't think this is causing the issue. Only mentioning it for the sake of completeness.)
@AndreasPK another data point: There was a similar issue on Windows (#8955). I switched from ghc 9.4 to ghc 9.2 and the problem went away. The build with 9.4 was using the binary downloaded by stack, and the build with 9.2 was using the binary from ghcup. So I don't think this has anything to do with the specific way in which ghc-musl was built.
@freijon Does pandoc 3.1.6 work as expected on your old machine?
@AndreasPK Thank you for further insights from your side on
Could anyone who can reproduce this try to run pandoc under gdb to get a backtrace?
Alternatively, if someone can give me step-by-step instructions that allow me to reproduce this, I might be able to do so myself, depending on the requirements.
I downloaded the release in question and I can see the instruction in it (although my machine does support it). However it seems the release is naturally stripped of all symbols so that wasn't as informative as I had hoped.
I built pandoc myself and just grepped for the instruction in the assembly.
This seems to come from the function _hs_bytestring_long_long_uint_hex, which is part of bytestring.
The function has been there "forever" and doesn't explicitly use SIMD. Rather, it seems auto-vectorization triggers:
// unsigned long ints (64 bit words)
char* _hs_bytestring_long_long_uint_hex (long long unsigned int x, char* buf) {
    // write hex representation in reverse order
    char c, *ptr = buf, *next_free;
    do {
        *ptr++ = digits[x & 0xf];
        x >>= 4;
    } while ( x );
    // invert written digits
    next_free = ptr--;
    while(buf < ptr) {
        c = *ptr;
        *ptr-- = *buf;
        *buf++ = c;
    }
    return next_free;
};
So it comes down to whatever flags the version of bytestring pandoc is linked against has been built with.
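To experiment with this outside of a full pandoc build, the quoted routine can be lifted into a stand-alone file. The sketch below is an adaptation, not the bytestring source: the names `hex_u64` and `hex_digits` are hypothetical, and the lookup table is an assumption about what `digits` refers to in the original. Compiling it with GCC or Clang at a high optimization level, once with `-mavx` and once with `-mno-avx`, and comparing the generated assembly is one way to see whether the byte-reversal loop is the part that gets vectorized.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed lookup table; the quoted function refers to a `digits`
   array that is defined elsewhere in the original source file. */
static const char hex_digits[] = "0123456789abcdef";

/* Writes the lowercase hex representation of x into buf (no NUL
   terminator) and returns a pointer just past the last digit.
   buf must have room for at least 16 characters. */
char *hex_u64(unsigned long long x, char *buf) {
    assert(buf != NULL);
    char *ptr = buf;
    char *next_free;

    /* Emit digits least-significant first. */
    do {
        *ptr++ = hex_digits[x & 0xf];
        x >>= 4;
    } while (x);

    /* Reverse the digits in place. A simple loop like this one is a
       candidate for compiler auto-vectorization. */
    next_free = ptr--;
    while (buf < ptr) {
        char c = *ptr;
        *ptr-- = *buf;
        *buf++ = c;
    }
    return next_free;
}
```

The function contains no SIMD intrinsics, so any vector instructions in the output come from the compiler and the flags it was given, which is the point made above.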
https://gitlab.haskell.org/ghc/ghc/-/issues/23718
~I can confirm it's an upstream issue. The libraries shipping with ghc seem to have avx enabled.~
Edit: At the very least there are avx instructions in the binary which, on my machine, get executed. However, I also have an avx cpu and there seem to be runtime checks. So that's not necessarily wrong.
Downstream bug (Gentoo): https://bugs.gentoo.org/910183 - Gentoo uses the binary provided in the released tar.gz
I installed pandoc on my VM. When I use the following command, I get the following error:
Command:
pandoc --pdf-engine=lualatex -H <preamble-file.tex> <input-file.md> -o <output-file.pdf>
Error:
Here is some additional information:
Output of resolve-march-native:
Versions tried:
After some initial debugging with gdb, I found:
This indicates that the binary appears to be using AVX, which isn't available on all 64-bit x86 CPUs.