google-research / dex-lang

Research language for array processing in the Haskell/ML family
BSD 3-Clause "New" or "Revised" License

Error compiling at cryptohash #661

Closed tjpalmer closed 2 years ago

tjpalmer commented 2 years ago

Hello! I get errors building dex when it gets to cryptohash. There are more lines of error, but they're somewhat similar to this:

cryptohash           > /tmp/stack-aed72882310515c2/cryptohash-0.11.9/cbits/sha512.c:235:29: error:
cryptohash           >      warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 4 [-Wformat-overflow=]
cryptohash           >       235 |   i = sprintf(buf, "SHA-512/%d", t);
cryptohash           >           |                             ^~
cryptohash           >     |                                  
cryptohash           > 235 |                 i = sprintf(buf, "SHA-512/%d", t);
cryptohash           >     |                             ^    
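
For reference, the numbers in that diagnostic can be reproduced with a little arithmetic. This is only a sketch: it assumes a 32-bit signed int (the usual case on x86_64 Linux) and assumes the write starts at the beginning of the destination buffer. The buffer size it derives is inferred from the warning text, not read from the cryptohash source, and the helper name fits is made up for illustration.

```python
# Width bounds of C's "%d" for a 32-bit signed int (assumed; true on x86_64 Linux).
INT_MIN, INT_MAX = -2**31, 2**31 - 1

prefix = "SHA-512/"   # literal part of the format string shown in the log
region_left = 4       # "region of size 4" reported by GCC

# Shortest and longest decimal renderings an int can have.
widths = [len(str(v)) for v in (0, INT_MIN, INT_MAX)]
min_digits = min(widths)   # "0"
max_digits = max(widths)   # "-2147483648"

# Destination size implied by the message (assumption: write starts at offset 0).
implied_buf_size = len(prefix) + region_left

# Size that would hold any int value plus the terminating NUL byte.
worst_case_size = len(prefix) + max_digits + 1


def fits(t: int) -> bool:
    """True if the digits of t plus the NUL fit in the remaining region."""
    return len(str(t)) + 1 <= region_left


print(min_digits, max_digits, implied_buf_size, worst_case_size)
```

Under these assumptions, any value with at most three characters fits, so the write is only unsafe if t can be 1000 or larger, or negative with three or more digits. Whether cryptohash restricts t that way is not shown in this log, so the compiler warns about the worst case.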

If I look at the cryptohash repo, it says it's been replaced by cryptonite, so I doubt either project will help much. And maybe this is a transitive dependency? I'm having trouble figuring out who in the dependency chain is choosing cryptohash.

And as an aside, the lines in cryptohash source and cryptonite source are fairly similar, so I suspect the issue would persist there.

I'm also unsure which compiler it's trying to use, but here are my system defaults:

$ gcc --version
gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
...
$ clang --version
clang version 10.0.0 (git@github.com:apple/llvm-project.git cf0195a2f45d22e5e4487152f78fc077e5100dae)
Target: x86_64-unknown-linux-gnu
Thread model: posix
...

That clang comes from a Swift toolchain build on my path. I'll look into that, but I doubt it's the cause.

Anyway, I'm a beginner in Haskell, so I'm not sure how to dig out the details of what's going on. The upshot is that I can't build dex right now. Are there any workarounds, or ways to dig deeper into or change the compiler warning/error settings? And do you know with whom I could file an issue to get them onto cryptonite, so I can ask whether the issue still persists there?

Thanks!

apaszke commented 2 years ago

Interesting, I haven't seen this. I'm not sure why this warning is causing a compilation failure. Perhaps cryptohash always compiles with -Werror? In any case, a workaround until this is fixed upstream might be to downgrade to an older version of gcc that doesn't have this particular diagnostic. The one you have is very recent, which is probably why this isn't a common problem yet.

tjpalmer commented 2 years ago

Thanks for the recommendation. I removed the custom Swift build from the path, among other changes. Now the dex build can't find ld. So I tried building a C project, which works. Then I tried building a different Haskell project with cabal build. It also failed to find ld, so I nuked ghcup and reinstalled it. The other Haskell project now builds, but running make for Dex still fails:

llvm-hs         > <no location info>: error:
llvm-hs         >     Warning: Couldn't figure out linker information!
llvm-hs         >              Make sure you're using GNU ld, GNU gold or the built in OS X linker, etc.
llvm-hs         > collect2: fatal error: cannot find ‘ld’
llvm-hs         > compilation terminated.
llvm-hs         > `gcc' failed in phase `Linker'. (Exit code: 1)

Same errors for th-utilities and mono-traversable.

So now I've detoured from the original report, but any ideas here?

apaszke commented 2 years ago

That's... really weird. Does which ld work for you?

tjpalmer commented 2 years ago

$ which ld
/usr/bin/ld

I'm going to try building in a controlled container and see what happens.

tjpalmer commented 2 years ago

I tried building in a container based on docker.io/nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04, which seems to have gcc 9. It hits the cryptohash error.

tjpalmer commented 2 years ago

What version of gcc do you have?

And here's the failing Containerfile for reference.

tjpalmer commented 2 years ago

Basing the image on Ubuntu 18.04, which has gcc 7.5, seems to build successfully. One or more tests fail, but I don't think I'm connected to actual CUDA on the host computer at the moment. A few tests do pass, whatever they do.

apaszke commented 2 years ago

Wow, I haven't heard of anyone having this much trouble. Sorry about that. I just checked, and I actually have GCC 9.3 on my machine, so I don't understand the issue. It's a pretty standard Ubuntu 20.04 installation, and everything seems to work out of the box.

About the failing tests, I think that the tutorial might fail if you don't download some files, but all other tests should just work.

tjpalmer commented 2 years ago

It's possible I made other mistakes, too. In my 20.04-based image, I didn't install pkg-config either, and that was another error, but the cryptohash errors were there as well on 20.04 and not on 18.04. In any case, I have an existence proof of something that builds.

And interestingly, the build logs also have cryptonite, so I'm guessing different dependencies have different transitive dependencies. But I haven't dug up how to show that dependency tree. And cryptonite doesn't have the same errors that cryptohash does, even though the code looks similar. Maybe they change the compiler configuration.

Feel free to close this issue or keep it open for future reference, however you prefer. But either way, if this comes up again later, it might be worth digging into which dependency down the chain is using cryptohash and seeing if there's some way to push them to switch over.

What's up with the ld thing on my own machine, I don't know. If I find out more details on that, and there's something actionable, I'll open that as a separate issue.

Meanwhile, thanks for being responsive!

apaszke commented 2 years ago

I tracked down the cryptohash dependency to the store package, so that's the one adding it to our transitive dependencies. Sadly, we depend on store pretty heavily, so it's not easy to slice it out at the moment. I guess we'll wait it out for now, and if we get any more similar reports, we'll have to consider fixing this upstream or using another package.
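
For anyone who needs to repeat this kind of search, here is a rough sketch of one way to do it: dump the dependency graph as DOT edges and search for a path to the package in question. It assumes edges of the form "a" -> "b"; (the style that stack dot --external is expected to produce). The sample graph is hypothetical: only the dex -> store -> cryptohash chain comes from this thread, and the other edges, as well as the names parse_edges and find_chain, are made up for illustration.

```python
import re
from collections import deque

# Hypothetical sample graph in DOT style. Only dex -> store -> cryptohash
# reflects this thread; the remaining edges are illustrative.
SAMPLE_DOT = '''
strict digraph deps {
"dex" -> "store";
"dex" -> "llvm-hs";
"store" -> "cryptohash";
"store" -> "th-utilities";
"th-utilities" -> "base";
}
'''

# Matches one quoted edge such as: "store" -> "cryptohash"
EDGE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"')


def parse_edges(dot_text):
    """Build an adjacency list {package: [direct dependencies]}."""
    graph = {}
    for src, dst in EDGE.findall(dot_text):
        graph.setdefault(src, []).append(dst)
    return graph


def find_chain(graph, root, target):
    """Breadth-first search for a shortest dependency chain, or None."""
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = parse_edges(SAMPLE_DOT)
chain = find_chain(graph, "dex", "cryptohash")
print(" -> ".join(chain))
```

In practice the sample string would be replaced by the real graph dump. Breadth-first search is used so that the shortest chain is reported first, which is usually the most direct dependency to look at.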

In any case, I'm keeping my fingers crossed and hope you'll figure out the ld situation. I'll close this issue for now, but feel free to reopen if anything comes up.