grailbio / base

A collection of Go utility packages used by GRAIL's tools
Apache License 2.0
86 stars 24 forks source link

fatbin: get fatbin tests to pass outside of grail #8

Closed cosnicolaou closed 4 years ago

cosnicolaou commented 4 years ago

Get the fatbin tests to run outside of GRAIL. I presume that bazel is configured to build fatbin binaries for these tests. Of course, outside of bazel/GRAIL, that external environment is not met. This PR changes things. It also seems that, despite words to the contrary, it's not possible to use go build directly on linux systems. This appears to be because fatbin.Self() always assumes that it's a fatbin binary, again, I suspect that's a bazel assumption inside of GRAIL. I will address that in a followup PR, but for now, this lays the foundation for that.

mariusae commented 4 years ago

The fatbin tests are not meant to assume this; I can get it to work locally with both linux/amd64 and darwin/amd64 on both go1.12.x and go.1.13.x (just using go test).

Also fatbin.Self() is not meant to assume that the binary is built with fatbin: it would simply return an offset that's at the end of the file, and thus no additional binaries.

What is the error you're seeing?

mariusae commented 4 years ago

Also fatbin.Self() is not meant to assume that the binary is built with fatbin: it would simply return an offset that's at the end of the file, and thus no additional binaries.

This is a roundabout way of saying that perhaps there is a bug elsewhere; that fatbin does not always correctly find the end of the binary. We could make this more robust by changing the fatbin format to store some magic and an offset at the very end of the binary, so that we are not so dependent on the specifics of the host binary format.

What's the go version and GOOS/GOARCH you're running into issues?

cosnicolaou commented 4 years ago

ubuntu 18, amd64. $ go version go version go1.13.1 linux/amd64

I first saw the error when running bigslice, which confused me for a while since I was reading a gzip file in my app, but it turns out that the error was coming from bigmachine itself.

I don't think there's any guarantee that the file size + offset are the same, and yes, this all does hint at some externality changing. If you're not using bazel to build fat binaries for these tests, then I agree that the offset is the problem. My fix is just a hack really. Probably the best approach is to write a custom elf section for the fat binary if that's possible.

$ go test -v === RUN TestFatbin --- FAIL: TestFatbin (0.00s) fatbin_test.go:28: zip: not a valid zip file === RUN TestFatbinNonExist --- FAIL: TestFatbinNonExist (0.00s) fatbin_test.go:47: zip: not a valid zip file === RUN TestSniff --- FAIL: TestSniff (0.00s) fatbin_test.go:86: got 2666560, want 2670592 === RUN TestLinuxElf

mariusae commented 4 years ago

I have an idea for a much more robust layout. I’ll send out a change on Monday.

mariusae commented 4 years ago

Thinking something like #9: @cosnicolaou let me know what you think.

mariusae commented 4 years ago

This was pushed as 71f2ad7