sourcegraph / scip-python

SCIP indexer for Python

Include the scip binary in Docker image(s) #54

Open nejch opened 2 years ago

nejch commented 2 years ago

GitLab currently still only supports LSIF for code intelligence (https://docs.gitlab.com/ee/user/project/code_intelligence.html), and I'm not sure how long it'll take them to support SCIP in their parser; there are currently no issues on their side tracking this.

Since scip is likely just a small Go binary, it should be easy to include it by fetching it in the Dockerfile. People would then be able to easily create SCIP and LSIF dumps without the overhead of multiple CI jobs/images. I'd be happy to start a PR if you're open to this after #52 gets merged.

This would likely depend on https://github.com/sourcegraph/scip/issues/4, so pinging @varungandhi-src if this is something you'd consider? I can also try to open PRs on that side to get something going with goreleaser or so :bow:

varungandhi-src commented 2 years ago

A PR to add binary releases for the scip binary would be great. However, IME it takes a bit of iteration to get it exactly right, which might be cumbersome for you since you don't have commit/workflow access. (If you're game though, I can review the PR and kick off the workflow/create a tag.)

As for adding a scip-cli binary to the Docker image, could you clarify how that would help you? As I understand it, you'd need to invoke the SCIP->LSIF conversion step in your CI pipeline anyway, so the only addition is an extra step for wget <scip-binary-url> && chmod +x scip, which doesn't seem like a big jump in complexity. (I'm not sure why you say it would require the overhead of multiple CI jobs/images. Maybe there is a terminology difference in jobs vs steps in GitLab CI vs GitHub Actions?)

nejch commented 2 years ago

Thanks @varungandhi-src for the quick reply! I'll take a look.

Regarding including the binary in scip-* images: my intention is to add CI templates for code intelligence (akin to reusable GitHub Actions) in upstream GitLab, similar to the existing LSIF templates: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/templates/Jobs/Code-Intelligence.gitlab-ci.yml

Usually these types of jobs run in self-contained images without constantly installing dependencies in CI, so I wasn't sure if they'd be willing to include the extra steps in something that is part of the core templates and would likely be run very frequently in CI. But that might be a reasonable compromise: I didn't realize most of the scip images already come with curl/wget, so I wouldn't need the whole "update & install curl & install deps & chmod" chain every time.

alsmnn commented 6 months ago

Unfortunately, it is not enough to just download the binary and chmod +x scip in the Docker container, as mentioned on the release page. If you try to run the scip binary, you get the following error message:

bash: ./scip: cannot execute: required file not found

Inspecting the binary with file outside the docker container results in the following:

$ file scip
scip: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=9HW9eA7PhPVOzwGywxNw/qS9Iso6IMtRXoYX7fcEI/BaTl3DWOAON9G-jBfS6F/7tR11WhqVwI0ZW7Cv1b8, with debug_info, not stripped

The interesting part is the interpreter: /lib64/ld-linux-x86-64.so.2. The Docker images from Sourcegraph, e.g. sourcegraph/scip-python:autoindex, are missing /lib64/ entirely:

docker container run -it --rm -v ${PWD}:/data sourcegraph/scip-python:autoindex /bin/bash
860f8ade6e7a:/# ls -al /
total 68
drwxr-xr-x    1 root     root          4096 Mar  7 09:57 .
drwxr-xr-x    1 root     root          4096 Mar  7 09:57 ..
-rwxr-xr-x    1 root     root             0 Mar  7 09:57 .dockerenv
drwxr-xr-x    1 root     root          4096 Nov 22 10:00 bin
drwxr-xr-x    7 pn       1002          4096 Mar  7 09:26 data
drwxr-xr-x    5 root     root           360 Mar  7 09:57 dev
drwxr-xr-x    1 root     root          4096 Mar  7 09:57 etc
drwxr-xr-x    1 root     root          4096 Apr  6  2023 home
drwxr-xr-x    1 root     root          4096 Apr  6  2023 lib
drwxr-xr-x    5 root     root          4096 Mar 29  2023 media
drwxr-xr-x    2 root     root          4096 Mar 29  2023 mnt
drwxr-xr-x    1 root     root          4096 Apr  6  2023 opt
dr-xr-xr-x  642 root     root             0 Mar  7 09:57 proc
drwx------    1 root     root          4096 Apr  6  2023 root
drwxr-xr-x    2 root     root          4096 Mar 29  2023 run
drwxr-xr-x    2 root     root          4096 Mar 29  2023 sbin
drwxr-xr-x    2 root     root          4096 Mar 29  2023 srv
dr-xr-xr-x   13 root     root             0 Mar  7 09:30 sys
drwxrwxrwt    1 root     root          4096 Apr  6  2023 tmp
drwxr-xr-x    1 root     root          4096 Nov 22 10:00 usr
drwxr-xr-x    1 root     root          4096 Nov 22 10:00 var

So you are not able to run these binaries inside the language-specific scip containers. Installing scip from source needs Go, so you would end up with extra dependencies. Maybe there is another way to specify the interpreter for the scip binary.
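
The diagnosis above can be scripted as a quick check before shipping a binary into an image. This is only a sketch: check_interp is a hypothetical helper name, and it assumes readelf (from binutils) is installed wherever it runs, which the minimal images may not provide by default.

```shell
#!/bin/sh
# check_interp BINARY  (hypothetical helper, not part of scip)
# Prints the ELF interpreter BINARY requests and whether it exists here.
# Exit status: 0 = interpreter present, or none requested (static binary)
#              1 = interpreter requested but missing on this filesystem
#              2 = readelf is not available (assumption: binutils installed)
check_interp() {
  command -v readelf >/dev/null 2>&1 || { echo "readelf not found" >&2; return 2; }
  # readelf -l prints a line like:
  #   [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  interp=$(readelf -l "$1" 2>/dev/null |
    sed -n 's/.*Requesting program interpreter: \([^]]*\)].*/\1/p')
  if [ -z "$interp" ]; then
    echo "$1: no interpreter requested (static, or not an ELF file)"
    return 0
  elif [ -e "$interp" ]; then
    echo "$1: interpreter $interp is present"
    return 0
  else
    echo "$1: interpreter $interp is MISSING"
    return 1
  fi
}
```

Running check_interp ./scip inside the container should then report a missing interpreter for the dynamically linked release binary, instead of the less obvious "required file not found" from the shell.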

nejch commented 6 months ago

@alsmnn yeah, I think I also hit this issue back when I was looking into this :) I haven't really explored alternatives, and changing base images would probably take some effort.

bewing commented 4 months ago

Does compiling scip as a static binary not solve this?

$ CGO_ENABLED=0 go build ./cmd/scip
go: downloading github.com/sourcegraph/sourcegraph/lib v0.0.0-20220511160847-5a43d3ea24eb
go: downloading github.com/hhatto/gocloc v0.4.2
go: downloading github.com/hexops/gotextdiff v1.0.3
go: downloading github.com/montanaflynn/stats v0.7.1
go: downloading github.com/k0kubun/pp/v3 v3.1.0
go: downloading golang.org/x/text v0.12.0
go: downloading github.com/go-enry/go-enry/v2 v2.7.2
go: downloading github.com/cockroachdb/errors v1.8.9
go: downloading github.com/cockroachdb/redact v1.1.3
go: downloading github.com/rogpeppe/go-internal v1.10.0
go: downloading github.com/getsentry/sentry-go v0.12.0
go: downloading github.com/cockroachdb/logtags v0.0.0-20211118104740-dabe8e521a4f
$ ldd scip
        not a dynamic executable
$ ls -lah scip
.rwxr-xr-x bewing bewing 22 MB Wed May  8 13:42:41 2024  scip
$ ./scip
NAME:
   scip - SCIP Code Intelligence Protocol CLI

USAGE:
   scip [global options] command [command options] [arguments...]

VERSION:
   v0.3.3-dev
SHA: 6495bfbd33671ccd4a2358505fdf30058140ff32
timestamp: 2024-05-03T07:57:09Z
clean: true

DESCRIPTION:
   For more details, see the project README at:

     https://github.com/sourcegraph/scip

COMMANDS:
   convert   Convert a SCIP index to an LSIF index
   lint      Flag potential issues with a SCIP index
   print     Print a SCIP index for debugging
   snapshot  Generate snapshot files for golden testing
   stats     Output useful statistics about a SCIP index
   help, h   Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --help, -h     show help
   --version, -v  print the version

Example Docker build:

FROM golang:1.19 AS build
WORKDIR /srv
RUN git clone https://github.com/sourcegraph/scip
WORKDIR /srv/scip

RUN CGO_ENABLED=0 go build ./cmd/scip

FROM sourcegraph/scip-python:v0.6.0
COPY --from=build /srv/scip/scip /usr/local/bin/scip
RUN chmod a+x /usr/local/bin/scip

firelizzard18 commented 4 months ago

Compiling as a static binary (@bewing) solved the problem for me.

As a general rule, if you're going to run a Go program in a container, compiling with CGO_ENABLED=0 is a good idea because of libc portability issues like this. Most distros (including the upstream golang image) use glibc, but many containers use lower-footprint alternatives. Disabling CGO completely circumvents that issue.
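
With a statically linked scip on PATH next to scip-python, the two steps discussed in this thread (index, then convert to LSIF for GitLab) could be sketched roughly as follows. index_and_convert is a hypothetical helper name; the --project-name, --from and --to flags and the index.scip default output are assumptions based on the tools' help text at the time of writing, so check them against your installed versions.

```shell
#!/bin/sh
# index_and_convert PROJECT_NAME  (hypothetical helper)
# Indexes the current directory with scip-python, then converts the
# resulting SCIP index to an LSIF dump.
# Assumptions: scip-python and a static scip binary are both on PATH,
# and scip-python writes index.scip into the current directory.
index_and_convert() {
  scip-python index . --project-name="$1" || return 1
  scip convert --from index.scip --to dump.lsif
}
```

In a CI job this would be called from the repository root, e.g. index_and_convert my-project, with dump.lsif then uploaded as the job's artifact.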