nextstrain / cli

The Nextstrain command-line interface (CLI), a program called nextstrain, aims to provide a consistent way to run and visualize pathogen builds and access Nextstrain components like Augur and Auspice across computing environments such as Docker, Conda, and AWS Batch.
https://docs.nextstrain.org/projects/cli/
MIT License

ENH: Print docker image used at beginning of aws-batch jobs #254

Open corneliusroemer opened 1 year ago

corneliusroemer commented 1 year ago

I was trying to see why a bug happened in a recent monkeypox CI run. I couldn't reproduce locally using the latest docker-base image.

I couldn't figure out which docker image was used for the run on aws-batch. Would it be possible to print the docker image tag at the beginning of a run so it can be found in logs?

corneliusroemer commented 1 year ago

Actually, I found the docker tag in the job description. But it might still be useful to have it in the logs.

Also, it might be neat to have versions of the major commands we've installed in the logs as well: things like augur commit, biopython version etc.

Something like pip list could be super helpful when debugging.
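A rough sketch of what a pip list-style dump could look like if printed from Python at the start of a job. It only uses the standard library's importlib.metadata; the function name installed_packages is made up for illustration and is not part of Nextstrain CLI or Augur.

```python
from importlib.metadata import distributions


def installed_packages():
    """Return sorted (name, version) pairs for installed Python distributions."""
    packages = set()
    for dist in distributions():
        try:
            name = dist.metadata["Name"]
            version = dist.version
        except Exception:
            # Skip distributions whose metadata is missing or unreadable.
            continue
        if name:
            packages.add((name, version))
    # Case-insensitive sort by name, similar to how pip list orders its output.
    return sorted(packages, key=lambda pkg: pkg[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name}=={version}")
```

This only covers Python packages (e.g. biopython), not non-Python tools in the image.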

tsibley commented 1 year ago

FWIW, the default image is shown in the output of nextstrain build --help-all under the --image option's description, e.g.:

  --image <image>       Container image name to use for the Nextstrain runtime
                        (default: nextstrain/base:build-20230228T200714Z for
                        Docker and AWS Batch,
                        docker://nextstrain/base:build-20230119T003940Z for
                        Singularity)

The values shown there will be specific to each setup.

Versions of the major first-party tools in our runtimes are reported by nextstrain version --verbose, e.g.:

nextstrain.cli 6.2.0

Python
  /home/tom/.nextstrain/cli-standalone/nextstrain
  3.10.9 (main, Dec 21 2022, 04:02:04) [Clang 14.0.3 ]

Runners
  docker (default)
    nextstrain/base:build-20230228T200714Z (caefc495d14d, 2023-02-28 13:13:18 -0800 PST)
    augur 21.0.1
    auspice v2.43.0
    fauna a2a1907
    sacra not present

  conda 
    nextstrain-base 20230208T180335Z (hb0f4dca_1_locked, nextstrain)
    augur 21.0.0
    auspice 2.43.0

  singularity 
    docker://nextstrain/base:build-20230119T003940Z (/home/tom/.nextstrain/runtimes/singularity/images/nextstrain/base/build-20230119T003940Z.sif)
    augur 19.3.0
    auspice v2.42.0
    fauna 6d1cede
    sacra not present

  ambient 
    unknown

  aws-batch 
    unknown

For AWS Batch, granted, you have to know that it uses the same Docker image and thus will have the same versions. (The recent bug with amd64 vs. arm64 notwithstanding.)

As for more general, broad debugging information, it may be appropriate for unattended builds (such as those launched with --aws-batch --detach), but it would clutter the logs of attended builds, especially non-AWS Batch builds. This seems to me more like debugging information we could choose to emit from the workflow itself (e.g. closest to the place where commands like augur are used), rather than from Nextstrain CLI's runners. Since workflow engines may augment our runtime with their own environments, this would also be less likely to mislead.
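To illustrate the workflow-level approach, here is a minimal sketch of a helper a workflow could call right before it uses tools like augur. It assumes each tool accepts a --version flag; the names tool_version and report_versions are hypothetical and nothing like them exists in Nextstrain CLI.

```python
import shutil
import subprocess
import sys


def tool_version(command):
    """Return the first line printed by `<command> --version`, or a placeholder."""
    if shutil.which(command) is None:
        return "not present"
    try:
        result = subprocess.run(
            [command, "--version"],  # assumption: the tool supports --version
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "unknown"
    # Some tools print their version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else "unknown"


def report_versions(commands=("augur", "auspice"), stream=sys.stderr):
    """Write one `name: version` line per command to the given stream."""
    for command in commands:
        print(f"{command}: {tool_version(command)}", file=stream)


if __name__ == "__main__":
    report_versions()
```

Because the check runs inside whatever environment the workflow engine has set up, the versions it reports are the ones actually on PATH at that point, which may differ from the runtime's defaults.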