StaPH-B / docker-builds

:package: :whale: Dockerfiles and documentation on tools for public health bioinformatics
GNU General Public License v3.0
188 stars 119 forks source link

[Bug]: phyml error #589

Closed kapsakcj closed 1 year ago

kapsakcj commented 1 year ago

What container were you trying to use, and how were you attempting to use it?

CC @jvhagey - have you encountered this error before?

When using staphb/phyml:3.3.20220408 docker image, I got an error Illegal instruction (core dumped)

The error seems related to CPU instructions and the phyml GH Issues is littered with reports of this error: https://github.com/stephaneguindon/phyml/issues?q=is%3Aissue+core+dump

perhaps the dockerfile/image is missing openmpi? The bioconda recipe includes it. We could try adding apt-get install openmpi-bin to see if that resolves the issue

To reproduce (input file attached):

# start container in interactive mode
$ docker run --rm=True -u $(id -u):$(id -g) -v $(pwd):/data -ti staphb/phyml:3.3.20220408

$ phyml -i snvAlignment.phy --datatype nt --model GTR -v 0.0 -s BEST --ts/tv e --nclasses 4 --alpha e --bootstrap -4 --quiet
Illegal instruction (core dumped)

When I install via bioconda (albeit an earlier version of phyml), I do not receive this error. It runs as expected:

$ mamba create -n phyml -c conda-forge -c bioconda -c defaults phyml

$ phyml --version

. This is PhyML version 3.3.20211231.

$ phyml -i snvAlignment.phy --datatype nt --model GTR -v 0.0 -s BEST --ts/tv e --nclasses 4 --alpha e --bootstrap -4 --quiet

. The BEST option is deprecated. PhyML now uses a mix of SPRs and NNIs.

  ////////////////////////////////////.\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\.//////////////////////////////////////////

        . Sequence filename:                             snvAlignment.phy
        . Data type:                                     dna
        . Alphabet size:                                 4
        . Sequence format:                               interleaved
        . Number of data sets:                           1
        . Nb of bootstrapped data sets:                  0
        . Compute approximate likelihood ratio test:     yes (SH-like branch supports)
        . Model name:                                    GTR
        . Proportion of invariable sites:                0.000000
        . RAS model:                                     discrete Gamma
        . Number of subst. rate catgs:                   4
        . Gamma distribution parameter:                  estimated
        . 'Middle' of each rate class:                   mean
        . Nucleotide equilibrium frequencies:            empirical
        . Optimise tree topology:                        yes
        . Starting tree:                                 BioNJ
        . Add random input tree:                         no
        . Optimise branch lengths:                       yes
        . Minimum length of an edge:                     1e-08
        . Optimise substitution model parameters:        yes
        . Run ID:                                        none
        . Random seed:                                   1675895550
        . Subtree patterns aliasing:                     no
        . Version:                                       3.3.20211231
        . Byte alignment:                                16
        . AVX enabled:                                   no
        . SSE enabled:                                   yes

  ////////////////////////////////////.\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\.//////////////////////////////////////////

. Log likelihood of the current tree: -373.930670424939421536692.

. Time used 0h0m0s

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

Relevant log output

No response

kapsakcj commented 1 year ago

test input file:

snvAlignment.phy.txt

Remove the .txt off the end

jvhagey commented 1 year ago

Yes, unfortunately its been on my to do list for quite some time... https://github.com/DHQP/SNVPhyl_Nextflow/issues/1 . Alyssa looked into it, but I never got to getting it a fix. 😬

kapsakcj commented 1 year ago

Understood. Some additional info that may help troubleshoot the issue - in addition to the bioconda package running successfully, the biocontainer for the corresponding bioconda package resolved the issue for P.

I believe he used the docker image quay.io/biocontainers/phyml:3.3.20211231--hee9e358_0 from here: https://quay.io/repository/biocontainers/phyml?tab=tags

kapsakcj commented 1 year ago

OK, I think I may have resolved the issue. Not sure exactly what resolved it 😅 , but I'll show the steps I took.

So, with the current staphb docker image, it does still fail.

# start container in interactive mode
$ docker run --rm=True -u $(id -u):$(id -g) -v $(pwd):/data -ti staphb/phyml:3.3.20220408
# phyml fails w same error
I have no name!@072c7c75a406:/data$ phyml -i /data/snvAlignment.phy --datatype nt --model GTR -v 0.0 -s BEST --ts/tv e --nclasses 4 --alpha e --bootstrap -4 --quiet
Illegal instruction (core dumped)

But when I take the existing dockerfile and make one change: unpin ca-certificates version since the version is no longer available. Line 21 changed from ca-certificates=20210119~20.04.2 \ to ca-certificates \, re-build the docker image, and re-try the test command, it runs successfully.

# rebuild the image with the updated dockerfile
$ docker build -t kapsakcj/phyml:latest phyml/3.3.20220408/

# start updated docker container in interactive mode
$ docker run --rm=True -u $(id -u):$(id -g) -v $(pwd):/data -ti kapsakcj/phyml:latest

# re-test runs successfully
I have no name!@b36696a06bc8:/data$ phyml -i /data/snvAlignment.phy --datatype nt --model GTR -v 0.0 -s BEST --ts/tv e --nclasses 4 --alpha e --bootstrap -4 --quiet

. The BEST option is deprecated. PhyML now uses a mix of SPRs and NNIs.

  ////////////////////////////////////.\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\.//////////////////////////////////////////

        . Sequence filename:                             snvAlignment.phy
        . Data type:                                     dna
        . Alphabet size:                                 4
        . Sequence format:                               interleaved
        . Number of data sets:                           1
        . Nb of bootstrapped data sets:                  0
        . Compute approximate likelihood ratio test:     yes (SH-like branch supports)
        . Model name:                                    GTR
        . Proportion of invariable sites:                0.000000
        . RAS model:                                     discrete Gamma
        . Number of subst. rate catgs:                   4
        . Gamma distribution parameter:                  estimated
        . 'Middle' of each rate class:                   mean
        . Nucleotide equilibrium frequencies:            empirical
        . Optimise tree topology:                        yes
        . Starting tree:                                 BioNJ
        . Add random input tree:                         no
        . Optimise branch lengths:                       yes
        . Minimum length of an edge:                     1e-08
        . Optimise substitution model parameters:        yes
        . Run ID:                                        none
        . Random seed:                                   1676050856
        . Subtree patterns aliasing:                     no
        . Version:                                       3.3.20220408
        . Byte alignment:                                32
        . AVX enabled:                                   yes
        . SSE enabled:                                   yes

  ////////////////////////////////////.\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
  \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\.//////////////////////////////////////////

. Log likelihood of the current tree: -373.930471269682129786815.

. Time used 0h0m0s

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

I'm happy to open a PR for this slight change, but it would be good to have others test this beforehand.

I've pushed the image to my personal dockerhub repo in case anyone else is able to test and confirm it works for them.

kapsakcj/phyml:latest available here: https://hub.docker.com/r/kapsakcj/phyml/tags

jvhagey commented 1 year ago

ok nice, I will try it today and see if it works. Thanks for following up on this!

jvhagey commented 1 year ago

@kapsakcj, so oddly enough on one of our clusters the old container works, but that doesn't work on the other cluster. The new container works on both so I say lets go with the new one. Do you want to push the changes? Thanks for getting it sorted!!

kapsakcj commented 1 year ago

great, thanks for testing! I'll ping back when the docker image tag has been overwritten on dockerhub & quay

kapsakcj commented 1 year ago

OK, docker image has been rebuilt locally on my VM and pushed to dockerhub and quay for 4 tags:

quay.io/staphb/phyml:3.3.20220408 quay.io/staphb/phyml:latest staphb/phyml:3.3.20220408 staphb/phyml:latest

please let us know if you run into issues again and we can troubleshoot further