psychoinformatics-de / paper-remodnav

Code, data and manuscript for https://doi.org/10.1101/619254
Creative Commons Attribution 4.0 International

Dockerizing the analysis #20

Closed mih closed 11 months ago

mih commented 11 months ago

TL;DR Versions did not matter, time did not matter, but it matters who compiles numpy

Five years after we did this analysis, I am trying to compile a Docker-based environment. I can successfully build the stats and figures in a wide variety of configurations. However, there are small differences.

I am collecting some notes here, trying to narrow down on a setup that reproduces the stats exactly:

Debian buster PY3.7

FROM debian:buster-slim
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y inkscape latexmk texlive-latex-extra python3-pip make
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev cython3 python3-setuptools python3-wheel
RUN apt-get install -q --no-install-recommends -y python3-numpy
RUN apt-get install -q --no-install-recommends -y python3-scipy
RUN apt-get install -q --no-install-recommends -y python3-pandas
RUN apt-get install -q --no-install-recommends -y python3-sklearn
RUN apt-get install -q --no-install-recommends -y python3-statsmodels
RUN apt-get clean
RUN python3 -m pip install -v --no-build-isolation --prefer-binary seaborn==0.10.1 scikit-learn matplotlib==3.4.3

pip freeze gives

cycler==0.11.0
Cython==0.29.2
decorator==4.3.0
joblib==0.13.0
kiwisolver==1.4.5
matplotlib==3.4.3
numpy==1.16.2
pandas==0.23.3+dfsg
patsy==0.5.0+dev
Pillow==8.3.2
pyparsing==3.1.1
python-dateutil==2.7.3
pytz==2019.1
scikit-learn==0.20.2
scipy==1.1.0
seaborn==0.10.1
six==1.12.0
statsmodels==0.8.0
typing-extensions==4.7.1

after running the analysis, the following diff occurs

modified:   img/confusion_MN_AL.svg
modified:   img/confusion_RA_AL.svg
modified:   img/remodnav_lab.svg
modified:   img/remodnav_mri.svg
modified:   results_def.tex
-\newcommand{\videoMNALMclfWOP}{7.9}
+\newcommand{\videoMNALMclfWOP}{8.1}
-\newcommand{\videoMNALFIXcod}{36}
+\newcommand{\videoMNALFIXcod}{37}
-\newcommand{\dotsRAALMclfWOP}{10.8}
+\newcommand{\dotsRAALMclfWOP}{10.9}
-\newcommand{\videoRAALMCLF}{28.5}
+\newcommand{\videoRAALMCLF}{28.6}
-\newcommand{\maxmclf}{10.8}
+\newcommand{\maxmclf}{10.9}
-\newcommand{\FIXvideomnRE}{147}
+\newcommand{\FIXvideomnRE}{146}
-\newcommand{\FIXvideonoRE}{144}
+\newcommand{\FIXvideonoRE}{145}
-\newcommand{\PURvideomnRE}{314}
+\newcommand{\PURvideomnRE}{313}
-\newcommand{\rankFIXvideoIHMM}{6}
+\newcommand{\rankFIXvideoIHMM}{5}
-\newcommand{\rankFIXvideoRE}{5}
+\newcommand{\rankFIXvideoRE}{6}
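
A comparison like this can also be scripted instead of read off a diff. The following is a minimal sketch, assuming two versions of results_def.tex are available as strings. The helper names `parse_defs` and `changed_defs` are hypothetical and not part of the repository, and the `\unchanged` macro is made up for illustration; the other sample values are copied from the lines above.

```python
import re

# Matches definitions of the form \newcommand{\name}{value}
NEWCOMMAND = re.compile(r"\\newcommand\{\\(\w+)\}\{([^}]*)\}")


def parse_defs(tex):
    """Return a dict mapping macro name to its value string."""
    return dict(NEWCOMMAND.findall(tex))


def changed_defs(first, second):
    """Return {name: (value_in_first, value_in_second)} for macros that differ."""
    a, b = parse_defs(first), parse_defs(second)
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


# Two small excerpts; \unchanged is a hypothetical macro used only for illustration
first = r"""
\newcommand{\maxmclf}{10.8}
\newcommand{\FIXvideomnRE}{147}
\newcommand{\unchanged}{1}
"""
second = r"""
\newcommand{\maxmclf}{10.9}
\newcommand{\FIXvideomnRE}{146}
\newcommand{\unchanged}{1}
"""

for name, (a_val, b_val) in changed_defs(first, second).items():
    print(f"{name}: {a_val} vs {b_val}")
```

Keeping the values as strings avoids any rounding of its own, so only macros whose printed value really changed are reported.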

Debian bullseye PY3.9

FROM debian:bullseye-slim
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y inkscape latexmk texlive-latex-extra python3-pip make
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev cython3 python3-setuptools python3-wheel
RUN apt-get install -q --no-install-recommends -y python3-numpy
RUN apt-get install -q --no-install-recommends -y python3-scipy
RUN apt-get install -q --no-install-recommends -y python3-pandas
RUN apt-get install -q --no-install-recommends -y python3-sklearn
RUN apt-get install -q --no-install-recommends -y python3-statsmodels
RUN apt-get clean
RUN python3 -m pip install -v --no-build-isolation --prefer-binary seaborn==0.10.1 scikit-learn matplotlib==3.4.3

pip freeze gives

cycler==0.12.1
Cython==0.29.21
decorator==4.4.2
joblib==0.17.0
kiwisolver==1.4.5
matplotlib==3.4.3
numpy==1.19.5
packaging==23.2
pandas==1.1.5
patsy==0.5.3
Pillow==10.0.1
pyparsing==3.1.1
python-dateutil==2.8.1
pytz==2021.1
-e remodnav==1.0
scikit-learn==0.23.2
scipy==1.6.0
seaborn==0.10.1
six==1.16.0
statsmodels==0.14.0

The diff of the statistical scores is identical to the one from the buster container. The same SVGs are modified as well (and they also look identical inside).

Ubuntu focal PY3.8

FROM ubuntu:focal
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y inkscape latexmk texlive-latex-extra python3-pip make
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev cython3 python3-setuptools python3-wheel
RUN apt-get install -q --no-install-recommends -y python3-numpy
RUN apt-get install -q --no-install-recommends -y python3-scipy
RUN apt-get install -q --no-install-recommends -y python3-pandas
RUN apt-get install -q --no-install-recommends -y python3-sklearn
RUN apt-get install -q --no-install-recommends -y python3-statsmodels
RUN apt-get clean
RUN python3 -m pip install -v --no-build-isolation --prefer-binary seaborn==0.10.1 scikit-learn matplotlib==3.4.3

pip freeze gives

cycler==0.12.1
Cython==0.29.14
decorator==4.4.2
joblib==0.14.0
kiwisolver==1.4.5
matplotlib==3.4.3
numpy==1.17.4
pandas==0.25.3
patsy==0.5.1
Pillow==10.0.1
pyparsing==3.1.1
python-dateutil==2.7.3
pytz==2019.3
-e remodnav==1.0
scikit-learn==0.22.2.post1
scipy==1.3.3
seaborn==0.10.1
six==1.14.0
statsmodels==0.11.1

The diff of the statistical scores is identical to the one from the bullseye and buster containers. The same SVGs are modified as well (and they also look identical inside).
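
To see at a glance which pins the three containers actually share, the `pip freeze` listings can be compared programmatically. This is a minimal sketch under the assumption that the listings are available as text; `parse_freeze` and `shared_pins` are hypothetical helper names, and only a four-package excerpt of each listing above is used.

```python
def parse_freeze(text):
    """Parse `pip freeze` output into {package: version}."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("-e "):
            # Editable installs appear as "-e name==version" in the listings above
            line = line[3:]
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.lower()] = version
    return pins


def shared_pins(*envs):
    """Return the packages whose version is identical in every environment."""
    common = set(envs[0])
    for env in envs[1:]:
        common &= set(env)
    return {
        pkg: envs[0][pkg]
        for pkg in sorted(common)
        if all(env[pkg] == envs[0][pkg] for env in envs)
    }


# Excerpts copied from the three listings above
buster = parse_freeze("numpy==1.16.2\nscipy==1.1.0\nmatplotlib==3.4.3\nseaborn==0.10.1")
bullseye = parse_freeze("numpy==1.19.5\nscipy==1.6.0\nmatplotlib==3.4.3\nseaborn==0.10.1")
focal = parse_freeze("numpy==1.17.4\nscipy==1.3.3\nmatplotlib==3.4.3\nseaborn==0.10.1")

print(shared_pins(buster, bullseye, focal))
```

For this excerpt only the two packages pinned on the pip command line come out as shared, while numpy and scipy differ in every container.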

Conclusions

adswa commented 11 months ago

reproduced the same diff as you with the python3.7 Docker image

mih commented 11 months ago

For comparison: trying with a virtualenv, using the latest version of each package that is still API-compatible with the code.

$ virtualenv --python="$(which python3)" ${HOME}/env/remodnav-repro
$ . ~/env/remodnav-repro/bin/activate
$ python -m pip install numpy scipy pandas==1.5.3 seaborn scikit-learn matplotlib==3.4.3

The previously pinned seaborn 0.10.1 is incompatible with numpy 1.26, and had to be unpinned.

...
  File "/home/mih/env/remodnav-repro/lib/python3.11/site-packages/numpy/__init__.py", line 324, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'bool'.
`np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'bool_'?

Pandas had to be pinned to the last 1.x release. Pandas 2.1.1 incompatibility:

ValueError: Multi-dimensional indexing (e.g. `obj[:, None]`) is no longer supported. Convert to a numpy array before indexing instead.

pip freeze gives

cycler==0.12.1
joblib==1.3.2
kiwisolver==1.4.5
matplotlib==3.4.3
numpy==1.26.0
pandas==1.5.3
Pillow==10.0.1
pyparsing==3.1.1
python-dateutil==2.8.2
pytz==2023.3.post1
scikit-learn==1.3.1
scipy==1.11.3
seaborn==0.13.0
six==1.16.0
threadpoolctl==3.2.0

This REPRODUCES all stats exactly!!!

The remaining diff is in the SVGs

img/confusion_MN_AL.svg  | 24 ++++++++++++------------
img/confusion_MN_RA.svg  | 24 ++++++++++++------------
img/confusion_RA_AL.svg  | 24 ++++++++++++------------
img/hist_saccade_lab.svg |  8 ++++----

mih commented 11 months ago

Trying to drill down on the SVG diff. I had the hunch that pinning the seaborn version is probably a more important aspect than being able to upgrade numpy. And indeed:

python -m pip install numpy==1.23.2 scipy pandas==1.5.3 seaborn==0.10.1 scikit-learn matplotlib==3.4.3

gives an environment that fully reproduces the stats, and the full remaining diff is:

diff --git a/img/hist_saccade_lab.svg b/img/hist_saccade_lab.svg
index 6ef426c..02bb011 100644
--- a/img/hist_saccade_lab.svg
+++ b/img/hist_saccade_lab.svg
@@ -199,16 +199,16 @@ z
    <g id="patch_23">
     <path clip-path="url(#p2f9441ee18)" d="M 157.6125 118.304175 
 L 163.1925 118.304175 
-L 163.1925 117.311604 
-L 157.6125 117.311604 
+L 163.1925 117.460043 
+L 157.6125 117.460043 
 z
 " style="fill:#808080;"/>
    </g>
    <g id="patch_24">
     <path clip-path="url(#p2f9441ee18)" d="M 163.1925 118.304175 
 L 168.7725 118.304175 
-L 168.7725 117.560194 
-L 163.1925 117.560194 
+L 168.7725 117.411756 
+L 163.1925 117.411756 
 z
 " style="fill:#808080;"/>
    </g>
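
The numbers in this diff can be turned into bar heights to see how the two bars changed relative to each other. A minimal sketch, using only the coordinates shown above; the reading that the shared y value 118.304175 is the bar baseline is an assumption based on the path shape, and `height_change` is a hypothetical helper name.

```python
# y coordinates copied from the diff above. SVG y grows downward,
# so a larger y for the top edge means a shorter bar.
BASELINE = 118.304175  # assumed to be the bottom edge of both bars

tops = {
    "patch_23": (117.311604, 117.460043),  # (removed line, added line)
    "patch_24": (117.560194, 117.411756),
}


def height_change(before_top, after_top, baseline=BASELINE):
    """Return the change in bar height, in SVG user units."""
    return (baseline - after_top) - (baseline - before_top)


changes = {name: height_change(*ys) for name, ys in tops.items()}
for name, delta in changes.items():
    print(f"{name}: {delta:+.6f}")
```

The two changes are equal in size and opposite in sign to within about 1e-6: one bar shrinks by the amount its right-hand neighbor grows. That is what one would expect if some samples were counted in the adjacent bin, though the diff alone does not prove it.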

Visually, this is the part of the figure that is different:

image

image

Closeups of the two versions of the figure at the difference (it is the height of the bar in the middle).

image

image

mih commented 11 months ago

Here is the relevant code that is responsible for this plot:

            fig = plt.figure(figsize=(3,2))
            plt.hist(ev_df['duration'].values,
                    bins='doane',
                    range=x_lim,
                    color='gray')
                    #log=True)
            plt.xlabel('{} duration in s'.format(label))
            plt.xlim(x_lim)
            plt.ylim(y_lim)
            plt.savefig(
                op.join(
                    'img',
                    'hist_{}_{}.svg'.format(
                        label,
                        ds_name)),
                transparent=True,
                bbox_inches="tight",
                metadata={'Date': None})

It is plain matplotlib. We know the matplotlib version that was originally used; it is included in the file's RDF metadata:

    <dc:creator>
     <cc:Agent>
      <dc:title>Matplotlib v3.4.3, https://matplotlib.org/</dc:title>
     </cc:Agent>
    </dc:creator>
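
Reading this version string out of many SVG files is easy to automate. The following is a minimal sketch that assumes the metadata always has the shape shown above; `matplotlib_version` is a hypothetical helper, and a regular expression is used instead of an XML parser so that the snippet works on fragments without namespace declarations.

```python
import re


def matplotlib_version(svg_text):
    """Return the Matplotlib version recorded in an SVG's dc:title, or None."""
    match = re.search(r"<dc:title>\s*Matplotlib v([0-9][0-9A-Za-z.+]*)", svg_text)
    return match.group(1) if match else None


# Metadata fragment copied from the file shown above
sample = """
    <dc:creator>
     <cc:Agent>
      <dc:title>Matplotlib v3.4.3, https://matplotlib.org/</dc:title>
     </cc:Agent>
    </dc:creator>
"""

print(matplotlib_version(sample))  # prints 3.4.3
```

Note that this only reports the version that wrote the file; it says nothing about how that version, or the libraries beneath it, were built.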

We have that exact version installed, but this obviously does not mean that we have the exact same binary running. Still weird to have this be the only difference.
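
One mechanism that could produce a difference of this kind is offered here as an assumption, not as a diagnosis of this plot: a histogram assigns each value to a bin by comparing it with floating-point bin edges, so a value that sits exactly on an edge can land in the neighboring bin if either the value or the edge differs in the last bit. The sketch below uses made-up edges and plain Python (`math.nextafter` requires Python 3.9 or later); it does not call matplotlib or numpy.

```python
import bisect
import math


def bin_index(edges, value):
    """Index of the bin [edges[i], edges[i+1]) that contains value."""
    return bisect.bisect_right(edges, value) - 1


# Ten equal-width bins on [0, 1]; these numbers are illustrative only
edges = [i / 10 for i in range(11)]

on_edge = edges[3]
just_below = math.nextafter(on_edge, 0.0)  # one representable step lower

print(bin_index(edges, on_edge), bin_index(edges, just_below))  # prints 3 2
```

A single-bit difference moves the sample from bin 3 to bin 2 while summary statistics computed from the same values would be practically unaffected.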

mih commented 11 months ago

With this success, I am back in Docker land. Clearly the virtualenv has an impact. So let's try to put a (superfluous) virtualenv inside the docker container.

Knowing that we can use much more recent software, I am basing this on Debian bookworm and using the versions from the previous non-Docker exploration:

FROM debian:bookworm-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y make
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev
RUN apt-get install -q --no-install-recommends -y python3-virtualenv python3-wheel
RUN apt-get clean
RUN virtualenv --python="$(which python3)" /env/remodnav-repro
RUN sh -c ". /env/remodnav-repro/bin/activate; python -m pip install numpy==1.23.2 scipy pandas==1.5.3 seaborn==0.10.1 scikit-learn matplotlib==3.4.3 statsmodels"
RUN chmod -R ugo+rw /env/remodnav-repro
RUN rm -rf /root/.local /root/.cache /var/lib/apt/lists/deb.debian.org*
RUN find /env -type d -name __pycache__ -exec rm -rf {} \; -prune
RUN apt-get purge -y build-essential python3-dev
RUN apt-get clean

And indeed! It also arrives at the minimal diff shown in https://github.com/psychoinformatics-de/paper-remodnav/issues/20#issuecomment-1757462683

The image compresses down to 625MB.

mih commented 11 months ago

I can now confirm that the presence or absence of a virtualenv is irrelevant (as it should be). Here is another configuration that achieves the diff from https://github.com/psychoinformatics-de/paper-remodnav/issues/20#issuecomment-1757462683 without any virtualenv:

FROM debian:bookworm-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y make python3-dev cython3 python3-setuptools python3-wheel
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev
RUN apt-get install -q --no-install-recommends -y python3-virtualenv python3-wheel
RUN apt-get install -q --no-install-recommends -y python3-scipy
RUN apt-get install -q --no-install-recommends -y python3-sklearn
RUN apt-get install -q --no-install-recommends -y python3-statsmodels
RUN apt-get install -q --no-install-recommends -y python3-kiwisolver
RUN apt-get install -q --no-install-recommends -y python3-pyparsing
RUN apt-get install -q --no-install-recommends -y python3-pil
RUN apt-get install -q --no-install-recommends -y python3-pip
RUN apt-get clean
RUN python3 -m pip install --break-system-packages numpy==1.23.2 pandas==1.5.3 seaborn==0.10.1 matplotlib==3.4.3
RUN rm -rf /root/.local /root/.cache /var/lib/apt/lists/deb.debian.org*
RUN apt-get purge -y build-essential
RUN apt-get autoremove -y
RUN apt-get clean

mih commented 11 months ago

Probably the final post in this saga: the trigger for the reproducibility issue is the question of who compiles numpy.

It depends on whether I am using a pip-compiled installation or one downloaded from Debian.

Each of these yields reproducible results on its own, across a wide range of versions. But the results differ noticeably between the two ways of obtaining a compiled numpy.

Below is a complete Dockerfile for anyone interested in digging deeper. The key line is the specification of the numpy version. Whenever it is different from the numpy version provided by the respective Debian release (and it does not matter which one), pip will compile it, and it will reproduce the results published many years ago. So change

numpy==1.24.3

a version that is not in Debian bookworm to

numpy==1.24.2

a version that is in Debian bookworm, and the results will not reproduce. Make it 1.24.1 and they will reproduce again, because it will also be compiled locally.

Even when I set up a system like it would have existed at the time of publication (Debian buster), the results do not reproduce, unless pip compiles numpy.
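
To check which numpy a given container actually ends up importing, one can look at where the module lives. This is a minimal sketch that assumes the usual Debian layout (system packages under /usr/lib/python3/dist-packages, system-wide pip installs under /usr/local/lib/python3.X/dist-packages, virtualenvs under site-packages); `classify_location` and `recorded_installer` are hypothetical helpers, not part of the analysis code.

```python
import importlib.metadata as md


def classify_location(path):
    """Guess the install origin of a module from its file path (Debian layout assumed)."""
    if "/site-packages/" in path:
        return "pip (virtualenv or non-Debian Python)"
    if path.startswith("/usr/local/lib/") and "/dist-packages/" in path:
        return "pip (system-wide on Debian)"
    if path.startswith("/usr/lib/python3/dist-packages/"):
        return "Debian package"
    return "unknown"


def recorded_installer(dist_name):
    """Return the INSTALLER entry of a distribution's metadata, or None if absent."""
    try:
        text = md.distribution(dist_name).read_text("INSTALLER")
    except md.PackageNotFoundError:
        return None
    return text.strip() if text else None


# Example paths, chosen to match the assumed Debian layout
print(classify_location("/usr/lib/python3/dist-packages/numpy/__init__.py"))
print(classify_location("/usr/local/lib/python3.11/dist-packages/numpy/__init__.py"))
```

Inside a container, `classify_location(numpy.__file__)` together with `numpy.show_config()` (which prints the build configuration of the installed numpy) should show which of the two builds is in use. When the pinned version equals the one already installed from Debian, pip treats the requirement as satisfied and leaves the Debian build in place, which would fit the 1.24.2 case described above.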

FROM debian:bookworm-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y make python3-dev cython3 python3-setuptools python3-wheel
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev
RUN apt-get install -q --no-install-recommends -y python3-virtualenv python3-wheel python3-scipy python3-sklearn python3-statsmodels python3-kiwisolver python3-pyparsing python3-pil python3-pip
RUN apt-get clean
RUN python3 -m pip install --break-system-packages numpy==1.24.3 pandas==1.5.3 seaborn==0.10.1 matplotlib==3.4.3
RUN rm -rf /root/.local /root/.cache /var/lib/apt/lists/deb.debian.org*
RUN apt-get purge -y build-essential
RUN apt-get autoremove -y
RUN apt-get clean