mih closed 11 months ago
reproduced the same diff as you with the python3.7 Docker image
For comparison: trying with a virtualenv, going with the latest versions that are still API-compatible with the code.
$ virtualenv --python="$(which python3)" ${HOME}/env/remodnav-repro
$ . ~/env/remodnav-repro/bin/activate
$ python -m pip install numpy scipy pandas==1.5.3 seaborn scikit-learn matplotlib==3.4.3
The previously pinned seaborn 0.10.1 is incompatible with numpy 1.26, and had to be unpinned.
...
File "/home/mih/env/remodnav-repro/lib/python3.11/site-packages/numpy/__init__.py", line 324, in __getattr__
raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'bool'.
`np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'bool_'?
Pandas had to be pinned to the last 1.x release. Pandas 2.1.1 incompatibility:
ValueError: Multi-dimensional indexing (e.g. `obj[:, None]`) is no longer supported. Convert to a numpy array before indexing instead.
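Both error messages name their own workaround. Where the offending call sites live in code one can edit (the tracebacks above are truncated, so that is an assumption), the changes are small. A minimal sketch; the helper names `make_mask` and `as_column` are hypothetical and not taken from the analysis code:

```python
import numpy as np


def make_mask(n):
    """Boolean mask without the removed `np.bool` alias."""
    # The builtin `bool` (or `np.bool_`) is accepted as a dtype by old and new numpy.
    return np.zeros(n, dtype=bool)


def as_column(values):
    """Turn 1-D data (list, ndarray, pandas Series) into an (n, 1) array."""
    # Converting to an ndarray first avoids multi-dimensional indexing on the
    # pandas object itself, which is what the pandas 2.1.1 error complains about.
    return np.asarray(values)[:, None]


mask = make_mask(3)
column = as_column([1.0, 2.0, 3.0])
print(mask.dtype, column.shape)
```

When the calls sit inside a third-party package (as with the pinned seaborn), pinning or unpinning as done here is the more practical route.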
pip freeze
gives
cycler==0.12.1
joblib==1.3.2
kiwisolver==1.4.5
matplotlib==3.4.3
numpy==1.26.0
pandas==1.5.3
Pillow==10.0.1
pyparsing==3.1.1
python-dateutil==2.8.2
pytz==2023.3.post1
scikit-learn==1.3.1
scipy==1.11.3
seaborn==0.13.0
six==1.16.0
threadpoolctl==3.2.0
This REPRODUCES all stats exactly!!!
The remaining diff is in the SVGs
img/confusion_MN_AL.svg | 24 ++++++++++++------------
img/confusion_MN_RA.svg | 24 ++++++++++++------------
img/confusion_RA_AL.svg | 24 ++++++++++++------------
img/hist_saccade_lab.svg | 8 ++++----
Trying to drill down on the SVG diff. I had the hunch that pinning the seaborn version is probably a more important aspect than being able to upgrade numpy. And indeed:
python -m pip install numpy==1.23.2 scipy pandas==1.5.3 seaborn==0.10.1 scikit-learn matplotlib==3.4.3
gives an environment that fully reproduces the stats, and the full remaining diff is:
diff --git a/img/hist_saccade_lab.svg b/img/hist_saccade_lab.svg
index 6ef426c..02bb011 100644
--- a/img/hist_saccade_lab.svg
+++ b/img/hist_saccade_lab.svg
@@ -199,16 +199,16 @@ z
<g id="patch_23">
<path clip-path="url(#p2f9441ee18)" d="M 157.6125 118.304175
L 163.1925 118.304175
-L 163.1925 117.311604
-L 157.6125 117.311604
+L 163.1925 117.460043
+L 157.6125 117.460043
z
" style="fill:#808080;"/>
</g>
<g id="patch_24">
<path clip-path="url(#p2f9441ee18)" d="M 163.1925 118.304175
L 168.7725 118.304175
-L 168.7725 117.560194
-L 163.1925 117.560194
+L 168.7725 117.411756
+L 163.1925 117.411756
z
" style="fill:#808080;"/>
</g>
Visually, this is the part of the figure that is different:
Closeups of the two versions of the figure at the difference (it is the height of the bar in the middle).
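The numbers in the hunk can be read a bit more closely. The sketch below pairs the removed and added `L x y` commands and prints how far the top edge of each bar moved. In SVG the y axis points downward, so a positive shift means a shorter bar. The function name `top_edge_shifts` is made up for this illustration:

```python
import re

# Changed path commands copied from the diff hunk above.
HUNK = """\
-L 163.1925 117.311604
-L 157.6125 117.311604
+L 163.1925 117.460043
+L 157.6125 117.460043
-L 168.7725 117.560194
-L 163.1925 117.560194
+L 168.7725 117.411756
+L 163.1925 117.411756
"""

CHANGED = re.compile(r"^([+-])L (\S+) (\S+)$")


def top_edge_shifts(hunk):
    """Return [(x, new_y - old_y)] for each removed/added pair, in hunk order."""
    removed, added = [], []
    for line in hunk.splitlines():
        match = CHANGED.match(line)
        if match is None:
            continue
        sign, x, y = match.groups()
        (removed if sign == "-" else added).append((x, float(y)))
    # Within each patch, removed and added commands appear in the same order.
    return [(x, round(y_new - y_old, 6))
            for (x, y_old), (_, y_new) in zip(removed, added)]


shifts = top_edge_shifts(HUNK)
for x, dy in shifts:
    print(x, dy)
```

The first bar shrinks by about 0.148439 and its right-hand neighbour grows by about 0.148438, while the x coordinates are untouched. That is consistent with the same bin edges and a small number of samples landing in the adjacent bin, although the diff alone does not prove it.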
Here is the relevant code that is responsible for this plot:
fig = plt.figure(figsize=(3,2))
plt.hist(ev_df['duration'].values,
         bins='doane',
         range=x_lim,
         color='gray')
         #log=True)
plt.xlabel('{} duration in s'.format(label))
plt.xlim(x_lim)
plt.ylim(y_lim)
plt.savefig(
    op.join(
        'img',
        'hist_{}_{}.svg'.format(
            label,
            ds_name)),
    transparent=True,
    bbox_inches="tight",
    metadata={'Date': None})
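Since `plt.hist` hands the binning to `numpy.histogram`, one way to take the SVG rendering out of the picture is to fingerprint the bin counts and edges directly and compare them between environments. A rough sketch; `histogram_fingerprint` is a hypothetical helper, and the random data is only a stand-in for `ev_df['duration'].values` and `x_lim`:

```python
import hashlib
import json

import numpy as np


def histogram_fingerprint(values, value_range, bins="doane"):
    """Return counts, exact bin edges and a digest for cross-environment comparison."""
    counts, edges = np.histogram(values, bins=bins, range=value_range)
    result = {
        "counts": counts.tolist(),
        # float.hex() preserves every bit of an edge, unlike decimal printing.
        "edges": [float(edge).hex() for edge in edges],
    }
    body = json.dumps(result, sort_keys=True).encode()
    result["sha256"] = hashlib.sha256(body).hexdigest()
    result["numpy"] = np.__version__  # informational, not part of the digest
    return result


# Stand-in data, NOT the study data.
rng = np.random.default_rng(0)
durations = rng.uniform(0.0, 0.5, size=200)
fingerprint = histogram_fingerprint(durations, (0.0, 0.5))
print(json.dumps(fingerprint, indent=1))
```

Running this on the real durations in two environments and comparing the `sha256` values would show whether the difference already exists before matplotlib draws anything.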
It is plain matplotlib. We know the matplotlib version that was originally used; it is included in the file's RDF metadata:
<dc:creator>
<cc:Agent>
<dc:title>Matplotlib v3.4.3, https://matplotlib.org/</dc:title>
</cc:Agent>
</dc:creator>
We have that exact version installed, but this obviously does not mean that we have the exact same binary running. Still weird to have this be the only difference.
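Whether two environments run the same binaries can be checked by hashing the compiled files of a package. A minimal sketch using only the standard library; the function names are invented here, and the suffix list is an assumption about which files count as compiled:

```python
import hashlib
from pathlib import Path

# Suffixes treated as compiled code (assumption: covers Linux, macOS, Windows).
BINARY_SUFFIXES = {".so", ".pyd", ".dylib"}


def hash_binaries(package_dir):
    """Map each compiled file below package_dir to its SHA-256 digest."""
    root = Path(package_dir)
    digests = {}
    for path in sorted(root.rglob("*")):
        # path.suffixes also catches versioned names such as libfoo.so.5
        if path.is_file() and BINARY_SUFFIXES.intersection(path.suffixes):
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_binaries(left, right):
    """Names whose digest differs or that exist on one side only."""
    return sorted(name for name in left.keys() | right.keys()
                  if left.get(name) != right.get(name))


# Usage idea: in each environment, dump
#   hash_binaries(Path(matplotlib.__file__).parent)
# to JSON, then feed both dictionaries to differing_binaries().
```

The same check applies to any other package with compiled parts, not just matplotlib.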
With this success, I am back in Docker land. Clearly the virtualenv has an impact. So let's try to put a (superfluous) virtualenv inside the docker container.
Knowing that we can use much more recent software, I am basing this on Debian bookworm and using the versions from the previous non-Docker exploration:
FROM debian:bookworm-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y make
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev
RUN apt-get install -q --no-install-recommends -y python3-virtualenv python3-wheel
RUN apt-get clean
RUN virtualenv --python="$(which python3)" /env/remodnav-repro
RUN sh -c ". /env/remodnav-repro/bin/activate; python -m pip install numpy==1.23.2 scipy pandas==1.5.3 seaborn==0.10.1 scikit-learn matplotlib==3.4.3 statsmodels"
RUN chmod -R ugo+rw /env/remodnav-repro
RUN rm -rf /root/.local /root/.cache /var/lib/apt/lists/deb.debian.org*
RUN find /env -type d -name __pycache__ -exec rm -rf {} \; -prune
RUN apt-get purge -y build-essential python3-dev
RUN apt-get clean
And indeed! It also arrives at the minimal diff shown in https://github.com/psychoinformatics-de/paper-remodnav/issues/20#issuecomment-1757462683
The image compresses down to 625MB.
I can now confirm that the presence or absence of a virtualenv is irrelevant (as it should be). Here is another configuration that achieves the diff from https://github.com/psychoinformatics-de/paper-remodnav/issues/20#issuecomment-1757462683 without any virtualenv:
FROM debian:bookworm-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y make \
    python3-dev cython3 python3-setuptools python3-wheel
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev
RUN apt-get install -q --no-install-recommends -y python3-virtualenv python3-wheel
RUN apt-get install -q --no-install-recommends -y python3-scipy
RUN apt-get install -q --no-install-recommends -y python3-sklearn
RUN apt-get install -q --no-install-recommends -y python3-statsmodels
RUN apt-get install -q --no-install-recommends -y python3-kiwisolver
RUN apt-get install -q --no-install-recommends -y python3-pyparsing
RUN apt-get install -q --no-install-recommends -y python3-pil
RUN apt-get install -q --no-install-recommends -y python3-pip
RUN apt-get clean
RUN python3 -m pip install --break-system-packages numpy==1.23.2 pandas==1.5.3 seaborn==0.10.1 matplotlib==3.4.3
RUN rm -rf /root/.local /root/.cache /var/lib/apt/lists/deb.debian.org*
RUN apt-get purge -y build-essential
RUN apt-get autoremove -y
RUN apt-get clean
Probably the final post in this saga: the trigger for the reproducibility issue is the question of who compiles numpy.
It depends on whether I am using a pip-compiled installation or one downloaded from Debian.
Either of these leads to reproducible results on its own, and does so across a wide range of versions. But there is a noticeable difference in results between these two means of compiling the sources.
Below is a complete Dockerfile for anyone interested in digging deeper. The key line is the specification of the numpy version. Whenever it differs from the numpy version provided by the respective Debian release (and it does not matter which one), pip will compile it, and it will reproduce the results published many years ago. So change numpy==1.24.3 (a version that is not in Debian bookworm) to numpy==1.24.2 (a version that is in Debian bookworm, so the already installed Debian build satisfies the requirement), and the results will not reproduce. Make it 1.24.1 and they will reproduce again, because it will also be compiled locally.
Even when I set up a system like it would have existed at the time of publication (Debian buster), the results do not reproduce, unless pip compiles numpy.
FROM debian:bookworm-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq -y --allow-releaseinfo-change
RUN apt-get install -q --no-install-recommends -y make \
    python3-dev cython3 python3-setuptools python3-wheel
RUN apt-get install -q --no-install-recommends -y build-essential python3-dev
RUN apt-get install -q --no-install-recommends -y python3-virtualenv python3-wheel python3-scipy python3-sklearn python3-statsmodels python3-kiwisolver python3-pyparsing python3-pil python3-pip
RUN apt-get clean
RUN python3 -m pip install --break-system-packages numpy==1.24.3 pandas==1.5.3 seaborn==0.10.1 matplotlib==3.4.3
RUN rm -rf /root/.local /root/.cache /var/lib/apt/lists/deb.debian.org*
RUN apt-get purge -y build-essential
RUN apt-get autoremove -y
RUN apt-get clean
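To see which of the two numpy builds a given container actually ends up importing, the install location is a useful hint. The sketch below relies on the usual Debian layout (distribution packages under /usr/lib/python3/dist-packages, system-wide pip installs under /usr/local) and on the `INSTALLER` metadata file that pip writes. Both are heuristics, and `classify_origin` and `describe` are names made up for this example:

```python
import importlib.metadata
import importlib.util
from pathlib import PurePosixPath


def classify_origin(module_path):
    """Guess who provided a module from its path (Debian layout assumed)."""
    parts = PurePosixPath(module_path).parts
    if parts[:4] == ("/", "usr", "lib", "python3") and "dist-packages" in parts:
        return "debian package"
    if parts[:3] == ("/", "usr", "local"):
        return "pip (system-wide)"
    if "site-packages" in parts:
        return "pip (virtualenv or user)"
    return "unknown"


def describe(name="numpy"):
    """Report version, location, recorded installer and guessed origin."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return {"name": name, "installed": False}
    info = {"name": name, "installed": True, "path": spec.origin,
            "origin": classify_origin(spec.origin)}
    try:
        dist = importlib.metadata.distribution(name)
    except importlib.metadata.PackageNotFoundError:
        return info  # importable, but no distribution metadata found
    info["version"] = dist.version
    # pip records itself in an INSTALLER file; other installers may not.
    info["installer"] = (dist.read_text("INSTALLER") or "").strip() or None
    return info


print(describe("numpy"))
```

Inside the container, `python3 -c "import numpy; numpy.show_config()"` additionally prints the build configuration, including the BLAS/LAPACK libraries, which may be one place where the two builds differ.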
TL;DR Versions did not matter, time did not matter, but it matters who compiles numpy
Five years after we did this analysis, I am trying to compile a Docker-based environment. I can successfully build the stats and figures in a wide variety of configurations. However, there are small differences.
I am collecting some notes here, trying to narrow down on a setup that reproduces the stats exactly:
Debian buster PY3.7
pip freeze
gives
After running the analysis, the following diff occurs
Debian bullseye PY3.9
pip freeze
gives
The diff of the statistical scores is identical compared to the bullseye container. Also the same SVGs are modified (they also look identical inside).
Ubuntu focal PY3.8
pip freeze
gives
The diff of the statistical scores is identical compared to the bullseye and buster containers. Also the same SVGs are modified (they also look identical inside).
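Comparing `pip freeze` listings across containers by eye gets tedious, so a small helper can list only the packages whose versions differ. A sketch with hypothetical function names; the two excerpts reuse version numbers quoted elsewhere in this thread (the virtualenv freeze and the pinned install command), not the container listings:

```python
def parse_freeze(text):
    """Parse `pip freeze` output into {lowercased name: version}."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments and non-pinned entries
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def freeze_diff(left, right):
    """Return {name: (left_version, right_version)} for differing packages."""
    a, b = parse_freeze(left), parse_freeze(right)
    return {name: (a.get(name), b.get(name))
            for name in sorted(a.keys() | b.keys())
            if a.get(name) != b.get(name)}


unpinned = "numpy==1.26.0\npandas==1.5.3\nseaborn==0.13.0\nmatplotlib==3.4.3"
pinned = "numpy==1.23.2\npandas==1.5.3\nseaborn==0.10.1\nmatplotlib==3.4.3"
for name, versions in freeze_diff(unpinned, pinned).items():
    print(name, *versions)
```

A package missing on one side shows up with `None` as its version.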
Conclusions