Open chuckwondo opened 4 months ago
@jsignell, when you're ready to dig into this, let me know. We can jump on a quick call so I can orient you and save you the time of deciphering things on your own.
Thanks for the ping. I will probably not get to this until a week or two from now.
Cool. No rush.
There are three possible explanations as far as I can tell:

1. The resolved environments are different enough that the results end up being substantially different.
2. The results are different, but not meaningfully so (they are similar within a certain tolerance threshold).
3. The inputs are getting passed in differently (meaning the libraries themselves might not differ, just the input mechanism).
To get a better sense of what is going on, I created the initial environments using pip+docker for ESA and mamba for NASA. The resolved environments appear fairly similar. The most interesting differences are the nvidia packages and Pillow, but I haven't dug into whether those are being used.
Package | NASA - conda | ESA - pip+docker |
---|---|---|
Brotli | 1.1.0 | --- |
certifi | 2024.6.2 | 2024.6.2 |
charset-normalizer | 3.3.2 | 3.3.2 |
cloudpickle | 3.0.0 | 3.0.0 |
Cython | 3.0.10 | --- |
GDAL | 3.8.5 | 3.8.5 |
idna | 3.7 | 3.7 |
Jinja2 | 3.1.4 | 3.1.4 |
markdown-it-py | 3.0.0 | 3.0.0 |
MarkupSafe | 2.1.5 | 2.1.5 |
mdurl | 0.1.2 | 0.1.2 |
numpy | 1.26.4 | 1.26.4 |
nvidia-ml-py | --- | 12.555.43 |
nvidia-ml-py3 | --- | 7.352.0 |
Pillow | --- | 9.0.1 |
pip | 24.0 | 22.0.2 |
psutil | 5.9.8 | 5.9.8 |
Pygments | 2.18.0 | 2.18.0 |
pynvml | 11.4.1 | --- |
PySocks | 1.7.1 | --- |
requests | 2.32.3 | 2.32.3 |
rich | 13.7.1 | 13.7.1 |
sardem | 0.11.3 | 0.11.3 |
scalene | 1.5.38 | 1.5.42 |
setuptools | 70.0.0 | 59.6.0 |
typing_extensions | 4.12.2 | --- |
urllib3 | 2.2.1 | 2.2.2 |
wheel | 0.43.0 | 0.37.1 |
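A comparison like the table above can be reproduced with a small script. The following is a minimal sketch, assuming both environments can be exported as `name==version` lines (for example with `pip list --format=freeze`); the function names `parse_freeze` and `diff_environments` are hypothetical, and the sample strings only repeat a few rows from the table.

```python
import re


def parse_freeze(text):
    """Parse `name==version` lines into {normalized_name: version}."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a pinned version.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Normalize so that e.g. typing_extensions and typing-extensions match.
        packages[re.sub(r"[-_.]+", "-", name).lower()] = version
    return packages


def diff_environments(left, right):
    """Return {name: (left_version, right_version)} for packages that differ.

    A missing package is reported as None on the side where it is absent.
    """
    return {
        name: (left.get(name), right.get(name))
        for name in sorted(set(left) | set(right))
        if left.get(name) != right.get(name)
    }


# Sample input copied from a few rows of the table (not a full export).
nasa = parse_freeze("numpy==1.26.4\nscalene==1.5.38\nurllib3==2.2.1\npynvml==11.4.1\n")
esa = parse_freeze("numpy==1.26.4\nscalene==1.5.42\nurllib3==2.2.2\nPillow==9.0.1\n")

for name, (nasa_version, esa_version) in diff_environments(nasa, esa).items():
    print(f"{name}: NASA={nasa_version or '---'} ESA={esa_version or '---'}")
```

Only the packages that differ are printed, which keeps the focus on candidates worth investigating.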
Next I loaded the tifs in numpy and ran `allclose`. They are indeed substantially different:
import numpy as np
from osgeo import gdal
esa = np.array(gdal.Open('./output/esa/dem.tif').ReadAsArray())
nasa = np.array(gdal.Open('./output/dem.tif').ReadAsArray())
np.allclose(esa, nasa) # False
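Because `np.allclose` returns only a single boolean, it cannot by itself separate explanation 1 from explanation 2. A minimal sketch of a more detailed summary follows, assuming both rasters have already been read into arrays as above. The helper name `summarize_difference` is hypothetical, the tolerances are simply NumPy's defaults, and the arrays in the usage example are synthetic, not the actual DEMs.

```python
import numpy as np


def summarize_difference(a, b, rtol=1e-5, atol=1e-8):
    """Summarize how two arrays differ instead of returning one boolean."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        # Different shapes cannot be compared element by element.
        return {"same_shape": False, "shape_a": a.shape, "shape_b": b.shape}
    diff = np.abs(a - b)
    # equal_nan=True treats NaN in the same position in both arrays as a match.
    close = np.isclose(a, b, rtol=rtol, atol=atol, equal_nan=True)
    return {
        "same_shape": True,
        "max_abs_diff": float(np.nanmax(diff)),
        "mean_abs_diff": float(np.nanmean(diff)),
        "fraction_not_close": float(1.0 - close.mean()),
    }


# Synthetic stand-ins for the two rasters: one pixel differs by 0.5.
a = np.array([[100.0, 200.0], [300.0, 400.0]])
b = a.copy()
b[0, 0] += 0.5

print(summarize_difference(a, b))
```

A small maximum difference spread over many pixels would point toward floating-point noise, while large differences or a shape mismatch would suggest looking harder at the inputs.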
I am still working on trying to pare down the run scripts to see if the inputs are getting passed in differently or something.
The solution for #7 will be to unify dependency management between NASA and ESA, but we still want to know precisely which dependency(ies) caused the difference in outputs so we have a proper understanding of what specifically would cause such an issue, whether it be some change in underlying floating-point handling/precision, or otherwise.