single-cell-data / SOMA

A flexible and extensible API for annotated 2D matrix data stored in multiple underlying formats.
MIT License

Support Python 3.12 #186

Closed: johnkerl closed this 7 months ago

johnkerl commented 7 months ago

https://github.com/single-cell-data/TileDB-SOMA/issues/1849

Note: I ran python-spec/update-requirements-txt

bkmartinjr commented 7 months ago

Question: if you ran `update-requirements-txt` with the mods to include Python 3.11 and 3.12, shouldn't there be two additional files included in this PR (requirements-py3.11.txt and requirements-py3.12.txt)?

johnkerl commented 7 months ago

> Question: if you ran `update-requirements-txt` with the mods to include Python 3.11 and 3.12, shouldn't there be two additional files included in this PR (requirements-py3.11.txt and requirements-py3.12.txt)?

Yes and I watched myself typing git add of them ... 🤔

johnkerl commented 7 months ago

> Also, is it worth updating the python-somacore.yaml to include python 3.11 and 3.12?

Thanks for seeing what I did not!

thetorpedodog commented 7 months ago

This is a much bigger change than I would expect. Was this run on a Linux machine or on macOS? Running it right now on Linux, I get the diff in commit 7e3adfe95cd032ce644e15a96f0daed1005b7151.

johnkerl commented 7 months ago

@thetorpedodog I ran it on an Ubuntu EC2 instance.

I'm happy to commit whatever. Just trying to do what I thought was right.

johnkerl commented 7 months ago

@thetorpedodog https://github.com/single-cell-data/SOMA/pull/186/commits/4551a830274f8fc0dc3bb80397553ddb1850d858

thetorpedodog commented 7 months ago

> I'm happy to commit whatever. Just trying to do what I thought was right.

I am puzzled. I’m sure you did everything right, and everything else looks normal, but the reason that Python 3.10 (and only Python 3.10???) has (a) so many additional packages and (b) a bunch of downgrades is beyond me. The other thing I can think of: is this an AMD64 machine, or AArch64? (But even then, why would it be so different??????)

johnkerl commented 7 months ago

@thetorpedodog I don't know the right thing to do to merge this PR.

johnkerl commented 7 months ago

@thetorpedodog I applied your diff in https://github.com/single-cell-data/SOMA/pull/186/commits/4551a830274f8fc0dc3bb80397553ddb1850d858

Does that unblock us?

johnkerl commented 7 months ago

> I am puzzled. I’m sure you did everything right, and everything else looks normal, but the reason that Python 3.10 (and only Python 3.10???) has (a) so many additional packages and (b) a bunch of downgrades is beyond me. The other thing I can think of: is this an AMD64 machine, or AArch64? (But even then, why would it be so different??????)

@thetorpedodog here are the specs:

ubuntu@wihti[prod][][~]$ python --version
Python 3.10.12

ubuntu@wihti[prod][][~]$ uname -a
Linux ip-172-31-85-136 6.2.0-1018-aws #18~22.04.1-Ubuntu SMP Wed Jan 10 22:54:16 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@wihti[prod][][~]$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:    22.04
Codename:   jammy

That machine already had Python 3.10 and tiledbsoma installed before I ran your script.

Should I add a comment that python-spec/update-requirements-txt should be run from a clean Docker image perhaps?

thetorpedodog commented 7 months ago

> That machine already had Python 3.10 and tiledbsoma installed before I ran your script.
>
> Should I add a comment that python-spec/update-requirements-txt should be run from a clean Docker image perhaps?

It shouldn’t need to, at least as far as I am aware, because I thought Conda provided enough isolation. I’ll see what happens when running it in a couple of different Docker containers.
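
One thing that could be checked on the original machine is whether the interpreter inside the Conda environment can still see packages from outside it. The sketch below is only an illustration and not a diagnosis confirmed anywhere in this thread: it assumes the relevant leak paths are the per-user site-packages directory (which is keyed by Python minor version, so packages installed there can be visible to any interpreter of the same minor version) and the PYTHONPATH variable. The name `isolation_report` is hypothetical.

```python
import os
import site
import sys


def isolation_report():
    """Collect settings that can let packages from outside the active environment be seen.

    Assumption: user site-packages and PYTHONPATH are the paths worth checking;
    nothing in the thread confirms either was the cause.
    """
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        "pythonpath": os.environ.get("PYTHONPATH"),
        # ENABLE_USER_SITE may be True, False, or None; collapse it to a bool.
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        "user_site_dir": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for key, value in isolation_report().items():
        print(f"{key}: {value}")
```

Run inside each environment the update script creates, this prints where the interpreter lives and whether a user site directory is active. If `user_site_enabled` is true and `user_site_dir` contains packages, setting the standard `PYTHONNOUSERSITE` environment variable before rerunning would be one way to rule that path out.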

johnkerl commented 7 months ago

Thanks @thetorpedodog, but I know your time is very valuable, and debugging this is not high-pri for me. I just want to be very clear about what I'm not asking of your time.

thetorpedodog commented 7 months ago

I made this little script and put it at playground/run-in-container.sh:

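#!/bin/sh
# Regenerate the pinned requirements inside a fresh container and print the resulting diff.

# Install the base tooling that the bare distro image lacks.
apt-get update
apt-get -y install python-is-python3 python3-pip git wget
apt-get -y upgrade

# Install Miniconda in batch mode (-b) and load conda's shell hook.
cd /tmp
wget -O miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ./miniconda.sh
./miniconda.sh -b
eval "$(/root/miniconda3/bin/conda shell.bash hook 2>/dev/null)"

# Check out the branch under test and rerun the update script.
git clone -b original-py3.12-branch https://github.com/single-cell-data/SOMA.git
cd SOMA
./python-spec/update-requirements-txt

# Print the diff between banner lines so it is easy to find in the log.
echo "####################################################################"
echo
echo
git --no-pager diff
echo
echo
echo "####################################################################"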

original-py3.12-branch refers to commit 67e9c40ba2f32fbe18243ee24749e9ce2832928f

and ran it like:

$ export DISTRO=ubuntu:22.04
$ docker run -it --rm --mount type=bind,source="$(pwd)/playground/run-in-container.sh",target=/opt/do-stuff,readonly "$DISTRO" /opt/do-stuff

and the diff it output looked like this, on both ubuntu:22.04 and debian:12 (there may have been some slight differences in what versions were installed in the end but the list of + and − packages was the same):

diff --git a/python-spec/requirements-py3.10.txt b/python-spec/requirements-py3.10.txt
index 52fb7e9..ec1f979 100644
--- a/python-spec/requirements-py3.10.txt
+++ b/python-spec/requirements-py3.10.txt
@@ -1,37 +1,19 @@
-anndata==0.9.1
-attrs==23.1.0
-cloudpickle==2.2.1
-contourpy==1.1.0
-cycler==0.11.0
-fonttools==4.40.0
-h5py==3.9.0
-joblib==1.2.0
-kiwisolver==1.4.4
-llvmlite==0.39.1
-matplotlib==3.7.1
+anndata==0.10.5.post1
+array_api_compat==1.4.1
+attrs==23.2.0
+exceptiongroup==1.2.0
+h5py==3.10.0
+llvmlite==0.42.0
 natsort==8.4.0
-networkx==3.1
-numba==0.56.4
-numpy==1.23.5
-packaging==23.1
-pandas==1.5.3
-patsy==0.5.3
-Pillow==9.5.0
-pyarrow==12.0.1
+numba==0.59.0
+numpy==1.26.4
+packaging==23.2
+pandas==2.2.0
+pyarrow==15.0.0
 pyarrow-hotfix==0.6
-pynndescent==0.5.10
 python-dateutil==2.8.2
 pytz==2024.1
-scanpy==1.9.3
-scikit-learn==1.2.2
-scipy==1.10.1
-seaborn==0.12.2
-session-info==1.0.0
+scipy==1.12.0
 six==1.16.0
-statsmodels==0.14.0
-stdlib-list==0.9.0
-tblib==1.7.0
-threadpoolctl==3.1.0
-tqdm==4.65.0
-typing_extensions==4.6.3
-umap-learn==0.5.3
+typing_extensions==4.9.0
+tzdata==2024.1
diff --git a/python-spec/requirements-py3.12.txt b/python-spec/requirements-py3.12.txt
index b23699f..2bb7667 100644
--- a/python-spec/requirements-py3.12.txt
+++ b/python-spec/requirements-py3.12.txt
@@ -13,8 +13,8 @@ pyarrow-hotfix==0.6
 python-dateutil==2.8.2
 pytz==2024.1
 scipy==1.12.0
-setuptools==69.0.3
+setuptools==68.2.2
 six==1.16.0
 typing_extensions==4.9.0
 tzdata==2024.1
-wheel==0.42.0
+wheel==0.41.2
diff --git a/python-spec/requirements-py3.7.txt b/python-spec/requirements-py3.7.txt
index fcdd4b3..7980211 100644
--- a/python-spec/requirements-py3.7.txt
+++ b/python-spec/requirements-py3.7.txt
@@ -1,5 +1,6 @@
 anndata==0.8.0
 attrs==23.2.0
+certifi @ file:///croot/certifi_1671487769961/work/certifi
 h5py==3.8.0
 importlib-metadata==6.7.0
 llvmlite==0.39.1

…which is to say that, even within an empty container, I couldn’t get the same package combinations you were getting and the end result is closer to what I had gotten by running it on my own regular machine. Does the result end up looking any different if you run it in a container on yours?
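
To compare "the list of + and − packages" between two runs without reading diffs by eye, something like the following could help. It is a minimal sketch, not part of the repository: `parse_pins` and `compare_pins` are hypothetical helper names, and the parsing assumes `pip freeze`-style lines (`name==version` or `name @ location`). The sample data is an excerpt of the requirements-py3.10.txt diff above.

```python
import re


def parse_pins(text):
    """Map normalized package name -> pinned version (or location) from freeze-style text."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "==" in line:
            name, version = line.split("==", 1)
        elif " @ " in line:
            name, version = line.split(" @ ", 1)
        else:
            name, version = line, ""
        # Normalize names so that e.g. Pillow/pillow or typing_extensions/typing-extensions match.
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        pins[key] = version.strip()
    return pins


def compare_pins(old_text, new_text):
    """Return packages added, removed, and changed between two pinned requirement sets."""
    old, new = parse_pins(old_text), parse_pins(new_text)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": {
            name: (old[name], new[name])
            for name in sorted(old.keys() & new.keys())
            if old[name] != new[name]
        },
    }


# Excerpt of the py3.10 diff: "-" lines are the committed file, "+" lines the container run.
COMMITTED = """\
anndata==0.9.1
attrs==23.1.0
natsort==8.4.0
scanpy==1.9.3
"""
REGENERATED = """\
anndata==0.10.5.post1
attrs==23.2.0
natsort==8.4.0
tzdata==2024.1
"""

result = compare_pins(COMMITTED, REGENERATED)
print(result["added"])    # packages only in the regenerated file
print(result["removed"])  # packages only in the committed file
print(result["changed"])  # packages whose pin differs
```

Comparing the `added` and `removed` lists from two machines shows whether they disagree about which packages are present at all, as opposed to merely resolving different versions of the same set.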

johnkerl commented 7 months ago

@thetorpedodog full log: https://gist.github.com/johnkerl/2dd8f2a19db99c601865e42e242d29b9

Just the diff part: https://gist.github.com/johnkerl/39c1829377265bf6dc41f691167bea24