On this version of anndata (0.10.5.post1, installed with -c conda-forge), some obs have been losing their values. I checked, and it starts exactly at position 694000 (with more than 1k cells having zero counts). When I plot np.sum(adata[693000:695000,:].X, axis=1), rows 694000 and above are zeros (counts come back at an unknown location, 2k or 3k further up, I believe). This is within a single sample (so not at a boundary between merged samples). This sample was QC'd beforehand, with all obs filtered to >1000 counts. I repeated the concat on disk and got the same problem with this version of anndata.
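To pin down exactly where the zeros start and where counts come back, a small helper like the one below might help. It is only a sketch: zero_runs is a hypothetical name, not part of anndata, and it assumes the per-row totals have already been pulled out as a flat sequence (the commented lines show one way to do that for the slice mentioned above).

```python
def zero_runs(row_sums):
    """Return (start, stop) pairs, stop exclusive, for each contiguous run of zero totals."""
    runs = []
    start = None
    for i, total in enumerate(row_sums):
        if total == 0 and start is None:
            start = i                      # a run of zeros begins here
        elif total != 0 and start is not None:
            runs.append((start, i))        # the run ended just before i
            start = None
    if start is not None:
        runs.append((start, i + 1))        # the run reaches the end of the input
    return runs


# Assumed usage with the slice from this post (needs numpy and a loaded adata):
#   offset = 693000
#   sums = np.asarray(adata[offset:695000, :].X.sum(axis=1)).ravel()
#   print([(a + offset, b + offset) for a, b in zero_runs(sums)])
print(zero_runs([5, 0, 0, 3, 0]))
```

Adding the slice offset back to each pair gives the global positions of the affected rows, which can then be compared against adata.obs to see which sample they belong to.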
I believe that one outlier comes from converting to one-to-one orthologues, which must have dropped a lot of non-evolutionarily-conserved genes... ahh, I don't know now, because it has zero counts. I'll look into it.
This post has been through many edits; I should have waited before posting. But the code below shows no spots with counts below 1,000, yet the problem remains and is messing up my code downstream. I will continue to debug.
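import numpy as np

lowest_values = []
# Scan the observations in chunks of 100,000 and record the smallest per-row total in each chunk.
for n in range(0, 1046120, 100000):
    chunk = adata[n:n+100000, :].X
    # Flatten so this works for dense arrays and for sparse matrices (whose .sum returns a 2-D result).
    row_sums = np.asarray(chunk.sum(axis=1)).ravel()
    min_index = row_sums.argmin()
    lowest_value = row_sums[min_index]
    lowest_values.append(lowest_value)
print(lowest_values)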
I can't find that cell. It might be some other problem in my code. I think this is good for now, so I'm closing it.
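For anyone who hits this later: the loop above computes min_index but only keeps the value, so it cannot say which cell is the lowest. A variant that keeps the global position could look like the sketch below. chunk_minima is a hypothetical helper, not an anndata function, and it assumes the per-row totals are available as a flat list or 1-D array.

```python
def chunk_minima(row_sums, chunk_size):
    """Return (global_index, value) for the smallest entry in each chunk."""
    minima = []
    for start in range(0, len(row_sums), chunk_size):
        chunk = row_sums[start:start + chunk_size]
        # Position of the minimum within this chunk
        local = min(range(len(chunk)), key=chunk.__getitem__)
        # Shift by the chunk start to get the position in the full data
        minima.append((start + local, chunk[local]))
    return minima


print(chunk_minima([5, 3, 9, 0, 7, 2, 8], 3))
```

The global index can then be used to look up the cell directly, for example with adata.obs_names[index].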
Versions
-----
anndata 0.10.8
numpy 1.26.3
pandas 2.2.0
scanpy 1.9.8
session_info 1.0.0
-----
PIL 10.2.0
anyio NA
array_api_compat 1.4.1
asttokens NA
attr 23.1.0
attrs 23.1.0
babel 2.11.0
brotli 1.1.0
certifi 2024.07.04
cffi 1.16.0
charset_normalizer 3.3.2
colorama 0.4.6
comm 0.1.2
cycler 0.12.1
cython_runtime NA
dateutil 2.8.2
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
executing 0.8.3
fastjsonschema NA
google NA
h5py 3.10.0
idna 3.6
igraph 0.11.4
ipykernel 6.28.0
jedi 0.18.1
jinja2 3.1.3
joblib 1.3.2
json5 NA
jsonschema 4.19.2
jsonschema_specifications NA
jupyter_events 0.8.0
jupyter_server 2.10.0
jupyterlab_server 2.25.1
kiwisolver 1.4.5
leidenalg 0.10.2
llvmlite 0.41.1
markupsafe 2.1.4
matplotlib 3.8.2
mpl_toolkits NA
natsort 8.4.0
nbformat 5.9.2
numba 0.58.1
overrides NA
packaging 23.2
parso 0.8.3
pexpect 4.9.0
pkg_resources NA
platformdirs 3.11.0
prometheus_client NA
prompt_toolkit 3.0.43
psutil 5.9.8
ptyprocess 0.7.0
pure_eval 0.2.2
pydev_ipython NA
pydevconsole NA
pydevd 2.9.5
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.17.2
pyparsing 3.1.1
pythonjsonlogger NA
pytz 2023.4
referencing NA
requests 2.31.0
rfc3339_validator 0.1.4
rfc3986_validator 0.1.1
rpds NA
scipy 1.12.0
send2trash NA
six 1.16.0
sklearn 1.4.0
sniffio 1.3.0
socks 1.7.1
stack_data 0.2.0
texttable 1.7.0
threadpoolctl 3.2.0
torch 2.3.1.post100
torchgen NA
tornado 6.3.3
tqdm 4.66.1
traitlets 5.14.1
typing_extensions NA
urllib3 2.2.0
wcwidth 0.2.13
websocket 1.7.0
yaml 6.0.1
zmq 25.1.2
zoneinfo NA
-----
IPython 8.20.0
jupyter_client 8.6.0
jupyter_core 5.5.0
jupyterlab 4.0.8
-----
Python 3.11.7 | packaged by conda-forge | (main, Dec 23 2023, 14:43:09) [GCC 12.3.0]
Linux-6.1.0-23-cloud-amd64-x86_64-with-glibc2.36
-----
Session information updated at 2024-08-19 02:43
As far as I can tell, this problem was reported here before but never followed up on: https://discourse.scverse.org/t/counts-in-layers-is-zero-after-ad-concat/1999
I then updated to version 0.10.8, and the problem was fixed! I confirmed the fix with the chunked row-sum check shown earlier in this post.