Closed: ccruizm closed this issue 8 months ago
I am also running into a similar issue where the imputation contains only NaN values. During training the loss starts out as a real number and then becomes NaN after some number of steps (see the training log below).
I am running:
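import os
# select the first GPU
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# run ENVI (ENVI, spatial_data and sc_data are imported/loaded as in the tutorial)
ENVI_Model = ENVI.ENVI(spatial_data=spatial_data, sc_data=sc_data)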
ENVI_Model.Train()
ENVI_Model.impute()
# get imputation
imputed = ENVI_Model.spatial_data.obsm['imputation']
This is a snippet of the training log. Training completes without errors, but the resulting imputation contains only NaNs:
Trn: spatial Loss: -8.73282, SC Loss: -0.54686, Cov Loss: -0.01238, KL Loss: 0.91851: 0%| | 63/16384 [00:20<1:15:04, 3.62it/s]
Trn: spatial Loss: -8.73282, SC Loss: -0.54686, Cov Loss: -0.01238, KL Loss: 0.91851: 0%| | 64/16384 [00:21<1:13:31, 3.70it/s]
Trn: spatial Loss: nan, SC Loss: nan, Cov Loss: nan, KL Loss: nan: 0%| | 64/16384 [00:21<1:13:31, 3.70it/s]
Trn: spatial Loss: nan, SC Loss: nan, Cov Loss: nan, KL Loss: nan: 0%| | 65/16384 [00:21<1:22:38, 3.29it/s]
Trn: spatial Loss: nan, SC Loss: nan, Cov Loss: nan, KL Loss: nan: 0%| | 66/16384 [00:21<1:20:49, 3.37it/s]
The inputs are the same type as in the tutorial: spatial_data.X and sc_data.X are both dense float32 numpy arrays. It would be useful to get some insight into why this failure occurs and whether additional preprocessing of the data is necessary to get ENVI to run on these inputs.
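For anyone hitting the same thing, here is a minimal sketch of the kind of input check that can rule out bad values before training. It uses only NumPy and is not part of ENVI; the helper name `summarize_matrix` and the toy array are hypothetical stand-ins for `spatial_data.X` and `sc_data.X`. Whether any of these properties is what actually triggers the NaN loss is not established in this thread.

```python
import numpy as np

def summarize_matrix(name, X):
    """Collect basic properties of an expression matrix (hypothetical helper)."""
    X = np.asarray(X)
    return {
        "name": name,
        "dtype": str(X.dtype),
        "shape": X.shape,
        "n_nan": int(np.isnan(X).sum()),    # NaN entries already in the input
        "n_inf": int(np.isinf(X).sum()),    # +/- infinity in the input
        "n_negative": int((X < 0).sum()),   # negative values are unexpected for counts
        "max": float(np.nanmax(X)),         # largest finite-or-inf value, ignoring NaN
    }

# Toy stand-in for spatial_data.X with one NaN planted in it.
toy = np.array([[0.0, 3.0], [np.nan, 1.0]], dtype=np.float32)
report = summarize_matrix("spatial", toy)
print(report)
```

Running the same helper on both matrices before calling Train() at least tells you whether the NaNs come from the data or appear only during optimization.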
I am also running into similar issues getting ENVI to run on a GPU using only the setup specified in the tutorial examples.
I had the same problem when I tried to run the tutorial on my MBP with an M1 Pro chip. When I ran the tutorial on a cluster, though, it did work. What architecture are you dealing with?
I had the same problem described here. All losses appear to be NaN right from the beginning. I was able to run the tutorial successfully, but when I switched to my own data, it did not work. I passed raw counts to the model for both spatial_data and sc_data. I would greatly appreciate any insight into what might be causing this. Thank you!
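One generic thing worth checking with raw counts is whether any cell or gene is entirely zero, since a zero total can turn into a division by zero or a log of zero in many normalization steps. This is an assumption about a possible cause, not something confirmed for ENVI in this thread. The sketch below uses only NumPy; `find_empty_rows_and_columns` and the toy matrix are hypothetical.

```python
import numpy as np

def find_empty_rows_and_columns(counts):
    """Return indices of all-zero cells (rows) and all-zero genes (columns)."""
    counts = np.asarray(counts)
    empty_cells = np.flatnonzero(counts.sum(axis=1) == 0)
    empty_genes = np.flatnonzero(counts.sum(axis=0) == 0)
    return empty_cells, empty_genes

# Toy raw-count matrix: cell 1 and gene 1 have no counts at all.
toy_counts = np.array([
    [5, 0, 2],
    [0, 0, 0],
    [1, 0, 4],
])
empty_cells, empty_genes = find_empty_rows_and_columns(toy_counts)
print("empty cells:", empty_cells.tolist())
print("empty genes:", empty_genes.tolist())
```

If either list is non-empty for your own data, filtering those cells or genes out before building the model is a cheap experiment.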
Hi everyone!
We just released an updated version of ENVI based on JAX/Flax instead of TensorFlow, and we took care of the stability issues!
Please try again with the new version!
Good day!
I want to test your tool on my own dataset. It is not clear what the input for the pipeline should be (e.g., raw vs. normalized counts for sc and st). Based on the datasets you tested the pipeline on, it looks like we need to start with raw counts stored in a dense matrix. I think I have formatted my data to meet ENVI's requirements, but when I run
ENVI_Model.Train
I get the following output:
The 'envi_latent' contains only NaN. What am I doing wrong? I could share a subset of the dataset to help figure out where the issue lies.
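A minimal sketch of how the "dense matrix" requirement and the NaN check on the latent could be scripted, assuming nothing beyond NumPy. The helpers `to_dense_float32` and `nan_fraction` are hypothetical and not part of ENVI; sparse inputs are detected by duck typing on `.toarray()`, which SciPy sparse matrices provide.

```python
import numpy as np

def to_dense_float32(X):
    """Convert a sparse or array-like matrix to a dense float32 NumPy array."""
    if hasattr(X, "toarray"):  # SciPy sparse matrices expose .toarray()
        X = X.toarray()
    return np.ascontiguousarray(X, dtype=np.float32)

def nan_fraction(A):
    """Fraction of entries in A that are NaN (0.0 = clean, 1.0 = all NaN)."""
    return float(np.isnan(np.asarray(A, dtype=np.float64)).mean())

# Toy stand-ins: a small count matrix and a latent that came back all NaN.
dense = to_dense_float32([[1, 2], [3, 4]])
toy_latent = np.full((2, 3), np.nan)

print(dense.dtype, dense.shape)
print("NaN fraction in latent:", nan_fraction(toy_latent))
```

With an AnnData object the same calls would be applied to its `.X` and to `.obsm['envi_latent']`; a fraction of exactly 1.0 matches the symptom described above.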
Another issue I have is setting up the tool to use GPU (A100). I see that my device is available:
I set
But still, ENVI does not recognize the GPU. Do you have any advice on how to fix this?
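For the GPU part, one thing that is easy to get wrong is ordering: `CUDA_VISIBLE_DEVICES` only has an effect if it is set before the deep learning framework is imported. The sketch below sets the variables and then asks JAX which devices it sees. It assumes the JAX-based version of ENVI mentioned earlier in the thread; `select_gpu` and `visible_jax_devices` are hypothetical helpers, and only `jax.devices()` is a real JAX call. For the older TensorFlow version, the analogous query is `tf.config.list_physical_devices('GPU')`.

```python
import os

def select_gpu(index=0):
    """Set the CUDA device variables. Call this before importing JAX or TensorFlow."""
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]

def visible_jax_devices():
    """Return the devices as JAX reports them, or [] if JAX cannot be used."""
    try:
        import jax
        return [str(d) for d in jax.devices()]
    except Exception:
        # JAX is not installed or its backend failed to initialize.
        return []

selected = select_gpu(0)
print("CUDA_VISIBLE_DEVICES =", selected)
print("JAX devices:", visible_jax_devices())
```

If the list shows only CPU devices even though the GPU is visible to the system, the installed JAX build is probably a CPU-only one, which is independent of ENVI itself.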
Thanks in advance!