AttributeError: 'AnnData' object has no attribute 'collator_fn'

JesusGF1 commented 1 month ago

Hello guys, the package is great and makes it super simple to deploy foundational models. I have been running into this issue trying to run UCE. Any idea on how to fix this?

AttributeError                            Traceback (most recent call last)
Cell In[30], line 2
      1 train_data = uce.process_data(adata)
----> 2 ref_embeddings = uce.get_embeddings(adata)

File ~/opt/miniconda3/envs/helical-package/lib/python3.11/site-packages/helical/models/uce/model.py:175, in UCE.get_embeddings(self, dataset)
    158 """Gets the gene embeddings from the UCE model
    159
    160 Parameters
   (...)
    168 The gene embeddings in the form of a numpy array
    169 """
    171 batch_size = self.config["batch_size"]
    172 dataloader = DataLoader(dataset,
    173                         batch_size=batch_size,
    174                         shuffle=False,
--> 175                         collate_fn=dataset.collator_fn,
    176                         num_workers=0)
    179 if self.accelerator is not None:
    180     dataloader = self.accelerator.prepare(dataloader)

The code I am using follows the tutorial:

adata = ad.read_h5ad("path_to_file")
device = "cuda" if torch.cuda.is_available() else "cpu"
model_config = UCEConfig(batch_size=5)
uce = UCE(configurer=model_config)
train_data = uce.process_data(adata)
ref_embeddings = uce.get_embeddings(adata)

maxiallard commented 1 month ago

Thanks for raising this! We'll look into it asap :)

JadSbai commented 1 month ago

Hi @JesusGF1,

Thank you for raising this issue. I've reviewed your code, and I believe I've identified the problem.

The error occurs because you're passing the adata object to the get_embeddings function instead of the train_data object. Here's why this matters:

The train_data object is a UCEDataset instance, which has all the necessary attributes and functions required by the model, including the collator_fn.
The adata object is an unprocessed AnnData object, which doesn't have these required attributes, hence the AttributeError you're encountering.

To resolve this, please modify your code as follows:

train_data = uce.process_data(adata)
ref_embeddings = uce.get_embeddings(train_data)  # Use train_data instead of adata

This change should resolve the AttributeError you're experiencing.
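
To illustrate why the argument type matters, here is a minimal, self-contained sketch of the same failure mode. The names RawData, ProcessedDataset, process_data and get_embeddings below are hypothetical stand-ins written for this example; they are not the helical or AnnData implementations, and only mimic the attribute access shown in the traceback.

```python
# Hypothetical stand-ins (NOT the helical classes) that reproduce the same
# AttributeError without needing the real package, data, or a GPU.


class RawData:
    """Plays the role of an unprocessed AnnData object: no collator_fn."""

    def __init__(self, rows):
        self.rows = rows


class ProcessedDataset:
    """Plays the role of the dataset object returned by process_data."""

    def __init__(self, rows):
        self.rows = rows

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        return self.rows[idx]

    def collator_fn(self, batch):
        # Combine individual items into one batch (here: a plain list).
        return list(batch)


def process_data(raw):
    # Wrap the raw object in a dataset that exposes collator_fn.
    return ProcessedDataset(raw.rows)


def get_embeddings(dataset, batch_size=2):
    # Same attribute access as in the traceback: this line raises
    # AttributeError when the raw object is passed in by mistake.
    collate = dataset.collator_fn
    batches = []
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        batches.append(collate([dataset[i] for i in range(start, stop)]))
    return batches


raw = RawData([1, 2, 3])
processed = process_data(raw)

# Passing the raw object fails, just like passing adata.
try:
    get_embeddings(raw)
    error_message = None
except AttributeError as err:
    error_message = str(err)

# Passing the processed dataset works, just like passing train_data.
result = get_embeddings(processed)
print(error_message)
print(result)
```

The point of the sketch is only that the object handed to get_embeddings must be the one returned by process_data, since that is where the collate function is attached.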

If you continue to face any issues or have any questions, please don't hesitate to ask. Here to help!

maxiallard commented 1 month ago

@JesusGF1 Closing for now. If it's not resolved, let us know and we can re-open it!

JesusGF1 commented 2 weeks ago

Thank you, it solved my error.

helicalAI / helical

AttributeError: 'AnnData' object has no attribute 'collator_fn' #87