Hello,
I am having trouble loading data and passing it through a trained model when the data contains categories that the model did not see at training time. Specifically, I train on certain tissues and want to use the model's predictions on other tissues. The data for these unseen tissues is stored in a separate file from the training data.
Passing this data to the model outputs:
INFO Received view of anndata, making copy.
INFO Input AnnData not setup with scvi-tools. attempting to transfer AnnData setup
And ends on:
ValueError: Category XXXX not found in source registry. Cannot transfer setup without extend_categories = True.
Where XXXX is a tissue that was absent from the training file.
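To see in advance which labels will trigger this error, the test labels can be compared against the categories seen at training time. Below is a minimal sketch in plain Python. `unseen_categories` is a hypothetical helper, not part of scvi-tools, and it assumes both sets of labels have already been pulled out as plain sequences (for instance from the tissue column of each AnnData's obs).

```python
def unseen_categories(train_labels, test_labels):
    """Return test labels absent from training, in order of first appearance."""
    known = set(train_labels)
    missing = []
    for label in test_labels:
        if label not in known and label not in missing:
            missing.append(label)
    return missing


# Illustrative tissue names only; in practice these would come from
# something like train_adata.obs["tissue"] and test_adata.obs["tissue"]
# (the column name "tissue" is an assumption).
train_tissues = ["liver", "lung", "liver", "kidney"]
test_tissues = ["lung", "heart", "brain", "heart"]

missing_tissues = unseen_categories(train_tissues, test_tissues)
print(missing_tissues)
```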
What would be the correct way to run a trained model on data with unseen categories? I cannot find any way to pass the extend_categories kwarg.
What I tried
After digging into the source code I imagine this would involve something like:
model.register_manager(model.adata_manager.transfer_fields(adata_target=test_adata, extend_categories=True))
But I cannot find how to make the model use this new manager.
For now, a workaround is to set the categories in the test data to a category that was present in the training data. For example, the tissue column in the test data can be set to the first tissue in the model's registry.
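A minimal sketch of this workaround in plain Python follows. The helper `remap_unseen` is hypothetical, and the list of known categories is assumed to have been read from the model's registry beforehand; how to obtain it depends on the scvi-tools version.

```python
def remap_unseen(labels, known_categories):
    """Replace every label not in known_categories with the first known one."""
    if not known_categories:
        raise ValueError("known_categories must not be empty")
    fallback = known_categories[0]
    known = set(known_categories)
    return [label if label in known else fallback for label in labels]


# Illustrative values; known_tissues stands in for the categories stored
# in the model's registry, test_tissues for the test data's tissue column.
known_tissues = ["kidney", "liver", "lung"]
test_tissues = ["lung", "heart", "brain"]

remapped = remap_unseen(test_tissues, known_tissues)
print(remapped)
# The result could then be written back to the test data, e.g.
# test_adata.obs["tissue"] = remapped  (column name is an assumption)
```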
However, this is quite an unsatisfactory solution, and there is surely a cleaner way of doing this.
Thank you!