Hi,
I am trying to construct an integrated dataset from 52 samples across 3 different studies. The cell types are already known. My goal is to use this integrated dataset as the reference in an anchor transfer workflow.
I ran into memory issues during SCT integration, so I integrated each study separately, split the integrated objects by cell type, and then integrated the split objects while keeping the cell types separate. Finally, I merged the resulting objects to form my final integrated reference.
However, this leaves my final merged object with 7 SCT models stored in it. If I set recompute.residuals = TRUE when finding transfer anchors, it doesn't know which model to use.
So my question is: how does IntegrateData merge the SCT model lists? How can I do the same for my merged object so that I can recompute the residuals?
I see in the source code that the SCT model that comes first in the sample tree during integration is used as the reference SCT model. So does it basically choose the model of the dataset with the most anchors?
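
For context, here is a minimal sketch of how the models stored in a merged object can be listed. It assumes a Seurat v4-style SCTAssay that keeps its models in the SCTModel.list slot and an assay named "SCT". The helper names and the object name merged.ref are placeholders for illustration, not part of Seurat.

```r
# Sketch only: the helper names below are hypothetical, and assay = "SCT"
# assumes the default assay name created by SCTransform.
library(methods)

# Return the list of SCT models stored in the given assay of an object.
get_sct_models <- function(object, assay = "SCT") {
  slot(object[[assay]], "SCTModel.list")
}

# Count how many SCT models the assay carries.
count_sct_models <- function(object, assay = "SCT") {
  length(get_sct_models(object, assay))
}

# Names of the stored models (NULL if the list is unnamed).
sct_model_names <- function(object, assay = "SCT") {
  names(get_sct_models(object, assay))
}
```

With a merged reference, count_sct_models(merged.ref) would be expected to return 7 in the situation described above, and sct_model_names(merged.ref) shows which models were carried over by the merge.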