Closed by joeyhuang401055 2 years ago
BTW, my question is similar to #61, and I understand that there's no need to run sam.preprocess_data() when loading Seurat-integrated datasets.
Hi Joey - this is still experimental and something I haven't tested thoroughly, but the latest version of SAM offers native batch correction using Harmony. There's a new parameter to sam.run called batch_key - set it to the sam.adata.obs column which contains your batch variable.
So your flow should be:
1) Concatenate all your (normalized and log-transformed) data into one AnnData (let's say it's called adata).
2) Load it into SAM: sam=SAM(counts=adata)
3) Run SAM: sam.run(batch_key="batch")
4) Let's say you did this for both species, sam1 and sam2 - input those into SAMAP, e.g. for human and mouse with species identifiers hu and mo:
sams={'hu': sam1, 'mo': sam2}
sm = SAMAP(sams, ...other_args)
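The flow above can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not tested code from the SAM authors: the helper names (concat_with_batch, run_sam_with_batch, build_samap) are hypothetical, the use of anndata.concat to build the batch column is one possible way to do step 1, and the remaining SAMAP arguments are left elided exactly as in the reply. Only SAM(counts=adata), sam.run(batch_key=...) and SAMAP(sams, ...) come from the thread.

```python
def concat_with_batch(adatas, batch_key="batch"):
    """Step 1: stack per-sample AnnData objects into one.

    `adatas` maps a sample label to an AnnData that is already
    normalized and log-transformed. The label of each sample is
    written to obs[batch_key] (assumption: anndata.concat is used here).
    """
    import anndata as ad  # imported lazily so the helpers can be defined without it

    return ad.concat(adatas, label=batch_key, index_unique="-")


def run_sam_with_batch(adata, batch_key="batch"):
    """Steps 2 and 3: load the data into SAM and run it with batch correction."""
    # Fail early if the batch column is missing from adata.obs.
    if batch_key not in adata.obs.columns:
        raise KeyError(f"adata.obs has no column named {batch_key!r}")

    from samalg import SAM

    sam = SAM(counts=adata)
    sam.run(batch_key=batch_key)  # experimental Harmony-based correction, per the reply
    return sam


def build_samap(sam_by_species, **samap_kwargs):
    """Step 4: hand the per-species SAM objects to SAMAP.

    `samap_kwargs` stands in for the "...other_args" elided in the reply.
    """
    from samap.mapping import SAMAP

    return SAMAP(sam_by_species, **samap_kwargs)
```

Usage would look like sam1 = run_sam_with_batch(concat_with_batch(human_samples)), the same for the second species, then sm = build_samap({'hu': sam1, 'mo': sam2}, ...other_args), where human_samples is a hypothetical dict of per-sample AnnData objects. Note that, per the reply, batch_key is a parameter of the SAM object's run, so the sketch passes it there and not to the SAMAP object.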
Hi Mr. Tarashansky,
Thank you so much for your quick reply! The native batch correction was pretty helpful!
Tzu-Yi Huang (Joey)
I am running the current version of SAMap (v1.0.15). I am trying to use the batch option with the command sm.run(batch_key="batch") but I get the error:
TypeError: run() got an unexpected keyword argument 'batch_key'
Has this option been removed from the current version?
Hi Mr. Tarashansky,
Thanks for developing SAMap. This is indeed a useful and surely impactful tool for studying Evo-Devo with scRNA-seq datasets. I am wondering how to apply SAMap to datasets with biological replicates. The data I am using contain 6 samples (2 technical replicates and 4 biological replicates) and therefore 6 matrices. Your tutorial suggests inputting the raw matrix into SAM and SAMap, but I noticed that directly inputting the 6 matrices results in a significant batch effect. So I would like to ask your opinion on the following:
Thanks in advance. I look forward to hearing your thoughts.
Tzu-Yi Huang