atarashansky / SAMap

SAMap: Mapping single-cell RNA sequencing datasets from evolutionarily distant organisms.
MIT License

Gene filtering parameter #139

Closed. dariotommasini closed this issue 10 months ago

dariotommasini commented 10 months ago

Hi @atarashansky,

I noticed in the paper that "genes expressed in greater than 96% of cells are filtered out." Is there a way to modify this?

Thanks! Dario

atarashansky commented 10 months ago

This is a preprocessing parameter hardcoded to 0.96 and passed into SAM.preprocess_data. See here: https://github.com/atarashansky/SAMap/blob/f2f1a5b4324f81dc9b7c65f98b980d447f66b629/samap/mapping.py#L115

You can either clone the repo to your machine, change the 0.96 to another value, and run pip install . in the root directory of the repo, or you can run SAM yourself and then pass the resulting SAM objects into SAMAP directly.
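
For readers unsure what this threshold means in practice, here is a rough illustration of an upper expression-fraction filter. It is only a sketch of the idea quoted from the paper, not SAM's actual code: the function name genes_below_threshold and the toy matrix are made up for this example, and SAM's own preprocessing may define "expressed" differently (for example, with a minimum expression cutoff).

```python
import numpy as np


def genes_below_threshold(counts, thresh_high=0.96):
    """Return a boolean mask of genes expressed in at most `thresh_high` of cells.

    counts: 2D array-like of shape (cells, genes).
    Hypothetical helper for illustration only; not part of SAM or SAMap.
    """
    counts = np.asarray(counts)
    # Fraction of cells in which each gene has a nonzero count.
    frac_expressing = (counts > 0).mean(axis=0)
    # Genes above the threshold are the ones that would be filtered out.
    return frac_expressing <= thresh_high


# Toy matrix: 4 cells (rows) x 3 genes (columns).
counts = np.array([
    [1, 0, 2],
    [3, 0, 1],
    [2, 5, 4],
    [1, 0, 0],
])

mask = genes_below_threshold(counts, thresh_high=0.96)
print(mask)  # [False  True  True]: gene 0 is expressed in 100% of cells
```

Raising the threshold keeps more near-ubiquitous genes; setting it to 1.0 keeps every gene in this toy example.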

dariotommasini commented 10 months ago

Thanks, this is exactly what I needed. I'll run SAM beforehand. It would just look like this, right?

sam.preprocess_data(thresh_high = 0.97)
sam.run()

atarashansky commented 10 months ago

Yeah, that's right!
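
Putting the thread together, the whole workflow might look roughly like the sketch below. It assumes the samalg and samap packages are installed and that SAMAP accepts a dictionary of already-processed SAM objects keyed by species ID, as the SAMap README describes. The helper name build_samap_with_custom_threshold, the species IDs, and the file paths in the usage comment are placeholders. SAMap's internal preprocess_data call may pass other arguments besides the 0.96 threshold, so it is worth checking the linked line in mapping.py and mirroring those arguments if you want otherwise identical preprocessing.

```python
def build_samap_with_custom_threshold(filenames, f_maps, thresh_high=0.97):
    """Run SAM on each dataset with a custom upper gene filter, then build SAMAP.

    filenames: dict mapping a species ID to the path of its .h5ad file.
    f_maps: directory containing the BLAST maps SAMap expects.
    Hypothetical helper for illustration; not part of SAMap itself.
    """
    # Imported inside the function so the sketch can be loaded
    # even where samalg / samap are not installed.
    from samalg import SAM
    from samap.mapping import SAMAP

    sams = {}
    for species_id, path in filenames.items():
        sam = SAM()
        sam.load_data(path)
        # Override the 0.96 default discussed in this thread.
        sam.preprocess_data(thresh_high=thresh_high)
        sam.run()
        sams[species_id] = sam

    # Pass the processed SAM objects instead of file paths.
    return SAMAP(sams, f_maps=f_maps)


# Example usage (placeholder IDs and paths):
# sm = build_samap_with_custom_threshold(
#     {"pl": "planarian.h5ad", "sc": "schistosome.h5ad"},
#     f_maps="maps/",
#     thresh_high=0.97,
# )
# sm.run()
```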