atarashansky / SAMap

SAMap: Mapping single-cell RNA sequencing datasets from evolutionarily distant organisms.
MIT License

Gene filtering parameter #139

Closed dariotommasini closed 6 months ago

dariotommasini commented 6 months ago

Hi @atarashansky ,

I noticed in the paper that "genes expressed in greater than 96% of cells are filtered out." Is there a way to modify this?

Thanks! Dario

atarashansky commented 6 months ago

This is a preprocessing parameter hardcoded to 0.96 and passed into SAM.preprocess_data. See here: https://github.com/atarashansky/SAMap/blob/f2f1a5b4324f81dc9b7c65f98b980d447f66b629/samap/mapping.py#L115
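To make the effect of that threshold concrete, here is a minimal pure-Python sketch of the filter as the paper describes it: genes detected in more than a given fraction of cells are dropped. The function name `filter_ubiquitous_genes` and the toy data are hypothetical and exist only for illustration. SAM's real `preprocess_data` works on sparse matrices and applies other filters too, so this is an approximation of the idea, not the library's implementation.

```python
def filter_ubiquitous_genes(counts, gene_names, thresh_high=0.96):
    """Return the genes detected in at most `thresh_high` of cells.

    counts: list of per-cell rows; each row holds one expression value per gene.
    Illustrative only: "detected" is assumed here to mean a value above zero.
    """
    n_cells = len(counts)
    kept = []
    for j, gene in enumerate(gene_names):
        # Count the cells in which this gene has non-zero expression.
        n_expressing = sum(1 for row in counts if row[j] > 0)
        # Keep the gene unless it is expressed in more than thresh_high of cells.
        if n_expressing / n_cells <= thresh_high:
            kept.append(gene)
    return kept


# Toy data: 4 cells x 2 genes. Gene "A" is on in every cell, gene "B" in half.
toy_counts = [
    [5, 0],
    [3, 2],
    [1, 0],
    [7, 4],
]
default_kept = filter_ubiquitous_genes(toy_counts, ["A", "B"])
relaxed_kept = filter_ubiquitous_genes(toy_counts, ["A", "B"], thresh_high=1.0)
```

With the default of 0.96, gene "A" (detected in 100% of cells) is removed. Raising the threshold to 1.0 keeps it.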

You can either clone the repo to your machine, change the 0.96 to another value, and run pip install . from the root directory of the repo, or you can run SAM yourself and then pass the resulting SAM objects into SAMAP directly.

dariotommasini commented 6 months ago

Thanks, this is exactly what I needed. I'll run SAM beforehand. It would just look like this, right?

sam.preprocess_data(thresh_high=0.97)
sam.run()

atarashansky commented 6 months ago

Yeah, that's right!
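For anyone landing here later, the second option from this thread might look roughly like the sketch below. Only `preprocess_data(thresh_high=...)` and `run()` are confirmed above. The rest is an assumption based on typical SAM/SAMap usage: `SAM()` plus `load_data(path)` for loading an `.h5ad` file, and `SAMAP` accepting a dict of SAM objects keyed by species ID together with an `f_maps` directory of BLAST maps. The helper names `build_sam` and `build_samap`, the file paths, and the species IDs are hypothetical.

```python
def build_sam(h5ad_path, thresh_high=0.97):
    """Load one dataset and run SAM with a custom upper expression threshold."""
    # Imported lazily so the sketch can be loaded without sam-algorithm installed.
    from samalg import SAM

    sam = SAM()
    sam.load_data(h5ad_path)  # assumed loader for .h5ad files
    # Genes expressed in more than `thresh_high` of cells are filtered out.
    sam.preprocess_data(thresh_high=thresh_high)
    sam.run()
    return sam


def build_samap(paths_by_species, f_maps, thresh_high=0.97):
    """Run SAM per species, then hand the SAM objects to SAMAP."""
    from samap.mapping import SAMAP

    sams = {
        species_id: build_sam(path, thresh_high)
        for species_id, path in paths_by_species.items()
    }
    # Assumption: SAMAP takes a dict of SAM objects keyed by species ID.
    return SAMAP(sams, f_maps=f_maps)
```

A call would then be something like `sm = build_samap({"pl": "planarian.h5ad", "hy": "hydra.h5ad"}, f_maps="maps/")`, followed by whatever mapping step you normally run on the SAMAP object. Note that calling `preprocess_data` with only `thresh_high` leaves SAM's other preprocessing arguments at their defaults, which may not match the values SAMap passes internally, so it is worth comparing against the call in `mapping.py` linked above.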