Closed BrianLohman closed 4 years ago
Hi Brian,
What we have mostly been doing lately is labelling after some perturbation (protein degradation or drug treatment). For those analyses, the downstream step is mostly DE analysis, where we feed in the T>C read counts.
Now in Slamdunk, you have the option to count a read as a T>C read only if it carries >= 2 T>C conversions. In the Science paper we showed that this very effectively reduces noise from the assay.
Is that an option for you?
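To make the ">= 2 T>C conversions" rule concrete, here is a minimal Python sketch of the idea. It is not Slamdunk's implementation: it assumes an ungapped, plus-strand alignment given as two equal-length strings, and the function names and toy sequences are made up for illustration.

```python
def count_tc_conversions(ref: str, read: str) -> int:
    """Count positions where the reference has T and the read has C.

    Assumes an ungapped, plus-strand alignment (hypothetical simplification).
    """
    if len(ref) != len(read):
        raise ValueError("ref and read must have the same length")
    return sum(1 for r, q in zip(ref.upper(), read.upper()) if r == "T" and q == "C")


def is_tc_read(ref: str, read: str, min_conversions: int = 2) -> bool:
    """Call a read a T>C read only if it carries at least min_conversions."""
    return count_tc_conversions(ref, read) >= min_conversions


# Toy data, invented for this example.
ref = "ATTGCTTAGT"
reads = {
    "no_conversion": "ATTGCTTAGT",
    "one_conversion": "ACTGCTTAGT",
    "two_conversions": "ACTGCTCAGT",
}

# With the threshold at 2, a single (possibly spurious) conversion is ignored.
tc_reads = [name for name, seq in reads.items() if is_tc_read(ref, seq)]
print(tc_reads)
```

A single T>C mismatch can come from a sequencing error or a SNP, so requiring two per read trades some sensitivity for a lower background.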
Hi Tobias,
Thank you for your quick response. I will try your suggestion by adding -mts 2 to my slamdunk all command. I will let you know how this goes.
The sample I showed above is unlabeled (just a regular RNAseq library from a non-SlamSeq project), so I was concerned about background T>C conversion.
Cheers,
Brian
EDIT: adding -mts 2 returns:
slamdunk all: error: argument -t/--threads: invalid int value: 's'
and using the full param name --multiTCStringency returns:
slamdunk: error: unrecognized arguments: --multiTCStringency 2
The full command is:
slamdunk all \
./17699X5_R1_R2.fq \
-n 100 \
-t 12 \
-m \
-5 12 \
-rl 150 \
-mts 2 \
-o ./17699X5 \
-b ../../Mus_musculus.GRCm38.98.3primeUTR.bed \
-r ../../Mus_musculus.GRCm38.dna.primary_assembly.fa
Hi Brian,
Sorry, this is a fault on my end for being lazy about updating the documentation. The -mts parameter was replaced by -c, where you can specify the number of T>C conversions needed in a read for it to be counted as a TC read. So the equivalent of -mts 2 would now be -c 2.
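For anyone puzzled by why the removed -mts flag produced a complaint about -t/--threads: the error format looks like Python's argparse, which splits an unknown single-dash cluster into short options. The sketch below reproduces that behaviour with a stand-in parser. The parser definition is an assumption for illustration (only -m, -t and -c are modelled, and the long option names other than --threads are hypothetical); it is not Slamdunk's actual code.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """Raise instead of exiting so the error message can be inspected."""

    def error(self, message):
        raise ValueError(message)


def build_parser():
    # Stand-in for the "slamdunk all" parser; option set is illustrative only.
    parser = RaisingParser(prog="slamdunk all")
    parser.add_argument("-m", "--multimap", action="store_true")
    parser.add_argument("-t", "--threads", type=int, default=1)
    parser.add_argument("-c", "--conversion-threshold", type=int, default=1)
    return parser


def try_parse(argv):
    """Return the parsed namespace, or the error message as a string."""
    try:
        return build_parser().parse_args(argv)
    except ValueError as exc:
        return str(exc)


# "-mts" is read as "-m" followed by "-t" with the attached value "s".
print(try_parse(["-mts", "2"]))

# A long option the parser does not know is reported as unrecognized.
print(try_parse(["--multiTCStringency", "2"]))

# The replacement flag parses cleanly.
print(try_parse(["-m", "-c", "2"]))
```

So the first error does not mean the thread count was wrong; it only means -mts is no longer a defined option and was interpreted as a cluster of -m and -t.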
Hi Tobias,
As always, thanks for the fast reply. I added -c 2 to my call to slamdunk all, and the "false positive" rate (genes with TC counts > 0 in unlabeled samples) dropped to about 1%. I think this should do for now.
Thank you!
Cheers,
Brian
Sure thing!
Hi Tobias,
I have looked through the issues and a few people have asked about background levels, but only for very specific cases. I'd like to ask more generally: what kind of background TC counts do you expect? Is there a suggested way to deal with this background noise? For example, in #52, you suggested subtracting the TC counts from the control sample.
I have run samples from other projects through Slamdunk and I see about 20% of the genes with TC counts > 0. TC counts in these control samples can be as high as 2,700. I would have expected to get very close to 0 for all genes.
Some basic stats for an example sample are below:
Thank you for your help,
Brian
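The two quantities discussed in this question, the share of genes with TC counts > 0 in an unlabeled control and the control-subtracted TC counts mentioned for #52, can be sketched in a few lines of Python. This is only an illustration: the gene names and counts are invented, the helper names are hypothetical, and in practice the per-gene TC read counts would come from Slamdunk's count output.

```python
def fraction_with_background(tc_counts):
    """Fraction of genes whose TC read count is above zero."""
    if not tc_counts:
        return 0.0
    nonzero = sum(1 for count in tc_counts.values() if count > 0)
    return nonzero / len(tc_counts)


def subtract_background(labeled, control):
    """Subtract control TC counts per gene, never going below zero."""
    return {
        gene: max(0, count - control.get(gene, 0))
        for gene, count in labeled.items()
    }


# Invented per-gene TC read counts (not real data).
control = {"GeneA": 0, "GeneB": 3, "GeneC": 0, "GeneD": 0, "GeneE": 0}
labeled = {"GeneA": 40, "GeneB": 2, "GeneC": 15, "GeneD": 0, "GeneE": 7}

background_fraction = fraction_with_background(control)
corrected = subtract_background(labeled, control)

print(f"genes with background TC counts: {background_fraction:.0%}")
print(corrected)
```

Raw subtraction like this assumes the control and labeled libraries have comparable sequencing depth; if they do not, the counts would need to be normalised first.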