Closed ChengHsiangLu closed 1 year ago
Hi, thanks for giving ALLSorts a go!
In truth, I don't know. The suggested pipeline will ensure that your counts are processed in a similar way to the training data, hopefully limiting any batch effects. That said, if you can ensure that the same gene annotations are used (and thus have all features/genes available in your final counts), you might find it works well enough.
If any of the subtypes are to be impacted, it's likely the ploidy ones. However, if you're finding multiple subtypes being called per sample (let's say TCF3-PBX1, MEF2D, and ZNF384 all in the same sample), it's likely the counts are introducing some effects.
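As a rough illustration of the "all features/genes available" check, here is a minimal sketch in plain Python. It assumes you already have two lists of gene identifiers: the ones the classifier expects and the column names of your Salmon-derived gene-level counts. The function name `check_gene_coverage` and the example gene lists are hypothetical and not part of ALLSorts.

```python
def check_gene_coverage(expected_genes, counts_genes):
    """Compare the genes a model expects with the genes present in a counts matrix.

    Both arguments are iterables of gene identifiers (hypothetical inputs;
    how you obtain them depends on your own files).
    Returns (missing, extra) as sorted lists.
    """
    expected = set(expected_genes)
    present = set(counts_genes)
    missing = sorted(expected - present)  # expected by the model, absent from counts
    extra = sorted(present - expected)    # in counts, not expected by the model
    return missing, extra


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    expected = ["TCF3", "PBX1", "MEF2D", "ZNF384"]
    in_counts = ["TCF3", "PBX1", "ZNF384", "GAPDH"]
    missing, extra = check_gene_coverage(expected, in_counts)
    print("Missing from counts:", missing)
    print("Not expected by model:", extra)
```

If `missing` is non-empty, the two annotations probably differ (for example in naming or version), and that is worth resolving before trusting the predictions.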
Perhaps for the first few samples, try using both methods and compare.
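One way to make that comparison concrete is sketched below. It assumes you have collected the predicted subtypes for each sample from both runs into dictionaries mapping sample ID to a set of subtype labels. The function name `compare_calls`, the sample IDs, and the dictionary layout are all hypothetical; this is not ALLSorts output parsing, only a comparison of calls you have already extracted.

```python
def compare_calls(star_calls, salmon_calls):
    """Compare per-sample subtype calls from two quantification workflows.

    Each argument maps sample ID -> iterable of predicted subtype labels
    (hypothetical structure, built by you from each run's results).
    Only samples present in both runs are compared.
    """
    report = {}
    for sample in sorted(set(star_calls) & set(salmon_calls)):
        star = set(star_calls[sample])
        salmon = set(salmon_calls[sample])
        report[sample] = {
            "agree": star == salmon,
            "star_only": sorted(star - salmon),
            "salmon_only": sorted(salmon - star),
            # Several subtypes in one sample may hint at count-related effects.
            "salmon_multi": len(salmon) > 1,
        }
    return report


if __name__ == "__main__":
    # Made-up calls, for illustration only.
    star = {"S1": {"TCF3-PBX1"}, "S2": {"ZNF384"}}
    salmon = {"S1": {"TCF3-PBX1"}, "S2": {"TCF3-PBX1", "MEF2D", "ZNF384"}}
    for sample, row in compare_calls(star, salmon).items():
        print(sample, row)
```

Samples where `agree` is false, or where `salmon_multi` is true while the STAR run gave a single call, are the ones to look at more closely.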
Let us know how you go :). Breon.
Hi Breon,
Thanks for your reply! I'll try a few samples first and then compare both methods.
Best,
Sam
Hello, ALLSorts is a great tool and I am excited to use it for our B-ALL cases. I have a question about usage. I would like to use Salmon to estimate gene expression instead of the STAR workflow you show on your GitHub. Would this be OK, or do you suggest using STAR only?
Thanks, Sam