Oshlack / ALLSorts

ALLSorts is a B-Cell Acute Lymphoblastic Leukemia (B-ALL) subtype classifier. From gene expression counts to over 18 subtypes.
MIT License

Use Salmon for gene expression measurement estimation instead of STAR #9

Closed. ChengHsiangLu closed this issue 1 year ago

ChengHsiangLu commented 2 years ago

Hello, ALLSorts is a great tool and I am excited to use it for our B-ALL cases. I have a question about usage. I would like to use Salmon for gene expression estimation instead of the STAR workflow you show on your GitHub. Would this be OK, or do you suggest using STAR only?

Thanks, Sam

breons commented 2 years ago

Hi, thanks for giving ALLSorts a go!

In truth I don't know. The suggested pipeline will ensure that your counts are processed in a similar way to the training data, hopefully limiting any batch effects. Though, if you can ensure that the same gene annotations are used (and thus have all features/genes available in your final counts), you might find it works well enough.
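As a rough illustration of the annotation check described above, here is a minimal sketch that compares the gene identifiers in two count tables. It assumes the Salmon estimates have already been summarised to gene level, and that you can list the gene IDs from each pipeline. The function name `check_gene_coverage` and the example identifiers are hypothetical; this is not part of ALLSorts.

```python
def check_gene_coverage(expected_genes, observed_genes):
    """Return (missing, extra) gene identifiers as sorted lists.

    expected_genes: IDs from the reference (e.g. STAR-based) counts.
    observed_genes: IDs from the alternative (e.g. Salmon-based) counts.
    Both inputs are assumptions about how you hold your data.
    """
    expected = set(expected_genes)
    observed = set(observed_genes)
    missing = sorted(expected - observed)  # expected but absent from observed
    extra = sorted(observed - expected)    # present only in observed
    return missing, extra


# Illustrative identifiers only, not real ALLSorts features.
star_genes = ["ENSG00000000003", "ENSG00000000005", "ENSG00000000419"]
salmon_genes = ["ENSG00000000003", "ENSG00000000419", "ENSG00000000457"]

missing, extra = check_gene_coverage(star_genes, salmon_genes)
print("missing from Salmon counts:", missing)
print("extra in Salmon counts:", extra)
```

If `missing` is non-empty, the two pipelines are probably not using the same annotation, which is worth resolving before comparing classifications.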

If any of the subtypes are to be impacted, it's likely the ploidy ones. However, if you're finding multiple subtypes being called per sample (let's say TCF3-PBX1, MEF2D, ZNF384 all in the same sample), it's likely the counts are introducing some effects.
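One way to spot the multiple-call symptom is to flag every sample with more than one distinct subtype. The sketch below assumes you have already collected the calls into a dictionary mapping sample name to a list of subtype labels; that structure and the name `samples_with_multiple_calls` are hypothetical and do not reflect the ALLSorts output format.

```python
def samples_with_multiple_calls(calls):
    """Return {sample: sorted distinct subtypes} for samples with >1 subtype.

    calls: dict mapping a sample name to a list of called subtype labels
    (an assumed structure, built by you from the classifier output).
    """
    flagged = {}
    for sample, labels in calls.items():
        distinct = sorted(set(labels))
        if len(distinct) > 1:
            flagged[sample] = distinct
    return flagged


# Illustrative calls only.
calls = {
    "S1": ["TCF3-PBX1", "MEF2D", "ZNF384"],
    "S2": ["ZNF384"],
    "S3": [],
}
flagged = samples_with_multiple_calls(calls)
print(flagged)
```

A large share of flagged samples under one quantification method, but not the other, would point towards the counts rather than the biology.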

Perhaps for the first few samples, try using both methods and compare.
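For the side-by-side comparison, a minimal sketch might look like the following. It assumes one subtype label per sample from each pipeline, held in two dictionaries; the function name `compare_calls`, the sample names, and the "Unclassified" label are all hypothetical.

```python
def compare_calls(star_calls, salmon_calls):
    """Compare per-sample subtype calls from two quantification pipelines.

    Both arguments are dicts mapping sample name -> subtype label
    (an assumed structure). Only samples present in both are compared.
    Returns (agreement_fraction, discordant), where discordant maps
    sample -> (star_label, salmon_label). agreement_fraction is None
    when there are no shared samples.
    """
    shared = sorted(set(star_calls) & set(salmon_calls))
    discordant = {
        s: (star_calls[s], salmon_calls[s])
        for s in shared
        if star_calls[s] != salmon_calls[s]
    }
    if not shared:
        return None, discordant
    agreement = (len(shared) - len(discordant)) / len(shared)
    return agreement, discordant


# Illustrative labels only.
star = {"S1": "ZNF384", "S2": "MEF2D", "S3": "TCF3-PBX1"}
salmon = {"S1": "ZNF384", "S2": "Unclassified", "S4": "MEF2D"}

agreement, discordant = compare_calls(star, salmon)
print("agreement:", agreement)
print("discordant:", discordant)
```

With only a handful of samples the agreement fraction is a coarse guide, so it is worth looking at each discordant sample individually.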

Let us know how you go :). Breon.

ChengHsiangLu commented 2 years ago

Hi Breon,

Thanks for your reply! I'll try a few samples first and then compare both methods.

Best,

Sam