sanger-pathogens / Bio-Tradis

A set of tools to analyse the output from TraDIS analyses
https://sanger-pathogens.github.io/Bio-Tradis/

Are the insert_site_plots normalised? #138

Open CSmith-bug opened 2 months ago

CSmith-bug commented 2 months ago

Apologies, not so much an issue but rather a question.

I was wondering if the insert_site_plots.gz are normalised based on total read counts?

If not, is it possible to normalise them so that, when viewing the plots in Artemis, a more direct visual comparison can be made between control and condition plots?

I appreciate any advice!

Thanks, Chris

lbarquist commented 2 months ago

Hi Chris,

No, the plots just give the raw counts per position so they can be used for downstream calculations.

If you were so inclined, you could just divide the counts at each position by the total read count to get what you're looking for. I'm not sure this will necessarily make things comparable between condition and control; that depends on the distributions of fitness effects and insertion site abundances, but it might be OK. Usually I just right-click on the plot and play with the scaling or min/max options. Honestly, though, I mostly find the plot files useful just for seeing if there's anything weird going on, or for looking at either essential genes/regions (where it's totally dropped out) or really extreme differences.
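The division described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not part of Bio-Tradis: it assumes the plot file is gzip-compressed text with one line per genome position and whitespace-separated counts on each line, it uses the sum of all counts in the file as a stand-in for the total read count, and the function name `normalise_plot` and its `scale` parameter (counts per million by default) are made up for this example.

```python
import gzip


def normalise_plot(in_path, out_path, scale=1_000_000):
    """Rescale every count in a gzipped plot file by the file's total count.

    Assumption: one line per position, whitespace-separated counts per line.
    Returns the total count that was used as the divisor.
    """
    # Read every line as a list of numbers; line order (= position) is preserved.
    with gzip.open(in_path, "rt") as handle:
        rows = [[float(value) for value in line.split()] for line in handle]

    # Assumption: the sum of all counts in the plot approximates the total read count.
    total = sum(sum(row) for row in rows)
    if total == 0:
        raise ValueError("plot contains no counts; cannot normalise")

    # Write the rescaled values in the same one-line-per-position layout.
    with gzip.open(out_path, "wt") as handle:
        for row in rows:
            handle.write(" ".join(f"{value * scale / total:.4f}" for value in row) + "\n")

    return total
```

Calling it once per sample, for example `normalise_plot("control.insert_site_plot.gz", "control.norm.plot.gz")` (hypothetical file names), would give control and condition plots on the same per-million scale. The output contains decimal values, so it is worth checking that your viewer reads them as expected, and the caveat above still applies: a common scale does not guarantee that the two conditions are biologically comparable.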

For plotting differences between conditions, you can look at some of the plots we've made here (https://journals.asm.org/doi/full/10.1128/msystems.00665-23) -- these are done in R with the output of tradis_gene_insert_sites and edgeR, might give you some ideas.

-Lars