Are pA values from slow5tools normalized?

chilampoon commented 2 years ago

Hi there, I am looking at normalization methods for ONT raw signals. I am not clear whether the picoamp values obtained from slow5tools seq_reads(pA=True) are already normalized. I've also tried the normalization in tombo using their function tombo_stats.normalize_raw_signal, which rescales the raw signal values so they are centered around 0.

It seems that using either the pA values or the normalized raw signal values didn't make much difference to my downstream analysis, but I am curious: if I'd like to normalize the squiggles of my dataset globally, which method would you suggest? Thanks.

Psy-Fer commented 2 years ago

Hello,

pA conversion is not the same as normalisation.

pA conversion is handled by the following:

scale = range / digitisation
pA = scale * (raw_signal + offset)

This gives you a positive float value, like the one you get from the pA conversion in any of the slow5-associated tools.
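As a minimal sketch of the conversion above in Python with NumPy: the function name `raw_to_pa` and the parameter name `range_` are hypothetical, chosen for illustration, and the three calibration values are assumed to come from the read record.

```python
import numpy as np

def raw_to_pa(raw_signal, digitisation, offset, range_):
    """Convert raw integer samples to picoamps (hypothetical helper).

    scale = range / digitisation
    pA    = scale * (raw_signal + offset)
    """
    raw = np.asarray(raw_signal, dtype=np.float64)
    scale = range_ / digitisation
    return scale * (raw + offset)

# Made-up calibration values, for illustration only.
pa = raw_to_pa([0, 100], digitisation=8192.0, offset=10.0, range_=1024.0)
print(pa)
```

Note that this only rescales the signal into physical units; it does not centre or standardise it.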

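Normalisation is usually one of two kinds. In the early days it was z-normalisation (subtract the mean, divide by the standard deviation), but now most tools use the median and the median absolute deviation (med-mad), which is less sensitive to outliers.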
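A rough sketch of the two kinds, applied per read, might look like this. The function names are hypothetical, and the `mad_scale` factor of 1.4826 (which makes the MAD comparable to a standard deviation for normally distributed data) is an assumption: some tools apply it and others do not, so check the tool you are comparing against.

```python
import numpy as np

def z_normalise(signal):
    """Subtract the mean and divide by the standard deviation."""
    x = np.asarray(signal, dtype=np.float64)
    return (x - x.mean()) / x.std()

def med_mad_normalise(signal, mad_scale=1.4826):
    """Subtract the median and divide by the (scaled) median absolute deviation.

    mad_scale is an assumed convention; pass 1.0 for the unscaled MAD.
    """
    x = np.asarray(signal, dtype=np.float64)
    med = np.median(x)
    mad = np.median(np.abs(x - med)) * mad_scale
    return (x - med) / mad

# A toy signal with one large outlier.
sig = [1.0, 2.0, 3.0, 4.0, 100.0]
print(z_normalise(sig))
print(med_mad_normalise(sig, mad_scale=1.0))
```

With the outlier present, the mean and standard deviation are both dragged upwards, while the median and MAD are set by the bulk of the samples.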

The supplementary plots in the SquiggleKit paper show how the choice of normalisation impacts downstream analyses.

So if you are doing global normalisation, first do the pA conversion, as this gets all the raw DAC values into the same range. Then, when you normalise, I'd use med-mad.
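One possible reading of "global" here is a single median and MAD computed over the pooled pA values of all reads. A sketch under that assumption follows; the function `global_med_mad` is hypothetical, and each read is assumed to be a dict with `signal`, `digitisation`, `offset` and `range` keys.

```python
import numpy as np

def global_med_mad(reads):
    """Convert every read to pA, then normalise with one pooled median and MAD.

    `reads` is assumed to be a list of dicts with the keys
    'signal', 'digitisation', 'offset' and 'range'.
    """
    pa_reads = []
    for r in reads:
        raw = np.asarray(r["signal"], dtype=np.float64)
        scale = r["range"] / r["digitisation"]
        pa_reads.append(scale * (raw + r["offset"]))

    # Pooling everything in memory is only practical for small datasets.
    pooled = np.concatenate(pa_reads)
    med = np.median(pooled)
    mad = np.median(np.abs(pooled - med))
    return [(pa - med) / mad for pa in pa_reads]

# Two toy reads with different calibration values.
reads = [
    {"signal": [0, 8, 16], "digitisation": 8.0, "offset": 0.0, "range": 8.0},
    {"signal": [0, 2, 4], "digitisation": 4.0, "offset": 2.0, "range": 8.0},
]
for norm in global_med_mad(reads):
    print(norm)
```

Doing the pA conversion first matters in this setting because reads with different offset and range values would otherwise be pooled on incompatible scales.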

I hope this answers your question. Let me know if I missed something or if you have any other questions.

James

chilampoon commented 2 years ago

I see, I'll do med-mad on pA values then, thank you so much James!

hasindu2008 / slow5tools

Are pA values from slow5tools normalized? #79