Open iromeo opened 1 year ago
@iromeo Could this relate to speed of the code? I looked at your solution and I think it looks fine (at least it would be no worse), but I wonder if it would be more robust to just use istringstream
and parse the number without any worries? It would incur the overhead of creating the object, copying the string, but the code already looks quite slow.
@iromeo I'll just pull and we can enhance this later if you think it's warranted.
@andrewdavidsmith I agree that is_number
is not optimal, but I believe that its optimizing will be minor optimization that will not affect the whole run time. I suppose that the rest part of the adjustment algorithm works much slower compared to is_number
(I mean cross correlation and liptak-stouffer test)
istringstream
AFAIU the is_number
check was introduced only in DNMTOOLS version and is missing in previous methpipe radmeth impl. The check is required because regression step could return NA instead of numbers. Thus we could just check the first char of string, if it is digit, then it is likely a valid number, otherwise some NA like string when we were unable to calculate p-value.
At the moment the whole adjustment step works pretty fast, compared to regression step or reads alignment. So as a user I don't see much sense in adjustment step optimization.
P.S: I could compare performance of several approaches on my WGBS samples.
@iromeo is there a reason this is still open or did we just forget to close it? I recall the problem was fixed, right?
I think it is ok to close it. I use methpipe with this fix (from my fork) and everything is ok.
As for performance issue - as I mentioned before the adjustment step is much faster compared to very slow regression step, so I don't think that anything is needed to be optimised here.
They’re still the open PR. I can try to merge it. The code has changed much, but it should be ok.On Jul 12, 2023, at 8:46 AM, Roman Chernyatchik @.***> wrote: I think it is ok to close it. I use methpipe with this fix (from my fork) and everything is ok. As for performance issue - as I mentioned before the adjustment step is much faster compared to very slow regression step, so I don't think that anything is needed to be optimised here.
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>
'methpipe' to 'dnmtools' code refactoring introduced a critical bug in
radmeth-adjust
. In 1.2.1 version & current master branch the adjust step replaces significant p-values written in scientific notation with '1.0' and thus hides best DMC and mark them as totally insignificant.Example: Fragment of DMC table after regression step:
After adjust step:
the bug is in
is_number
function, it isn't aware that number could have '-' and 'e' chars as it is scientific notation:Example DMC table attached. dmc.txt
I've implemented & tested a fix, will attach it as pull request