Closed loganylchen closed 5 months ago
Hi
Yeh, these default parameters are coming from nanopolish/scrappie. Do you require these options to be exposed for exploratory purposes or is it for use in a routine pipeline? If it is just for exploratory purposes, I can do a quick hacky implementation to expose those parameters in a separate branch for now. If it is for routine use, will have to spend some time to properly do it and some testing.
These defaults for R9.4 are perhaps not the best for R10.4 data as the signal is slightly different to R9.4. So tuning these parameters in theory could improve the alignments for R10.4 - I cannot find time to optimise these. Are you working on R10.4?
Hi @hasindu2008 ,
For now, it is just for some exploratory purposes. It would be very grateful if you help me expose the parameters in a separate branch.
I am now not working on the R10.4, but I will work on it if I find tuning these numbers could help alignments.
Best,
I think I will catch you up if I find specifying these parameters is necessary for some different scenarios.
Will something like _--edparam window_length1,window_length2,threshold1,threshold2,peakheight would do?
Sure. It should support both RNA and DNA, right?
Yes, if you pass --edparam option to eventalign, the defaults will be overridden by what the user provides. I have quickly implemented here. Have not tested so let me know if something is off. https://github.com/hasindu2008/f5c/tree/edparam
Usage: --edparam comma separated list of parameters (window_length1,window_length2,threshold1,threshold2,peak_height) for event detection
Hi @hasindu2008 ,
I've tried the branch, and it works well. However, I still encountered an issue, and need your help. As I specified the window_length1 to a smaller number, 3 or 5 for example, many reads will be labeled as qc failed
. I knew it was caused by the events_per_base, which indicated the quality of the alignment. But as I set a smaller window size, the event should be over-segmented, and will be more events that represented one base.
So I am asking if I could also specify this 5.0
in the arguments?
In the f5c.c
if (db->events_per_base[i] > 5.0) {
// events[0].clear();
// events[1].clear();
//free(db->event_align_pairs[i]);
db->read_stat_flag[i] |= FAILED_QUALITY_CHK;
return;
}
Best
Hi, try the newly added --max-events-base
I will close this issue for now. Feel free to reopen if you have any more questions/requests.
Hi there,
I have some questions about the event detection, which I knew was adopted from nanopolish/scrappie. I found the default parameters were different between RNA and DNA event detection.
I could quite understand the reasons: 1. RNA molecules translocate much slower than the DNA (larger window length), 2. RNA signals were more fluctuation (higher threshold and peak_height). Correct me if I were wrong.
I think the settings were quite reasonable. I raised the issue here just to ask for a feature in the "f5c eventalign" function, which I could specify these default parameters. I knew I could fork the repo and modify the default numbers and then compile to binary f5c to do the test, but I think exposing these numbers to the command line args will help me a lot, reducing manual works.
Best, Logan