Closed RisaWatanabe closed 2 days ago
Yes, there is, and its purpose is to avoid calling nucleosomes or MSPs whose starts or ends likely extend off of the Fiber. This is an adjustable parameter (-d, --distance-from-end
) in ft add-nucleosomes
:
ft add-nucleosomes --help
Add nucleosomes to a bam file with m6a predictions
Usage: ft add-nucleosomes [OPTIONS] [BAM] [OUT]
Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A calls [default: -]
[OUT] Output bam file with nucleosome calls [default: -]
Options:
-n, --nucleosome-length <NUCLEOSOME_LENGTH> Minium nucleosome length [default: 75]
-c, --combined-nucleosome-length <COMBINED_NUCLEOSOME_LENGTH> Minium nucleosome length when combining over a single m6A [default: 100]
--min-distance-added <MIN_DISTANCE_ADDED> Minium distance needed to add to an already existing nuc by crossing an m6a [default: 25]
-d, --distance-from-end <DISTANCE_FROM_END> Minimum distance from the end of a fiber to call a nucleosome or MSP [default: 45]
-h, --help Print help
-V, --version Print version
BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-x, --ftx <FILTER_EXPRESSION> Filtering expression to use for filtering records Example: filter to nucleosomes with lengths greater than 150 bp -x "len(nuc)>150" Example: filter to msps with lengths between 30
and 49 bp -x "len(msp)=30:50" Example: combine 2+ filter expressions -x "len(nuc)<150,len(msp)=30:50" Filtering expressions support len() and qual() functions over msp, nuc, m6a,
cpg
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [env: FT_MIN_ML_SCORE=] [default: 125]
Global-Options:
-t, --threads <THREADS> Threads [default: 8]
Debug-Options:
-v, --verbose... Logging level [-v: Info, -vv: Debug, -vvv: Trace]
--quiet Turn off all logging
Thank you Mitchel for the prompt answer. I ran 'ft predict-m6a' with default parameters to call NUCs and MSPs. The default value for the --distance-from-end parameter is 45bp, but in my previous example, both the last NUC and MSP were located more than 1kb from the end of the fiber. Is this behavior expected? Could it be that a NUC, which was called after the last MSP, was extending off the end and was thus removed from the output?
Yes, this situation can happen if, after I have predicted the positions of MSPs or NUCs, if they either start or end in the last 45bp, they are removed. This is because, in this situation, we believe there is a relatively high probability we have not observed the actual start or end.
Hi, I noticed that the MSPs and NUCs are not called at the edge of fibers. Below is an example of such fiber.
Is there a specific rule in fibertools-rs to not call any MSP/NUC at the edge of fibers?
Thanks.