ninavie closed this issue 1 year ago
The mappings.bam files were created with Dorado. To perform signal analyses, the --emit-moves option must be provided. For reference-anchored plots the reads must also be mapped, so the --reference argument should be passed to Dorado. If you map reads after basecalling, special instructions are needed (see the data preparation section of the README).
The ref_regions.bed file lists the regions of the reference at which you would like to plot signal. Note that these should be quite short to produce useful plots. I'd suggest starting with 10-30 bases.
The mod_gt.bed file contains the ground truth locations of modified bases in the mod_mappings.bam file. If you do not have a ground truth, this file can be omitted.
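As a rough illustration of the "short regions" advice, here is a minimal Python sketch that writes one small window around a reference position of interest. The helper names (`region_around`, `write_bed`) are hypothetical, and the six-column layout with the strand in the last column is an assumption; compare against the example ref_regions.bed in tests/data before relying on it.

```python
def region_around(contig, pos, flank=10, strand="+"):
    """Return one BED line covering `pos` plus `flank` bases on each side.

    `pos` is a 0-based reference position. BED intervals are 0-based and
    half-open, so the end coordinate is exclusive.
    Assumption: a six-column layout (chrom, start, end, name, score, strand).
    """
    if pos < 0 or flank < 0:
        raise ValueError("pos and flank must be non-negative")
    start = max(0, pos - flank)   # clamp at the start of the contig
    end = pos + flank + 1         # exclusive end
    name = f"{contig}_{pos}"
    return "\t".join([contig, str(start), str(end), name, "0", strand])


def write_bed(path, lines):
    """Write BED lines to `path`, one region per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")


# Example usage (hypothetical contig name and position):
# write_bed("ref_regions.bed", [region_around("contig1", 1000, flank=10)])
```

With flank=10 each region spans 21 bases, which sits inside the suggested 10-30 base range.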
If you could provide the specific errors, that would be very helpful in resolving the issues.
Hi @marcus1487 Thanks for your help!
This worked for us: the analysis plots now work within the notebook metrics_api.ipynb. In case somebody else faces the same issue, here is our pipeline for processing the input data (in addition to the .pod5 files):
dorado basecaller <model> <.pod5 directory> --emit-moves --reference <reference.fasta> > <output .bam>
samtools sort <basecalled_mapped.bam> -o <sorted.bam>
samtools index <sorted.bam>
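For anyone who wants to script these three steps, here is a minimal Python sketch that builds the same commands and runs them in order. The function names (`build_commands`, `run_pipeline`) and the file-name parameters are hypothetical; the only flags used are the ones shown in the commands above, and it assumes dorado and samtools are on the PATH.

```python
import subprocess


def build_commands(model, pod5_dir, reference, unsorted_bam, sorted_bam):
    """Return the three command lines as argument lists, in run order."""
    basecall = [
        "dorado", "basecaller", model, pod5_dir,
        "--emit-moves",            # keep the move table for signal analysis
        "--reference", reference,  # map reads during basecalling
    ]
    sort = ["samtools", "sort", "-o", sorted_bam, unsorted_bam]
    index = ["samtools", "index", sorted_bam]
    return [basecall, sort, index]


def run_pipeline(model, pod5_dir, reference, unsorted_bam, sorted_bam):
    """Run basecalling, sorting and indexing; stop at the first failure."""
    basecall, sort, index = build_commands(
        model, pod5_dir, reference, unsorted_bam, sorted_bam
    )
    # dorado writes the BAM to stdout, so redirect it into a file.
    with open(unsorted_bam, "wb") as out:
        subprocess.run(basecall, stdout=out, check=True)
    subprocess.run(sort, check=True)
    subprocess.run(index, check=True)


# Example usage (placeholder paths):
# run_pipeline("<model>", "pod5_dir/", "reference.fasta",
#              "basecalled_mapped.bam", "sorted.bam")
```

Keeping command construction separate from execution makes it easy to print and inspect the commands before launching a long basecalling run.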
Hi,
We are interested in raw signal analysis with Remora, specifically the command analyze plot ref_region. Because we are new to handling this type of data, we first ran the command with the provided data (tests/data). In this case everything worked fine. After switching to our own data we ran into multiple problems and errors.
We are not sure whether our preprocessed data fits the needs of this kind of analysis. Could you explain how to generate the following files correctly: can_mappings.bam, mod_mappings.bam, ref_regions.bed, mod_gt.bed?
Thanks for your help!