WGLab / DeepMod2

DNA 5mC methylation detection from Dorado or Guppy basecalled Oxford Nanopore reads
MIT License

some output files are missing #27

Closed sachingadakh closed 1 month ago

sachingadakh commented 2 months ago

Hello, I performed RRMS adaptive sampling and have rather low-coverage output in pod5 format, which I have aligned to the reference genome mentioned in your pipeline. However, when I run "deepmod2 detect" to call modifications, two of the four expected output files, output.per_site and output.per_site.aggregated, are not produced. I am running the following command:

python deepmod2 detect --bam sarcoma_tumour_P01_basecalled_aligned.bam --input sarcoma_tumour_PO1.pod5 --model bilstm_r10.4.1_5khz_v4.3 --file_type pod5 --seq_type dna --threads 30 --ref ../hg38.fa --output deepmode2/

This command produces only output.bam and output.per_read, not output.per_site and output.per_site.aggregated. Let me know what other information I should provide. I would be grateful for any insights to resolve this issue. Thank you.

umahsn commented 1 month ago

Hi, are you using the --skip_per_site parameter when running DeepMod2? That would skip the creation of these files. Can you check the "args" file in the output folder to see the exact command used by DeepMod2 for that particular run, and also check whether "skip_per_site: False" is written in the args file?

You can also use python deepmod2 merge to regenerate the output.per_site and output.per_site.aggregated files from output.per_read file if you do not want to rerun the detect module.
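
The args check described above can be scripted. The following is a minimal sketch, assuming the args file stores one "key: value" pair per line (the thread only confirms that a line reading "skip_per_site: False" appears in it); the helper names read_args_file and per_site_was_skipped are hypothetical and not part of DeepMod2.

```python
from pathlib import Path


def read_args_file(path):
    """Parse 'key: value' lines into a dict of strings.

    The one-pair-per-line layout is an assumption, not a documented format.
    """
    settings = {}
    for line in Path(path).read_text().splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon at all
            settings[key.strip()] = value.strip()
    return settings


def per_site_was_skipped(settings):
    """Return True only if the run recorded skip_per_site as True."""
    return settings.get("skip_per_site", "False") == "True"
```

For example, per_site_was_skipped(read_args_file("deepmode2/args")) would report whether per-site output was disabled for the run written to that folder, assuming the args file sits directly inside it.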

sachingadakh commented 1 month ago

Hello, Thank you for responding. I checked the args text file, and it does contain "skip_per_site: False", which is the default behavior. As you suggested, I then ran the following merge command to regenerate the output.per_site and output.per_site.aggregated files from output.per_read, but the resulting output.per_site and output.per_site.aggregated files are empty, and I received the following error:

command: python /home/rakieta/DeepMod2/deepmod2 merge --input output.per_read --output output.per_site

Starting DeepMod2.
Starting Per Site Methylation Detection.
Reading 1 files.
100%|██████████| 1/1 [00:16<00:00, 16.92s/it]
2024-08-06 14:59:09.761821: Writing Per Site Methylation Detection.
Traceback (most recent call last):
  File "/home/rakieta/DeepMod2/deepmod2", line 123, in <module>
    site_pred_file=utils.get_per_site(params, input_list)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rakieta/DeepMod2/src/utils.py", line 414, in get_per_site
    agg_stats, fwd_stats, rev_stats=get_stats_string(chrom, pos, is_ref_cpg, cpg)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 3, got 2)

I checked the methylation score column for a range of values indicating methylation status, and several reads do have methylated sites, which I also confirmed by viewing the output.sort.bam file in IGV. I think this could be caused by the low coverage of my sample. Is there a way to produce per-site output by lowering the coverage parameter used by deepmod2? Or is it not a coverage issue but something else? Let me know what other information I should provide. Thank you.

umahsn commented 1 month ago

Hi,

I fixed some bugs in the merge command at least, so that should work now. If you want per site outputs for both aggregated CpGs and stranded CpGs, use:

python /home/rakieta/DeepMod2/deepmod2 merge --input output.per_read --output output.per_site --cpg_output

If you want only the stranded pileups, use the above command without --cpg_output.
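
For context, the ValueError in the traceback earlier in this thread is the standard Python error raised when an assignment unpacks more names than the right-hand side supplies. The sketch below reproduces that error class in isolation; get_stats_two is a hypothetical stand-in, and nothing here shows what the actual bug in DeepMod2 was, since the thread does not say.

```python
def get_stats_two():
    """Hypothetical helper that returns two values where three are expected."""
    return "agg", "fwd"


def unpack_three():
    """Unpack three names from a two-value return and report the error text."""
    try:
        agg_stats, fwd_stats, rev_stats = get_stats_two()
    except ValueError as err:
        # Python raises ValueError when the tuple is shorter than the targets.
        return str(err)
    return "ok"


message = unpack_three()
print(message)
```

The printed message has the same wording as the one in the traceback, which suggests one code path returned fewer values than the caller unpacked, whatever the underlying reason.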

sachingadakh commented 1 month ago

Hello, Thank you for the response. I ran the command as you suggested, but with --cpg_output I received the following error:

usage: deepmod2 [-h] [--print_models] {detect,merge} ...
deepmod2: error: unrecognized arguments: --cpg_output

Without --cpg_output, I received the same error as earlier:

python /home/rakieta/DeepMod2/deepmod2 merge --input output.per_read --output output.per_site
2024-08-11 14:15:35.176286: Starting DeepMod2.
2024-08-11 14:15:35.176369: Starting Per Site Methylation Detection.
2024-08-11 14:15:35.176377: Reading 1 files.
100%|██████████| 1/1 [00:17<00:00, 17.22s/it]
2024-08-11 14:15:52.404593: Writing Per Site Methylation Detection.
Traceback (most recent call last):
  File "/home/rakieta/DeepMod2/deepmod2", line 123, in <module>
    site_pred_file=utils.get_per_site(params, input_list)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rakieta/DeepMod2/src/utils.py", line 414, in get_per_site
    agg_stats, fwd_stats, rev_stats=get_stats_string(chrom, pos, is_ref_cpg, cpg)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 3, got 2)

umahsn commented 1 month ago

Please pull the latest code with the updated parameter using git pull inside the repository folder.
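
The "unrecognized arguments" message is what Python's argparse reports when a flag is not defined in the code that is actually running, which is consistent with a checkout that predates the new parameter. The following toy parser is only a sketch of that behavior; build_parser and the "toy" program name are hypothetical and do not reflect DeepMod2's real parser definition.

```python
import argparse


def build_parser(with_cpg_output):
    """Build a toy CLI with a 'merge' subcommand (not DeepMod2's real parser)."""
    parser = argparse.ArgumentParser(prog="toy")
    sub = parser.add_subparsers(dest="command")
    merge = sub.add_parser("merge")
    merge.add_argument("--input")
    merge.add_argument("--output")
    if with_cpg_output:
        # Only the "updated" parser knows about this flag.
        merge.add_argument("--cpg_output", action="store_true")
    return parser


argv = ["merge", "--input", "output.per_read",
        "--output", "output.per_site", "--cpg_output"]

# parse_known_args returns unknown flags instead of exiting with an error.
old_args, old_leftover = build_parser(False).parse_known_args(argv)
new_args, new_leftover = build_parser(True).parse_known_args(argv)

print(old_leftover)  # flags the older parser does not recognize
print(new_leftover)  # empty once the flag is defined
```

With parse_args instead of parse_known_args, the older parser would exit with a usage message and an "unrecognized arguments" error of the kind quoted above.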

sachingadakh commented 1 month ago

Oh, sorry, it didn't occur to me to do that. I will do it tomorrow at the earliest and get back to you. Thank you.

sachingadakh commented 1 month ago

Hello, The issue is resolved. Thank you very much. Sachin