epi2me-labs / wf-human-variation

Sniffles failing to complete with no error #192

Open parkerpayne opened 2 weeks ago

parkerpayne commented 2 weeks ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

v2.2.3

Workflow Execution

EPI2ME Desktop (Local)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

nextflow run epi2me-labs/wf-human-variation --out_dir {output_directory}/output -w {output_directory}/workspace -profile standard --snp --sv --str --cnv --bam {input_file} --ref {reference_file} --bam_min_coverage 0.01 --snp_min_af 0.25 --indel_min_af 0.25 --min_cov 10 --min_qual 10 --sex=XY --sample_name {run_name} --clair3_model_path {clair3_model_path} --depth_intervals --phased --threads {threads} --ubam_map_threads {threads} --ubam_sort_threads {threads} --ubam_bam2fq_threads {threads} --disable_ping

Workflow Execution - CLI Execution Profile

None

What happened?

I first encountered this issue when running the workflow after updating from 1.8.1 to the newest version (2.2.3). The process sv:variantCall:sniffles2 starts but never makes any progress; the workflow never exits and hangs on this step indefinitely. In an attempt to fix it, I reinstalled Ubuntu, even though the workflow runs in a container. After reinstalling, it worked for a little while before the behavior returned. I let it run overnight, so it was stuck on the sniffles process for about 21 hours. There are no errors, so I am not sure where to begin diagnosing this. Demo data works without issue.

Relevant log output

task_id hash    native_id   name    status  exit    submit  duration    realtime    %cpu    peak_rss    peak_vmem   rchar   wchar
6   9c/8f556e   3996143 ingress:checkBamHeaders (1) COMPLETED   0   2024-06-07 12:57:08.387 1.7s    222ms   84.2%   10.3 MB 14.3 MB 3.6 MB  1.1 MB
9   43/939aeb   3995987 sv:runReport:getParams  COMPLETED   0   2024-06-07 12:57:08.368 1.9s    2ms 123.1%  0   0   64.2 KB 3.5 KB
4   0d/37a3af   3995857 str:getParams   COMPLETED   0   2024-06-07 12:57:08.343 2s  3ms 84.2%   0   0   64.1 KB 3.4 KB
7   2b/24806c   3995960 cnv_spectre:getParams   COMPLETED   0   2024-06-07 12:57:08.366 2.1s    2ms 0.0%    0   0   67.2 KB 3.4 KB
2   66/bafaca   3996024 report_snp:getParams    COMPLETED   0   2024-06-07 12:57:08.372 2.1s    1ms 0.0%    0   0   64.2 KB 3.5 KB
...
1301    7e/4f54d0   2913213 cnv_spectre:annotate_vcf (1)    COMPLETED   0   2024-06-07 14:13:36.543 23.4s   22.7s   266.0%  4.7 GB  10.3 GB 322.6 MB    203.5 KB
1307    4c/f56bfd   2917156 output_cnv (2)  COMPLETED   0   2024-06-07 14:13:59.990 797ms   1ms 0.0%    0   0   60.8 KB 233 B
1306    11/9d1a23   2917162 output_cnv (1)  COMPLETED   0   2024-06-07 14:13:59.992 882ms   1ms 0.0%    0   0   60.8 KB 221 B
1305    71/0fb2f8   2915700 str:make_report (1) COMPLETED   0   2024-06-07 14:13:45.591 26.5s   25.9s   108.1%  1.1 GB  5.2 GB  1 GB    29.5 MB
1308    b8/2c1584   2918093 output_str (3)  COMPLETED   0   2024-06-07 14:14:12.102 733ms   5ms 116.4%  0   0   60.8 KB 237 B
1309    c5/0f59c0   2918089 output_str (4)  COMPLETED   0   2024-06-07 14:14:12.101 825ms   6ms 101.6%  0   0   60.8 KB 225 B
1294    2b/53c196   2909130 snp:cat_haplotagged_contigs (1) COMPLETED   0   2024-06-07 14:13:25.394 12m 51s 12m 50s 146.5%  4.7 GB  5.4 GB  169.8 GB    130.5 GB
1310    7a/696f54   2927417 sv:variantCall:sniffles2 (1)    FAILED  129 2024-06-07 14:26:16.653 21h 24m 11s 21h 24m 11s -   -   -   -   -
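
For reference, the exit value 129 on the last row of the trace can be read with the common shell convention that a status above 128 means the process was terminated by signal number (status - 128). The sketch below decodes a status that way. Whether the 129 here really came from a signal (for example, the run being interrupted after hanging) rather than from the tool itself is an assumption the thread does not confirm, and the function name `describe_exit_status` is hypothetical.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the shell convention that values
    above 128 mean 'terminated by signal (status - 128)'."""
    if status == 0:
        return "success"
    if status > 128:
        signum = status - 128
        try:
            # Signal names are platform dependent; this lookup uses the
            # numbering of the machine the script runs on.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with error code {status}"


if __name__ == "__main__":
    # 129 is the exit value reported for sv:variantCall:sniffles2 in the trace.
    print(describe_exit_status(129))
```

On Linux, signal 1 is SIGHUP, so a status of 129 would be consistent with the task being hung up or interrupted rather than failing on its own.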

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

SamStudio8 commented 2 weeks ago

Hi @parkerpayne, do you have the .nextflow.log for this run? It might provide some more clues for us to take a look.

parkerpayne commented 2 weeks ago

Not currently. I checked all the logs I had and none of them were from that run. I'm not sure whether that means one wasn't produced or I somehow deleted it. I am running it again to try to produce one, and I will update you once I find out.

SamStudio8 commented 2 weeks ago

@parkerpayne The .nextflow.log is written as the workflow progresses, so it should exist. It is renamed incrementally to .nextflow.log.1 ... .nextflow.log.9 each time a Nextflow workflow is started in the same directory, so you might have a historical copy.
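
A quick way to check for the historical copies mentioned above is to list every `.nextflow.log*` file in the launch directory, newest first. This is a minimal sketch, assuming the rotated files sit next to `.nextflow.log` in the directory where `nextflow run` was started; the function name `list_nextflow_logs` is hypothetical.

```python
from pathlib import Path


def list_nextflow_logs(launch_dir="."):
    """Return .nextflow.log and any rotated copies, newest first."""
    logs = [p for p in Path(launch_dir).glob(".nextflow.log*") if p.is_file()]
    # Sort by modification time so the most recent run comes first.
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for log in list_nextflow_logs():
        print(log, log.stat().st_size, "bytes")
```

Comparing the modification times against the submit times in the trace should show which file, if any, belongs to the run that hung.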

parkerpayne commented 2 weeks ago

Interestingly, after I started the workflow a few times, it only began producing a .nextflow.log file once I had deleted all the other log files. I had about 6 in the directory and it made no log files until all were deleted. I will provide it once it reaches the sniffles process.

parkerpayne commented 2 weeks ago

After running it twice, I am unable to replicate the issue. I can create a new issue if the problem resurfaces. Thanks anyways!

parkerpayne commented 2 weeks ago

The issue has reappeared on a different computer, so I now have the nextflow log file. Behavior is exactly the same as before, stuck on [62/c3fa1d] process > sv:variantCall:sniffles2 (1) [ 0%] 0 of 1

.nextflow.log

SamStudio8 commented 2 weeks ago

@parkerpayne Thanks for the log - can you provide the contents of /home/grid/polarPipelineWork/123/workspace/62/c3fa1d1167848f2cee9aa6121b1e8e/.command.err and /home/grid/polarPipelineWork/123/workspace/62/c3fa1d1167848f2cee9aa6121b1e8e/.command.out?
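
The short hash shown in the progress line (62/c3fa1d) is a prefix of the task's directory under the work directory, which is how the full path above is derived. The sketch below resolves such a prefix and reads the two requested files. It assumes the work directory passed with `-w` is readable from where the script runs; the helper names `find_task_dirs` and `read_task_output` are hypothetical.

```python
from pathlib import Path


def find_task_dirs(work_dir, short_hash):
    """Return task directories under work_dir matching a hash prefix like '62/c3fa1d'."""
    prefix, rest = short_hash.split("/", 1)
    parent = Path(work_dir) / prefix
    if not parent.is_dir():
        return []
    return sorted(p for p in parent.iterdir()
                  if p.is_dir() and p.name.startswith(rest))


def read_task_output(task_dir, names=(".command.err", ".command.out")):
    """Read the requested files from a task directory; missing files map to None."""
    contents = {}
    for name in names:
        path = Path(task_dir) / name
        contents[name] = path.read_text(errors="replace") if path.is_file() else None
    return contents


if __name__ == "__main__":
    # Work directory and hash taken from the comment above.
    for task_dir in find_task_dirs("/home/grid/polarPipelineWork/123/workspace", "62/c3fa1d"):
        for name, text in read_task_output(task_dir).items():
            print(f"== {task_dir / name} ==")
            print(text if text else "(empty or missing)")
```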

parkerpayne commented 2 weeks ago

command.err.txt command.out.txt

These are the files. The .err file is empty.

SamStudio8 commented 2 weeks ago

@parkerpayne Thanks for those. This is curious and not something we have seen before. I'll open a ticket and poke around to see what could possibly cause this. In the meantime I don't have a better suggestion than using -resume to pick up the workflow and hope this can be avoided.

parkerpayne commented 2 weeks ago

Thank you for the help!