adamewing / methylartist

Tools for plotting methylation data in various ways
MIT License

Failure to run wgmeth #81

Open IanCodes opened 4 months ago

IanCodes commented 4 months ago

Hello,

I am trying to run methylartist wgmeth (v1.3.0, installed via conda) on Linux.

The program fails with the error shown below.

The data is from a Nanopore run, and the bam_pass reads were mapped using Dorado 0.7.0 to the human genome HG38 analysisSet from UCSC. The BAM includes modification data for m and h. The file is large (42G); could that pose a problem? I have been able to run 'region' and 'locus' using the BAM file below and two others of similar size.

I have tried various combinations of --motif (C and CG) and --mod (m or h separately). If both modifications are wanted, should I use '--mod m --mod h'?

Any thoughts on the cause of this?

Thank you. Ian

2024-06-12 09:29:04,737 starting methylartist with command: REDACTED_PATH/.conda/envs/methylartist/bin/methylartist wgmeth --primary_only -p 16 --motif CG --mod m --dss --bam GR3_20240326_BC5_bampass_folder_hg38as_070.bam --ref /mnt/data-sets/bcf/genomeIndexes/hg38_analysisSet/fasta/hg38.analysisSet.fa --fai /mnt/data-sets/bcf/genomeIndexes/hg38_analysisSet/fasta/hg38.analysisSet.fa.fai
2024-06-12 09:29:05,045 motif size 2 (CG)
2024-06-12 09:29:06,364 fetching mod types from GR3_20240326_BC5_bampass_folder_hg38as_070.bam, if this takes awhile MM/ML tags may be missing.
2024-06-12 09:29:06,604 found mods: h,m
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "REDACTED_PATH/.conda/envs/methylartist/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "REDACTED_PATH/.conda/envs/methylartist/bin/methylartist", line 1400, in get_meth_calls_wg
    seg_reads[index] = Read(index, cg_seg_start, stat, methstate, modname, phase=reads[index])
KeyError: '16624f9c-e1d2-4a49-8f71-c3b26ebfab0a'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "REDACTED_PATH/.conda/envs/methylartist/bin/methylartist", line 5755, in <module>
    main()
  File "REDACTED_PATH/.conda/envs/methylartist/bin/methylartist", line 5427, in main
    args.func(args)
  File "REDACTED_PATH/.conda/envs/methylartist/bin/methylartist", line 5327, in wgmeth
    seg_meth_table = res.get()
  File "REDACTED_PATH/.conda/envs/methylartist/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
KeyError: '16624f9c-e1d2-4a49-8f71-c3b26ebfab0a'
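
For anyone reading the two stacked tracebacks above: the first is the failure inside the worker (a dictionary lookup on the read name), and the second is the parent process re-raising it when it calls res.get(). Below is a minimal sketch of that pattern. The table and function names are hypothetical and are not methylartist's code, and it uses ThreadPool instead of a process pool to keep the example portable. A process Pool re-raises at get() in the same way and additionally attaches the RemoteTraceback text.

```python
from multiprocessing.pool import ThreadPool

# Hypothetical stand-in for a per-read lookup table (not methylartist's data).
phase_by_read = {"read-a": "1", "read-b": "2"}


def lookup_phase(read_name):
    # Raises KeyError when read_name was never added to the table.
    return phase_by_read[read_name]


def run_lookup(read_name):
    """Run the lookup in a pool worker and report what the caller sees."""
    with ThreadPool(2) as pool:
        res = pool.apply_async(lookup_phase, (read_name,))
        try:
            # The worker's exception is stored and re-raised here, in the caller.
            return ("ok", res.get())
        except KeyError as err:
            return ("KeyError", err.args[0])
```

So the KeyError only says that the read name was looked up in a table that did not contain it. It does not by itself say why the name was missing.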
IanCodes commented 4 months ago

Any thoughts on this @adamewing?

adamewing commented 4 months ago

Does specifying --primary_only help?

IanCodes commented 4 months ago

@adamewing I have just added your suggestion to the above command line, but get exactly the same error.

IanCodes commented 4 months ago

@adamewing I have tried using only the chr21 part of the full BAM file and it completed OK. The chr21 BAM is 726M, whereas the full file is 42G. Could it be a memory issue?

adamewing commented 4 months ago

If you run samtools view your.bam | grep 16624f9c-e1d2-4a49-8f71-c3b26ebfab0a, is the read on chr21?
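
The same check can be scripted so that it also reports whether each record is a primary, secondary, or supplementary alignment. This is a minimal sketch that reads SAM text lines (for example the output of samtools view); the function names are made up for illustration, and the flag bits are the standard ones from the SAM specification.

```python
from collections import Counter

# FLAG bits defined by the SAM specification.
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def classify(flag):
    """Return the alignment type encoded in a SAM FLAG value."""
    if flag & FLAG_UNMAPPED:
        return "unmapped"
    if flag & FLAG_SECONDARY:
        return "secondary"
    if flag & FLAG_SUPPLEMENTARY:
        return "supplementary"
    return "primary"


def summarize_read(sam_lines, read_name):
    """Count records for read_name per (contig, alignment type)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        # SAM columns: QNAME, FLAG, RNAME, ...
        if len(fields) < 3 or fields[0] != read_name:
            continue
        counts[(fields[2], classify(int(fields[1])))] += 1
    return counts
```

In a script fed by samtools view your.bam, calling summarize_read(sys.stdin, "16624f9c-e1d2-4a49-8f71-c3b26ebfab0a") would list each contig the read appears on together with the alignment type.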

IanCodes commented 4 months ago

It appears five times on chromosome 4 and once in chrUn_KI270334v1.

IanCodes commented 4 months ago

Just a nudge that I answered your question, @adamewing. Thank you.

Alan-Xu1 commented 2 months ago

Have there been any updates on this? I ran into the same problem. It also seems to have been brought up in #67. Thanks!

adamewing commented 1 month ago

Sorry for the long silence here. This is hopefully fixed in 1.3.1 as part of a related issue.

IanCodes commented 1 month ago

Thanks @adamewing, I will give it a try.