broadinstitute / longbow

Annotation and segmentation of MAS-seq data
https://broadinstitute.github.io/longbow/
BSD 3-Clause "New" or "Revised" License

processes issue #117

Open xc611 opened 2 years ago

xc611 commented 2 years ago

Requested 16 cores, #SBATCH --ntasks=16

cannot find Process-18 (?)

[INFO 2022-02-24 19:20:19 segment] Invoked via: longbow segment -t 16
[INFO 2022-02-24 19:20:19 annotate] Invoked via: longbow annotate -t 16 1395T.ccs.bam
[INFO 2022-02-24 19:20:19 annotate] Running with 16 worker subprocess(es)
[INFO 2022-02-24 19:20:19 segment] Running with 16 worker subprocess(es)
[INFO 2022-02-24 19:20:19 segment] Using simple splitting mode.
[INFO 2022-02-24 19:20:19 extract] Invoked via: longbow extract -o extracted.bam
[INFO 2022-02-24 19:20:19 extract] Writing extracted read segments to: extracted.bam
[INFO 2022-02-24 19:20:19 extract] Including 2 flanking bases.
[INFO 2022-02-24 19:20:19 annotate] Annotating 2492060 reads
Process Process-18:
Traceback (most recent call last):
  File "/mnt/projects/CCR-SF/active/Software/tools/Anaconda/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/mnt/projects/CCR-SF/active/Software/tools/Anaconda/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/projects/CCR-SF/active/Software/tools/longbow/longbow-0.5.21/src/longbow/annotate/command.py", line 271, in _write_thread_fn
    segments = bam_utils.collapse_annotations(ppath)
  File "/mnt/projects/CCR-SF/active/Software/tools/longbow/longbow-0.5.21/src/longbow/utils/bam_utils.py", line 292, in collapse_annotations
    for i, seg in enumerate(path):
TypeError: 'NoneType' object is not iterable
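
A note on the "Process-18" label: names of this form are what Python's multiprocessing module assigns by default to each Process object it creates, so the label most likely identifies which child process raised the exception rather than a resource that could not be found. In the two logs in this thread the number is the worker count plus two (16 workers and Process-18, 1 worker and Process-3). That pattern suggests the failing process is a helper process and not one of the workers, although this has not been confirmed against the longbow source. The sketch below only illustrates the default naming; `_noop` and `default_process_names` are hypothetical names and are not part of longbow.

```python
import multiprocessing


def _noop():
    """Placeholder target; the processes below are never started."""


def default_process_names(count):
    """Create `count` unstarted Process objects and return their default names.

    multiprocessing numbers Process objects in creation order, so the
    default names follow the pattern "Process-<n>".
    """
    procs = [multiprocessing.Process(target=_noop) for _ in range(count)]
    return [p.name for p in procs]


if __name__ == "__main__":
    # 16 workers plus two helper processes would give 18 names in total.
    for name in default_process_names(18):
        print(name)
```

If the names follow this pattern in longbow as well, the process to inspect is the one running `_write_thread_fn`, as the traceback indicates.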

kvg commented 2 years ago

Thank you for this report. Can you supply us with some further information about the array model you're using, and perhaps a small excerpt from the input BAM file that we can use for testing?

A-N-Other commented 2 years ago

Hi @kvg @jonn-smith! I'm attempting to get longbow running on a dataset we received today but have run into this same error. It appears to be linked to launching the spawned processes, and it occurs with any number of processes I pass to -t, e.g. here with -t 1:

conda create -n longbow python=3.7.9
conda activate longbow

# I found that ordered_set and editdistance are also required, but they aren't installed automatically with maslongbow
pip install maslongbow ordered_set editdistance

longbow annotate \
  -t 1 \
  -p reads/m64045e_220830_173130.hifi_reads.bam.pbi \
  -o longbow/m64045e_220830_173130.annotated.bam \
  reads/m64045e_220830_173130.hifi_reads.bam

... dies with ...

[INFO 2022-09-01 10:55:09 annotate] Invoked via: longbow annotate -t 1 -f -p reads/m64045e_220830_173130.hifi_reads.bam.pbi -o longbow/m64045e_220830_173130.annotated.bam reads/m64045e_220830_173130.hifi_reads.bam
[INFO 2022-09-01 10:55:09 annotate] Running with 1 worker subprocess(es)
[INFO 2022-09-01 10:55:09 annotate] Annotating 1248705 reads
Progress:   0%|                                                       | 0/1248705 [00:00<?, ? read/s]
Process Process-3:
Traceback (most recent call last):
  File ".../miniconda3/envs/longbow/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File ".../miniconda3/envs/longbow/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File ".../miniconda3/envs/longbow/lib/python3.7/site-packages/longbow/annotate/command.py", line 272, in _write_thread_fn
    segments = bam_utils.collapse_annotations(ppath)
  File ".../miniconda3/envs/longbow/lib/python3.7/site-packages/longbow/utils/bam_utils.py", line 292, in collapse_annotations
    for i, seg in enumerate(path):
TypeError: 'NoneType' object is not iterable
[INFO 2022-09-01 10:58:55 annotate] Annotated 0 reads with 0 total sections.
[INFO 2022-09-01 10:58:55 annotate] Done. Elapsed time: 225.71s. Overall processing rate: 0.00 reads/s.

I've subset that BAM down to only 10 reads, and it gives the same error. You can download it to play with here: https://www.dropbox.com/s/jhi0eexbmf2lh46/m64045e_220830_173130.hifi_reads.mini.bam?dl=0
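
For anyone reading the traceback: the final frame shows `enumerate(path)` being called while `path` is `None`, which always raises this exact TypeError. The sketch below reproduces that failure pattern in isolation and shows one possible defensive guard. It is not longbow's code: `collapse_labels` and `safe_collapse` are hypothetical stand-ins that only mimic the loop shape in the traceback, and a guard like this would only stop the crash, not explain why no path was produced for these reads.

```python
def collapse_labels(path):
    """Hypothetical stand-in for a collapse step: run-length encode a list
    of per-base labels into (label, start, end) tuples.

    Like the loop in the traceback, this iterates with enumerate(path) and
    therefore raises TypeError when path is None.
    """
    segments = []
    start = 0
    for i, label in enumerate(path):
        # Close the current run at the last element or when the label changes.
        if i + 1 == len(path) or path[i + 1] != label:
            segments.append((label, start, i))
            start = i + 1
    return segments


def safe_collapse(path):
    """Return an empty segment list when no path was produced.

    Assumption: treating a missing path as "no annotations" is acceptable
    to the caller; the real tool may need to report or skip such reads.
    """
    if path is None:
        return []
    return collapse_labels(path)


if __name__ == "__main__":
    print(safe_collapse(["A", "A", "B"]))
    print(safe_collapse(None))
    try:
        collapse_labels(None)
    except TypeError as err:
        print(f"unguarded call fails: {err}")
```

The more useful question for the maintainers is what upstream step returns `None` for these reads; the array model asked about earlier in the thread is one candidate, but nothing in this thread confirms it.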

jonn-smith commented 2 years ago

Hi @A-N-Other. Thanks for the info. I'll take a look in a bit.

In the meantime, can you try pulling the latest code from GitHub and setting up your environment with that codebase? In the past week I've put in a few fixes that haven't yet made it into our pip package. They won't address this issue, but you should have them to prevent other potential problems.

aaronwagen commented 2 years ago

Hi @jonn-smith, thanks for looking into this. I work with @A-N-Other and was just hoping to follow up to see if there is an update on this issue.

Many thanks

jonn-smith commented 2 years ago

Hi @A-N-Other and @aaronwagen. I haven't had a chance to circle back on this just yet. I'm getting some urgent analysis work done at the moment. As soon as I'm finished I'll jump back on this and run it to ground. Sorry for the delay.

jamesbayne commented 2 years ago

Hi @jonn-smith, another collaborator of @A-N-Other and @aaronwagen here. I appreciate that you've been working on a separate matter, but do you have any idea when you might get round to looking at this?

Thanks!

jonn-smith commented 2 years ago

@jamesbayne I've been away for two weeks and am just catching up on things. The analysis that I had to do is almost complete. I'll probably get a chance to investigate this in the next couple of weeks.