Vini2 / phables

🫧🧬 From fragmented assemblies to high-quality bacteriophage genomes
https://phables.readthedocs.io/
MIT License
67 stars 7 forks source link

[BUG] If a read non paired is found Phables crashed #18

Closed rdenise closed 1 year ago

rdenise commented 1 year ago

Describe the bug

If a read is not paired in the fastq file, phables v1.0.0 (installed with conda) will raise an error

To Reproduce

Steps to reproduce the behaviour, including the

  1. Command executed
phables run --input assembly_graphs/BSM001.gfa --reads test_set2/palaeofaeces/ --threads 20 --output phable_BSM001
  1. Error message
Traceback (most recent call last):
  File "/data/san/data1/users/remi/paleofeces_colab/phables.py", line 890, in <module>
    main()
  File "/data/san/data1/users/remi/paleofeces_colab/phables.py", line 168, in main
    junction_pe_coverage = get_junction_pe_coverage(bampath, output)
  File "/data/san/data1/users/remi/paleofeces_colab/phables_utils/coverage_utils.py", line 93, in get_junction_pe_coverage
    if read1.reference_name != read2.reference_name:
AttributeError: 'NoneType' object has no attribute 'reference_name'

Expected behavior

To have phables run all the way even if there is by mistake some non-paired reads.

If non-paired reads are not wanted for the analysis the phables should be able to ignore them

on the file phables_utils/coverage_utils.py you should change the function to handle this case:

def read_pair_generator(bam, region_string=None):
    """
    Generate read pairs in a BAM file or within a region string.
    Reads are added to read_dict until a pair is found.
    """
    read_dict = defaultdict(lambda: [None, None])

    for read in bam.fetch(region=region_string):
        if read.is_secondary or read.is_supplementary or not read.is_paired:
            continue
        qname = read.query_name

        if qname not in read_dict:
            if read.is_read1:
                read_dict[qname][0] = read
            else:
                read_dict[qname][1] = read
        else:
            if read.is_read1:
                yield read, read_dict[qname][1]
            else:
                yield read_dict[qname][0], read
            del read_dict[qname]

    return read_dict
Vini2 commented 1 year ago

Hello @rdenise,

Thank you for trying out Phables and posting this issue. I will fix it and add to the next release.

Vini2 commented 1 year ago

Closing issue after fixing in commit b7e4be09e70fcc5fdeb41b636f0ff9262dc21937.