marbl / verkko


5-untip -> get_original_coverage.py -> IndexError #229

Closed IsaacDiaz026 closed 4 months ago

IsaacDiaz026 commented 4 months ago

Hello, and thanks for the awesome workflow.

I am hitting an error at 5-untip, specifically in the get_original_coverage.py script that it runs. Here is the error:

.conda/envs/verkko/lib/verkko/scripts/get_original_coverage.py", line 92, in <module>
    path = parts[1].split(':')[0].replace('<', "\t<").replace('>', "\t>").strip().split('\t')
IndexError: list index out of range

I tracked the error back to this part of get_original_coverage.py:

with open(mapping_file) as f:
    for l in f:
        parts = l.strip().split('\t')
        assert parts[0] not in mapping
        path = parts[1].split(':')[0].replace('<', "\t<").replace('>', "\t>").strip().split('\t')
        left_clip = int(parts[1].split(':')[1])
        right_clip = int(parts[1].split(':')[2])
        mapping[parts[0]] = (path, left_clip, right_clip)

original_coverages = {}

Here mapping_file is 5-untip/combined-nodemap-1.txt, which looks like this:

head -n 5 combined-nodemap-1.txt
utig1-0 <100:0:0
utig1-1 >360:0:0
utig1-2 >361:0:0
utig1-3 <276:0:0
utig1-4 <475:0:0

However, a closer look at this file shows that some lines look like this:

utig1-615       >745<gapthree-7-len--1139-cov-1<1057:0:0

and

utig1-1353      <1867<unroll_1866_34<unroll_1866_33<unroll_1866_32<unroll_1866_31<unroll_1866_30<unroll_1866_29<unroll_1866_28<unroll_1866_27<unroll_1866_26<unroll_1866_25<unroll_1866_24<unroll_1866_23<unroll_1866_22<unroll_1866_21<unroll_1866_20<unroll_1866_19<unroll_1866_18<unroll_1866_17<unroll_1866_16<unroll_1866_15<unroll_1866_14<unroll_1866_13<unroll_1866_12<unroll_1866_11<unroll_1866_10<unroll_1866_9<unroll_1866_8<unroll_1866_7<unroll_1866_6<unroll_1866_5<unroll_1866_4<unroll_1866_3<unroll_1866_2<unroll_1866_1:0:0

Could these lines be causing the IndexError?

skoren commented 4 months ago

File entries like that are normal, so I don't think they are the issue. All of them have two tab-separated fields, so parts[1] should be OK.
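As a rough illustration of that point, here is a minimal sketch that applies the same splitting logic as the snippet quoted above to the sample entries. The helper name `parse_mapping_line` is hypothetical (it is not part of verkko), and the sketch assumes the two fields are separated by a tab, as the `split('\t')` call in the script implies.

```python
def parse_mapping_line(line):
    """Split one nodemap line the same way the quoted snippet does."""
    parts = line.strip().split('\t')
    # parts[1] is expected to look like "<path>:<left_clip>:<right_clip>"
    fields = parts[1].split(':')
    path = fields[0].replace('<', "\t<").replace('>', "\t>").strip().split('\t')
    return parts[0], path, int(fields[1]), int(fields[2])


# A multi-node entry parses without trouble: the path is split into its steps.
print(parse_mapping_line("utig1-615\t>745<gapthree-7-len--1139-cov-1<1057:0:0"))

# A line with no tab-separated second field is what triggers the IndexError.
try:
    parse_mapping_line("some stray text without a tab")
except IndexError as err:
    print("IndexError:", err)
```

So a long path by itself should not be a problem; the error only appears when a line has no second field at all.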

What's the version of verkko you're using? Can you share your 5-untip and 1-buildGraph folders?

IsaacDiaz026 commented 4 months ago

Just shared the folders, I am using verkko 1.41

skoren commented 4 months ago

This looks like an issue with your python setup. I see lines like this in the output files:

Error processing line 1 of /rhome/idiaz026/.local/lib/python3.9/site-packages/distutils-precedence.pth:

  Traceback (most recent call last):
    File "/opt/linux/rocky/8.x/x86_64/pkgs/miniconda3/py39_4.12.0/lib/python3.9/site.py", line 169, in addpackage
      exec(line)
    File "<string>", line 1, in <module>
  AttributeError: module '_distutils_hack' has no attribute 'add_shim'

Remainder of file ignored
Error processing line 1 of /opt/linux/rocky/8.x/x86_64/pkgs/miniconda3/py39_4.12.0/lib/python3.9/site-packages/distutils-precedence.pth:

  Traceback (most recent call last):
    File "/opt/linux/rocky/8.x/x86_64/pkgs/miniconda3/py39_4.12.0/lib/python3.9/site.py", line 169, in addpackage
      exec(line)
    File "<string>", line 1, in <module>
  AttributeError: module '_distutils_hack' has no attribute 'add_shim'

Remainder of file ignored

None of those messages come from verkko. They come from another Python module you have installed, possibly setuptools (https://github.com/pypa/setuptools/issues/2983). The messages are present in the untip.err log, but they are also ending up in combined-nodemap-1.txt by way of unitig-mapping-3a.txt.

Once you fix this Python issue, you should be able to re-run the untip.sh script and continue the assembly.
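To check whether stray text has ended up in a mapping file, something like the following sketch could help. The helper `find_malformed_lines` and the pattern `VALID` are hypothetical and not part of verkko; the pattern is only inferred from the sample entries in this thread (a name, a tab, one or more `<node` / `>node` steps, then `:left_clip:right_clip`), so treat it as an assumption about the format.

```python
import re

# Assumed entry shape, inferred from the samples in this thread:
#   name <TAB> one or more "<node" / ">node" steps, then ":left_clip:right_clip"
VALID = re.compile(r'\S+\t(?:[<>][^<>:\s]+)+:\d+:\d+')


def find_malformed_lines(lines):
    """Return (line_number, text) for every line that does not look like an entry."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip('\n')
        if not VALID.fullmatch(text):
            bad.append((number, text))
    return bad


# In-memory sample; for a real file, pass an open file handle instead.
sample = [
    "utig1-0\t<100:0:0\n",
    "Remainder of file ignored\n",
    "utig1-615\t>745<gapthree-7-len--1139-cov-1<1057:0:0\n",
    "\n",
]
for number, text in find_malformed_lines(sample):
    print(f"line {number}: {text!r}")
```

For the real file, `find_malformed_lines(open("5-untip/combined-nodemap-1.txt"))` would list the offending lines, which should disappear once the Python environment is fixed and untip.sh is re-run.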