dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Message: IndexError: string index out of range #504

Closed vherklotz closed 1 year ago

vherklotz commented 1 year ago

Hello, can you please help? I recently installed ipyrad, with everything at the latest version. After "aligning clusters" finishes in step 3, I get this error:

Encountered an Error.
Message: IndexError: string index out of range
Parallel connection closed.

    IndexError                                Traceback (most recent call last)
    File /cluster/software/ipyrad/ipyrad-0.9.85/lib/python3.10/site-packages/ipyrad/assemble/clustmap.py:1488, in persistent_popen_align3(clusts, maxseqs, is_gbs)
       1485 lclust1 = list(chain(zip(
       1486 lclust[::2], [i.split("nnnn")[0] for i in lclust[1::2]])))
       1487 lclust2 = list(chain(zip(
    -> 1488 lclust[::2], [i.split("nnnn")[1] for i in lclust[1::2]])))
       1490 # put back into strings

    File /cluster/software/ipyrad/ipyrad-0.9.85/lib/python3.10/site-packages/ipyrad/assemble/clustmap.py:1488, in (.0)
       1485 lclust1 = list(chain(zip(
       1486 lclust[::2], [i.split("nnnn")[0] for i in lclust[1::2]])))
       1487 lclust2 = list(chain(zip(
    -> 1488 lclust[::2], [i.split("nnnn")[1] for i in lclust[1::2]])))
       1490 # put back into strings

    IndexError: list index out of range

    During handling of the above exception, another exception occurred:

    IndexError                                Traceback (most recent call last)
    File :1

    File /cluster/software/ipyrad/ipyrad-0.9.85/lib/python3.10/site-packages/ipyrad/assemble/clustmap.py:1397, in align_and_parse(handle, max_internal_indels, is_gbs, declone)
       1394 nwodups = 0
       1396 # iterate over clusters sending each to muscle, splits and aligns pairs
    -> 1397 aligned = persistent_popen_align3(clusts, 200, is_gbs)
       1399 # store good alignments to be written to file
       1400 refined = []

    File /cluster/software/ipyrad/ipyrad-0.9.85/lib/python3.10/site-packages/ipyrad/assemble/clustmap.py:1573, in persistent_popen_align3(clusts, maxseqs, is_gbs)
       1569 lines = "".join(align1)[1:].split("\n>")
       1571 ## find seed of the cluster and put it on top.
       1572 #seed = [i for i in lines if i.split(";")[-1][0] == ""][0]
    -> 1573 seed = [i for i in lines if i.split('\n')[0][-1] == ""][0]
       1574 lines.pop(lines.index(seed))
       1575 lines = [seed] + sorted(
       1576 lines, key=get_derep_num, reverse=True)

    File /cluster/software/ipyrad/ipyrad-0.9.85/lib/python3.10/site-packages/ipyrad/assemble/clustmap.py:1573, in (.0)
       1569 lines = "".join(align1)[1:].split("\n>")
       1571 ## find seed of the cluster and put it on top.
       1572 #seed = [i for i in lines if i.split(";")[-1][0] == ""][0]
    -> 1573 seed = [i for i in lines if i.split('\n')[0][-1] == ""][0]
       1574 lines.pop(lines.index(seed))
       1575 lines = [seed] + sorted(
       1576 lines, key=get_derep_num, reverse=True)

    IndexError: string index out of range
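
The two exceptions in this traceback can be reproduced in isolation, which may help when reading it. The following is only a minimal sketch: the helper names and the sample strings are hypothetical, and it assumes the aligner output contained no "nnnn" pair separator and had an empty header line. It does not show why the output looked that way.

```python
def split_pair(seq, sep="nnnn"):
    """Return the (read1, read2) halves of a paired sequence joined by `sep`."""
    parts = seq.split(sep)
    # When `sep` is absent, `parts` has one element and parts[1] raises IndexError.
    return parts[0], parts[1]


def last_header_char(record):
    """Return the last character of the first (header) line of a record."""
    header = record.split("\n")[0]
    # Indexing an empty header string raises IndexError.
    return header[-1]


# Hypothetical inputs: a sequence without the separator, a record with an empty header.
errors = []
for func, arg in [(split_pair, "ACGTACGT"), (last_header_char, "\nACGT")]:
    try:
        func(arg)
    except IndexError as err:
        errors.append(str(err))
        print(f"{func.__name__}: IndexError: {err}")
```

The first call fails with "list index out of range" and the second with "string index out of range", matching the order of the two messages in the traceback.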

isaacovercast commented 1 year ago

There is a known issue with the new version of muscle (>=v5). You can fix this by installing the 3.8 version of muscle within your ipyrad environment: conda install -c bioconda muscle=3.8

isaacovercast commented 1 year ago

The muscle version should be pinned in the bioconda recipe, so I don't think this should crop up often in the wild. I did a clean ipyrad install:

  muscle             bioconda/linux-64::muscle-3.8.1551-h7d875b9_6

isaacovercast commented 12 months ago

FML. this bit me again. We should fix this. Wasted several hours again today on the same error.

Really it's not a bioconda recipe issue, because the bioconda recipe correctly pins muscle < v5. It's an environment issue: if muscle gets updated at some point to >= v5, then ipyrad will crash on PE data.

Maybe we should put in a check on ipyrad launch to verify muscle version is correct.

isaacovercast commented 12 months ago

Added a test in ipyrad/__init__.py to check for this because it is ANNNOOOOYYYININNNGGGGGGGG

41c25780
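
For readers who want a similar guard in their own scripts, a launch-time version check could look roughly like the sketch below. It is not the code from the commit above. The function names, the banner strings in the usage note, and the assumption that the muscle binary prints its version when called with `-version` are all illustrative and should be verified against the installed binary.

```python
import re
import subprocess

# Hypothetical threshold, taken from the thread: muscle >= 5 is treated as unsupported.
FIRST_UNSUPPORTED_MAJOR = 5


def parse_version(text):
    """Return (major, minor) from the first 'X.Y' pattern in text, or None."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(banner):
    """True if the banner reports a major version below the unsupported threshold."""
    version = parse_version(banner)
    return version is not None and version[0] < FIRST_UNSUPPORTED_MAJOR


def check_muscle(binary="muscle"):
    """Raise RuntimeError if the muscle on PATH looks like v5 or newer.

    Assumption: the binary prints a version banner for `-version`.
    A missing binary surfaces as FileNotFoundError from subprocess.run.
    """
    result = subprocess.run([binary, "-version"], capture_output=True, text=True)
    banner = result.stdout + result.stderr
    if not is_supported(banner):
        raise RuntimeError(
            f"Unsupported or unrecognized muscle version: {banner.strip()!r}. "
            "Try: conda install -c bioconda muscle=3.8"
        )
```

With banner strings of the assumed form, `is_supported("MUSCLE v3.8.1551 by Robert C. Edgar")` is true and `is_supported("muscle 5.1.linux64 []")` is false. Parsing is kept separate from the subprocess call so it can be tested without muscle installed.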