isaacovercast closed this issue 1 year ago
Here's my solution. If all R2 seqs are empty then clip off the 'nnnn' and treat it as R1 or merged data. I'm testing it now.
Inside ipyrad.assemble.clustmap_across.align_to_array():L1349-1362:
    # else locus looks good, align it.
    # is there a paired-insert in any samples in the locus?
    try:
        # try to split cluster list at nnnn separator for each read
        left = [i.split("nnnn")[0] for i in seqs]
        right = [i.split("nnnn")[1] for i in seqs]
        if not any(right):
            # If _all_ R2 seqs are empty then raise the IndexError
            # and treat it as R1 only. Insane edge case, took one entire
            # day to figure out. iao 9/15/22
            seqs = left
            raise IndexError()
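The same idea can be tried outside ipyrad. This is a minimal standalone sketch, not the code in clustmap_across: the helper name split_paired_reads and the (left, right) return convention are assumptions made for illustration. It returns None for the R2 half whenever the locus should be handled as R1-only or merged data.

```python
def split_paired_reads(seqs, sep="nnnn"):
    """Hypothetical helper (not part of ipyrad): split reads at the pair separator.

    Returns (left, right). right is None when the locus should be
    aligned as single-end/merged data.
    """
    # str.split always returns at least one element, so this cannot fail.
    left = [s.split(sep)[0] for s in seqs]
    try:
        right = [s.split(sep)[1] for s in seqs]
    except IndexError:
        # At least one read has no separator: treat the reads as-is.
        return list(seqs), None
    if not any(right):
        # Every R2 is empty: drop the trailing separator and keep R1 only.
        return left, None
    return left, right


if __name__ == "__main__":
    # All R2s blank, the edge case described in this thread.
    print(split_paired_reads(["ACGTnnnn", "ACGAnnnn"]))
    # One read still has an R2, so both halves are kept.
    print(split_paired_reads(["ACGTnnnnTTGA", "ACGAnnnn"]))
```

Returning None instead of a list of empty strings means the caller never hands an empty sequence set to the aligner.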
Pretty crazy that this has never come up before....
Here is a Jupyter notebook for debugging this problem, in case it's ever useful.
Hi, I have run into the same problem, but I don't really understand how to use this file to solve it. Do I need to run the file '[Step6-ipynb.md]' in my own Jupyter notebook? Is there any code? Could you explain a bit more? Thanks a lot!
@Juliazhou1994 Are you sure it's the same problem? Can you run step 6 with the -d flag and post the full output here?
Hi, here is my log file
Parallel connection | nku-PowerEdge-T640: 60 cores
[####################] 100% 1:22:30 | processing reads | s2 |
[####################] 100% 1:02:15 | join merged pairs | s3 |
[####################] 100% 0:52:51 | join unmerged pairs | s3 |
[####################] 100% 0:47:36 | dereplicating | s3 |
[####################] 100% 10 days, 8:23:38 | clustering/mapping | s3 |
[####################] 100% 0:00:43 | building clusters | s3 |
[####################] 100% 0:00:09 | chunking clusters | s3 |
[####################] 100% 18:46:03 | aligning clusters | s3 |
[####################] 100% 0:01:41 | concat clusters | s3 |
[####################] 100% 0:01:11 | calc cluster stats | s3 |
[####################] 100% 0:04:47 | inferring [H, E] | s4 |
[####################] 100% 0:00:44 | calculating depths | s5 |
[####################] 100% 0:01:01 | chunking clusters | s5 |
[####################] 100% 1:30:53 | consens calling | s5 |
[####################] 100% 0:02:16 | indexing alleles | s5 |
[####################] 100% 0:02:31 | concatenating inputs | s6 |
[####################] 100% 13:26:50 | clustering across | s6 |
[####################] 100% 0:01:13 | building clusters | s6 |
[####################] 100% 0:24:48 | aligning clusters | s6 |
Encountered an Error.
Message: ValueError: dictionary update sequence element #0 has length 1; 2 is required
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
What version of ipyrad are you using? (Run ipyrad -v to check.) The error message that you show here gives the line number of the problem (1384), which does not match the current line number for that part of the code, so I believe you are using an older version. I believe this problem was fixed in v0.9.85, so please update to the most recent version of ipyrad and try again.
conda update -c bioconda ipyrad
This was reported by @alexkrohn on gitter.
I spent way too much time trying to figure out what was causing this. TL;DR if you have PE data and all the R2s are blank then you get a cluster like this:
And after splitting on 'nnnn' and then passing the R2 seqs to muscle_it(), muscle throws an error (*** ERROR *** No sequences in input file) because the seqs list is all empty (in this case ['','']). The error message goes to stderr, so Python doesn't see it, and the problem cascades down a bit and then shows up like this (which is a little confusing):
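For reference, the message reported in this thread is "ValueError: dictionary update sequence element #0 has length 1; 2 is required". The sketch below shows one way such a cascade can happen. It rests on assumptions: the child process is a stand-in that only imitates an aligner failing on stderr (it is not muscle), and parse_alignment is a hypothetical parser, so ipyrad's actual parsing code may differ.

```python
import subprocess
import sys

# Stand-in for the aligner (NOT muscle): writes an error to stderr,
# writes nothing to stdout, and exits with a non-zero status.
FAKE_ALIGNER = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('*** ERROR ***  No sequences in input file\\n'); sys.exit(1)",
]


def parse_alignment(stdout_text):
    """Hypothetical parser: map FASTA-style '>name\\nseq' records to a dict."""
    records = stdout_text[1:].split("\n>")
    # Each record should split into [name, sequence]; an empty record
    # splits into a single element, which dict() rejects.
    return dict(rec.split("\n", 1) for rec in records)


result = subprocess.run(FAKE_ALIGNER, capture_output=True, text=True)

# Looking only at stdout hides the real failure and surfaces a ValueError.
message = None
try:
    parse_alignment(result.stdout)
except ValueError as err:
    message = str(err)
print("parser error:", message)

# Checking the exit status and stderr reports the actual cause.
if result.returncode != 0:
    print("aligner failed:", result.stderr.strip())
```

Checking the return code (or stderr) before parsing turns the confusing downstream ValueError into the aligner's own message.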