dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

step 3 fails with "IndexError(list index out of range)" #274

Closed isaacovercast closed 7 years ago

isaacovercast commented 7 years ago

Hi Isaac,

I am getting the same error on step 3 but it could be for another reason. Let me know if I should open this as a different issue.

This is the first time I am running ipyrad on my own data; the tutorial worked great. This is a denovo assembly.

Here is the info from my terminal and the last lines of my ipyrad_log.txt:

Macintosh-2:Coral_RADseq sara$ ipyrad -p params-coraltest2.txt -s 3 -f -d

Enabling debug mode


ipyrad [v.0.7.15] Interactive assembly and analysis of RAD-seq data

loading Assembly: coraltest2
from saved path: /Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2.json
establishing parallel connection:
host compute node: [4 cores] on Macintosh-2.local

Step 3: Clustering/Mapping reads
[####################] 100% dereplicating | 0:04:18
[####################] 100% clustering | 2:06:16
[####################] 100% building clusters | 0:02:43
[####################] 100% chunking | 0:00:19
[####################] 100% aligning | 10:39:44

Encountered an unexpected error (see ./ipyrad_log.txt)
Error message is below -------------------------------
IndexError(list index out of range)
Macintosh-2:Coral_RADseq sara$

ipyrad_log.txt

Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_1.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_2.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_3.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_4.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_5.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_6.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_7.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA002.06_chunk_8.aligned']
2017-10-14 11:36:10,595 pid=94724 [cluster_within.py] DEBUG skipping empty chunk - /Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_5.ali
2017-10-14 11:36:10,965 pid=94724 [cluster_within.py] INFO chunk ['/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_0.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_1.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_2.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_3.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_4.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_5.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_6.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_7.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_8.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA004.26_chunk_9.aligned']
2017-10-14 11:36:11,367 pid=94721 [cluster_within.py] DEBUG skipping empty chunk - /Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_9.ali
2017-10-14 11:36:11,565 pid=94703 [assembly.py] ERROR IndexError(list index out of range)
2017-10-14 11:36:11,774 pid=94721 [cluster_within.py] INFO chunk ['/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_0.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_1.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_2.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_3.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_4.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_5.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_6.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_7.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_8.aligned', '/Volumes/LaCie/Coral_RADseq/SEdenovo/coraltest2-tmpalign/MA001.05_chunk_9.aligned']
2017-10-14 11:36:12,932 pid=94703 [assembly.py] INFO interrupted engine 0 w/ SIGINT to 94721
2017-10-14 11:36:12,940 pid=94703 [assembly.py] INFO interrupted engine 1 w/ SIGINT to 94724
2017-10-14 11:36:12,946 pid=94703 [assembly.py] INFO interrupted engine 2 w/ SIGINT to 94722
2017-10-14 11:36:12,954 pid=94703 [assembly.py] INFO interrupted engine 3 w/ SIGINT to 94723
2017-10-14 11:36:13,960 pid=94703 [assembly.py] INFO shutting down engines
2017-10-14 11:36:25,229 pid=94703 [assembly.py] INFO finished shutdown
2017-10-14 11:36:25,354 pid=94703 [__init__.py] INFO debugging turned off

Thank you, Sara

isaacovercast commented 7 years ago

@0seastar0 This is almost certainly a different issue, so I made a new ticket.

isaacovercast commented 7 years ago

On second thought, this could also be the same issue. Please verify that you have enough free disk space for step 3; that is the most likely cause.

0seastar0 commented 7 years ago

I am running this analysis on an external hard drive with over 500GB of disk space available. I hope that's enough disk space. I only have 51 samples. I also tried mapping the same samples to a reference genome and that worked perfectly.

isaacovercast commented 7 years ago

500GB should be sufficient, but it depends on your raw data. How big are the files in the _edits directory? Are there any files in the clust* directory? How big are they? Can you email me the full log file?
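For anyone wanting to answer these questions quickly, here is a minimal sketch that reports free space and the total size of the `_edits` and `clust*` directories. It is not part of ipyrad: the function names are made up for illustration, and it assumes those directories sit directly inside the project directory and can be matched by name.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def assembly_dir_sizes(project_dir):
    """Map each *_edits or *clust* subdirectory name to its size in bytes.

    Matching on the directory name is an assumption made for this sketch.
    """
    sizes = {}
    for name in sorted(os.listdir(project_dir)):
        full = os.path.join(project_dir, name)
        if os.path.isdir(full) and (name.endswith("_edits") or "clust" in name):
            sizes[name] = dir_size_bytes(full)
    return sizes


def free_bytes(path):
    """Free space, in bytes, on the filesystem that holds path."""
    return shutil.disk_usage(path).free
```

Calling `free_bytes(project_dir)` and `assembly_dir_sizes(project_dir)` with the project directory from the params file gives the numbers asked for above.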

TomaszSuchan commented 7 years ago

Dear All, I'm having a similar issue with ipyrad 0.7.15 at the end of step 3 (alignment stage). It works fine with ipyrad 0.7.13.

The end of the log file looks like this:

2017-10-17 17:09:40,076         pid=14597       [cluster_within.py]     INFO    chunk ['/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_0.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_1.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_2.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_3.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_4.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_5.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_6.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_7.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_8.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_hirtula_52150_chunk_9.aligned']
2017-10-17 17:10:15,709         pid=14595       [cluster_within.py]     INFO    chunk ['/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_0.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_1.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_2.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_3.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_4.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_5.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_6.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_7.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_8.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_fatua_52335_chunk_9.aligned']
2017-10-17 17:12:58,121         pid=14589       [cluster_within.py]     INFO    chunk ['/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_0.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_1.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_2.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_3.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_4.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_5.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_6.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_7.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_8.aligned', '/home/tsuchan/Projects/radz/radz-tmpalign/A_insularis_52439_chunk_9.aligned']
2017-10-17 17:12:58,195         pid=14376       [assembly.py]   ERROR   IndexError(list index out of range)
2017-10-17 17:12:59,260         pid=14376       [assembly.py]   INFO    interrupted engine 4 w/ SIGINT to 14589
2017-10-17 17:13:00,277         pid=14376       [assembly.py]   INFO      shutting down engines
2017-10-17 17:13:02,022         pid=14376       [assembly.py]   INFO      finished shutdown
2017-10-17 17:13:02,029         pid=14376       [__init__.py]   INFO    debugging turned off

isaacovercast commented 7 years ago

@0seastar0 Can you send me (via Dropbox or WeTransfer) your raw data and the params file you're using? I'm starting to think this isn't a disk issue.

isaacovercast commented 7 years ago

Fixed in v.0.7.17

Empty .ali files were causing step 3 to raise, so I added code to test for this and just skip them if they're empty.
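The general idea of that check can be sketched as below. This is not the actual change made in ipyrad; `split_empty_chunks` is a hypothetical helper, and it assumes "empty" means a zero-byte (or missing) file on disk.

```python
import os


def split_empty_chunks(paths):
    """Separate chunk files that have content from empty or missing ones.

    Returns (kept, skipped). Treating a missing file like an empty one
    is an assumption of this sketch, not documented ipyrad behavior.
    """
    kept, skipped = [], []
    for path in paths:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            kept.append(path)
        else:
            skipped.append(path)
    return kept, skipped
```

Filtering the list before any downstream code indexes into parsed file contents avoids hitting an empty list, which is one common way an `IndexError` like the one reported here can arise.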