dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Step 7 error #335

Closed ghost closed 4 years ago

ghost commented 5 years ago

Hi all!

I am stuck on the error below in step 7 and am not sure what causes it. I have seen others report similar errors associated with the pops_file, which I am not using.

I merge multiple libraries (multiple lanes) after step 1 and then subsample to create independent smaller datasets. All steps before step 7 run without problems, and the stats generally look quite good.

##################
Encountered an error (see details in ./ipyrad_log.txt)
Error summary is below -------------------------------
error in filter_stacks on chunk 0: IndexError(index 1 is out of bounds for axis 0 with size 1)
##################

I'd appreciate insights about how to solve this, please.

Thanks a lot, gerard

kindofausername commented 5 years ago

It looks like some samples in the subsampled dataset end up having no loci shared with the other samples. Could you try rerunning step 7 with minCov = 4?
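A rough way to test this idea is to count, for each sample, how many loci it shares with enough other samples. The sketch below only illustrates that check and is not ipyrad's own filter: the `loci` mapping, the sample names, and the function name are all hypothetical, and `min_cov` stands in for the minimum-samples-per-locus threshold referred to above as minCov.

```python
from collections import Counter


def loci_retained_per_sample(loci, min_cov):
    """Count, per sample, the loci present in at least min_cov samples.

    loci: dict mapping a locus id to the set of sample names with data
          at that locus (hypothetical input, not an ipyrad structure).
    """
    all_samples = set().union(*loci.values()) if loci else set()
    # Start every sample at zero so samples that lose all loci stay visible.
    counts = Counter({name: 0 for name in all_samples})
    for samples in loci.values():
        if len(samples) >= min_cov:
            counts.update(samples)
    return dict(counts)


# Toy data: sample "D" appears only in a locus that no other sample covers.
toy_loci = {
    "locus_1": {"A", "B", "C"},
    "locus_2": {"A", "B"},
    "locus_3": {"D"},
}
retained = loci_retained_per_sample(toy_loci, min_cov=2)
orphans = sorted(name for name, n in retained.items() if n == 0)
print(orphans)
```

Any sample listed in `orphans` would contribute nothing to the final output at that threshold, which makes it a candidate for removal before rerunning step 7.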

ghost commented 5 years ago

I have tried minCov 4 and lower, with the same result. The same thing happens with every other subsample I try. My impression is that this is not an issue with the data but is related to the merging and subsampling process.

isaacovercast commented 5 years ago

What version of ipyrad are you running?

isaacovercast commented 5 years ago

Can you run step 7 with the -d flag and include the last 30-40 lines of the ipyrad_log.txt file? Can you also post your parameters and the output from ipyrad -p <your_params> -r? More information will help solve the problem.
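For anyone collecting the same diagnostics, the tail of the log can be pulled out with a few lines of Python. This is a minimal sketch using only the standard library. The helper names are made up for illustration, and the `" ERROR "` match simply assumes the log level appears as a separate word on each line, as it does in the log excerpt in this thread.

```python
import os
from collections import deque


def tail_log(path, n=40):
    """Return the last n lines of a text file, reading it line by line."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the most recent n lines.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def error_lines(lines):
    """Keep only the entries logged at ERROR level."""
    return [line for line in lines if " ERROR " in line]


if __name__ == "__main__":
    log_path = "ipyrad_log.txt"  # written to the working directory by ipyrad
    if os.path.exists(log_path):
        for line in tail_log(log_path, n=40):
            print(line)
```

Passing the result of `tail_log` to `error_lines` narrows the output to the failing entries, although the surrounding INFO lines are often what shows which chunk was being processed.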

ghost commented 5 years ago

Thanks Isaac!

I am using ipyrad v.0.7.30

This is the log file output:

2019-04-17 15:34:56,555 pid=30088 [write_outfiles.py] INFO passed minhet 171808
2019-04-17 15:34:56,936 pid=30088 [write_outfiles.py] INFO Entering filter_stacks
2019-04-17 15:34:57,778 pid=30086 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:34:57,968 pid=30086 [write_outfiles.py] INFO passed edges 173264
2019-04-17 15:34:58,026 pid=30086 [write_outfiles.py] INFO passed minfilt 173264
2019-04-17 15:34:58,093 pid=30086 [write_outfiles.py] INFO --------------maxhet mins [1 1 1 ... 1 1 1]
2019-04-17 15:34:58,432 pid=30086 [write_outfiles.py] INFO --------------maxhet sums 24
2019-04-17 15:34:58,432 pid=30086 [write_outfiles.py] INFO passed minhet 173264
2019-04-17 15:34:58,859 pid=30086 [write_outfiles.py] INFO Entering filter_stacks
2019-04-17 15:35:28,600 pid=30084 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:35:28,804 pid=30084 [write_outfiles.py] INFO passed edges 174720
2019-04-17 15:35:28,860 pid=30084 [write_outfiles.py] INFO passed minfilt 174720
2019-04-17 15:35:28,911 pid=30084 [write_outfiles.py] INFO --------------maxhet mins [1 1 1 ... 1 2 1]
2019-04-17 15:35:29,081 pid=30087 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:35:29,244 pid=30084 [write_outfiles.py] INFO --------------maxhet sums 32
2019-04-17 15:35:29,245 pid=30084 [write_outfiles.py] INFO passed minhet 174720
2019-04-17 15:35:29,269 pid=30087 [write_outfiles.py] INFO passed edges 176176
2019-04-17 15:35:29,323 pid=30087 [write_outfiles.py] INFO passed minfilt 176176
2019-04-17 15:35:29,383 pid=30087 [write_outfiles.py] INFO --------------maxhet mins [1 1 1 ... 1 1 1]
2019-04-17 15:35:29,645 pid=30087 [write_outfiles.py] INFO --------------maxhet sums 17
2019-04-17 15:35:29,645 pid=30087 [write_outfiles.py] INFO passed minhet 176176
2019-04-17 15:35:29,733 pid=30084 [write_outfiles.py] INFO Entering filter_stacks
2019-04-17 15:35:29,931 pid=30088 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:35:30,145 pid=30088 [write_outfiles.py] INFO passed edges 177632
2019-04-17 15:35:30,148 pid=30087 [write_outfiles.py] INFO Entering filter_stacks
2019-04-17 15:35:30,196 pid=30088 [write_outfiles.py] INFO passed minfilt 177632
2019-04-17 15:35:30,247 pid=30088 [write_outfiles.py] INFO --------------maxhet mins [1 1 1 ... 2 1 1]
2019-04-17 15:35:30,537 pid=30088 [write_outfiles.py] INFO --------------maxhet sums 24
2019-04-17 15:35:30,537 pid=30088 [write_outfiles.py] INFO passed minhet 177632
2019-04-17 15:35:31,076 pid=30088 [write_outfiles.py] INFO Entering filter_stacks
2019-04-17 15:35:31,857 pid=30086 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:35:32,051 pid=30086 [write_outfiles.py] INFO passed edges 179088
2019-04-17 15:35:32,105 pid=30086 [write_outfiles.py] INFO passed minfilt 179088
2019-04-17 15:35:32,160 pid=30086 [write_outfiles.py] INFO --------------maxhet mins [1 1 1 ... 1 1 1]
2019-04-17 15:35:32,424 pid=30086 [write_outfiles.py] INFO --------------maxhet sums 23
2019-04-17 15:35:32,424 pid=30086 [write_outfiles.py] INFO passed minhet 179088
2019-04-17 15:35:32,904 pid=30086 [write_outfiles.py] INFO Entering filter_stacks
2019-04-17 15:36:01,741 pid=30084 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:36:01,825 pid=30087 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:36:01,945 pid=30084 [write_outfiles.py] INFO passed edges 180544
2019-04-17 15:36:01,998 pid=30084 [write_outfiles.py] INFO passed minfilt 180544
2019-04-17 15:36:02,044 pid=30087 [write_outfiles.py] INFO passed edges 182000
2019-04-17 15:36:02,064 pid=30084 [write_outfiles.py] INFO --------------maxhet mins [2 1 2 ... 1 2 1]
2019-04-17 15:36:02,108 pid=30087 [write_outfiles.py] INFO passed minfilt 182000
2019-04-17 15:36:02,154 pid=30086 [write_outfiles.py] INFO superints shape (1346, 139, 358)
2019-04-17 15:36:02,167 pid=30087 [write_outfiles.py] INFO --------------maxhet mins [1 2 1 ... 2 1 1]
2019-04-17 15:36:02,240 pid=30084 [write_outfiles.py] INFO --------------maxhet sums 10
2019-04-17 15:36:02,241 pid=30084 [write_outfiles.py] INFO passed minhet 180544
2019-04-17 15:36:02,355 pid=30086 [write_outfiles.py] INFO passed edges 184912
2019-04-17 15:36:02,417 pid=30086 [write_outfiles.py] INFO passed minfilt 184912
2019-04-17 15:36:02,521 pid=30086 [write_outfiles.py] INFO --------------maxhet mins [1 1 1 ... 1 1 1]
2019-04-17 15:36:02,566 pid=30087 [write_outfiles.py] INFO --------------maxhet sums 23
2019-04-17 15:36:02,567 pid=30087 [write_outfiles.py] INFO passed minhet 182000
2019-04-17 15:36:02,999 pid=30086 [write_outfiles.py] INFO --------------maxhet sums 27
2019-04-17 15:36:02,999 pid=30086 [write_outfiles.py] INFO passed minhet 184912
2019-04-17 15:36:03,169 pid=30088 [write_outfiles.py] INFO superints shape (1456, 139, 358)
2019-04-17 15:36:03,385 pid=30088 [write_outfiles.py] INFO passed edges 183456
2019-04-17 15:36:03,440 pid=30088 [write_outfiles.py] INFO passed minfilt 183456
2019-04-17 15:36:03,495 pid=30088 [write_outfiles.py] INFO --------------maxhet mins [1 1 1 ... 1 1 1]
2019-04-17 15:36:03,717 pid=30088 [write_outfiles.py] INFO --------------maxhet sums 18
2019-04-17 15:36:03,717 pid=30088 [write_outfiles.py] INFO passed minhet 183456
2019-04-17 15:36:03,888 pid=30053 [write_outfiles.py] ERROR error in filter_stacks on chunk 0: IndexError(index 1 is out of bounds for axis 0 with size 1)
2019-04-17 15:36:03,889 pid=30053 [write_outfiles.py] INFO finished filtering
2019-04-17 15:36:03,890 pid=30053 [assembly.py] ERROR IPyradWarningExit: error in filter_stacks on chunk 0: IndexError(index 1 is out of bounds for axis 0 with size 1)
2019-04-17 15:36:06,195 pid=30053 [assembly.py] INFO shutting down engines
2019-04-17 15:36:06,245 pid=30053 [assembly.py] INFO finished shutdown
2019-04-17 15:36:06,256 pid=30053 [init.py] INFO debugging turned off

ghost commented 5 years ago

And these are the stats:

Summary stats of Assembly dataset_1

state reads_raw ... error_est reads_consens
6 594637 ... 0.000869 14750
6 1102673 ... 0.001308 22402
6 800442 ... 0.001246 16191
6 521067 ... 0.001703 11906
6 2054005 ... 0.000816 23948
6 804376 ... 0.001378 21386
6 919314 ... 0.001371 22603
6 884043 ... 0.001335 22412
6 854460 ... 0.001365 21887
6 1519497 ... 0.000711 16341
6 1217349 ... 0.001251 20647
6 1294878 ... 0.001193 20543
6 2266 ... 0.003221 3
6 1532268 ... 0.001155 22852
6 1326697 ... 0.001144 20925
6 1010757 ... 0.000761 16993
6 627612 ... 0.000912 13090
6 998873 ... 0.000988 13434
6 848633 ... 0.000864 14474
6 562567 ... 0.000791 14038
6 1293388 ... 0.000703 16222
6 729398 ... 0.000815 18730
6 767017 ... 0.000762 15311
6 831358 ... 0.001057 12864
6 1025424 ... 0.001043 13450
6 1074832 ... 0.001043 14865
6 937517 ... 0.000713 15217
6 1013813 ... 0.000783 17205
6 1133854 ... 0.001184 19095
6 1024800 ... 0.000821 18352
6 1061554 ... 0.001171 18970
6 1286747 ... 0.000771 19179
6 747755 ... 0.000774 14560
6 1390307 ... 0.000750 18729
6 950117 ... 0.000712 26235
6 669834 ... 0.000722 19600
6 971737 ... 0.000818 15581
6 1136929 ... 0.000727 15994
6 985279 ... 0.000801 15262
6 5721 ... 0.001651 10
6 1777220 ... 0.000844 23269
6 1057744 ... 0.001161 18783
6 1054676 ... 0.000694 13761
6 1185273 ... 0.000982 13775
6 1107975 ... 0.000725 13512
6 876182 ... 0.001080 15586
6 611065 ... 0.001171 14319
6 1009541 ... 0.001280 15475
6 1808350 ... 0.001280 15759
6 1307571 ... 0.001293 14175
6 619189 ... 0.000904 14569
6 758008 ... 0.000928 15170
6 654280 ... 0.000933 14621
6 328045 ... 0.001025 10498
6 783262 ... 0.000724 23677
6 699327 ... 0.000718 18321
6 738206 ... 0.000683 22985
6 549349 ... 0.000841 16836
6 1078894 ... 0.000763 16803
6 988502 ... 0.000768 16571
6 1066923 ... 0.000827 16969
6 930120 ... 0.000728 13105
6 937172 ... 0.000734 13146
6 1320205 ... 0.000757 13845
6 1134182 ... 0.000702 13458
6 1557888 ... 0.000714 17380
6 1387364 ... 0.000682 16315
6 1420646 ... 0.001137 15536
6 1125909 ... 0.000745 18072
6 1285674 ... 0.000658 19159
6 944144 ... 0.000733 19022
6 1484924 ... 0.000701 19189
6 1401636 ... 0.000662 18788
6 1233188 ... 0.001053 12262
6 693827 ... 0.001174 12932
6 1040656 ... 0.000736 17270
6 1539959 ... 0.001088 12771
6 1552093 ... 0.000647 16118
6 875685 ... 0.000726 12967
6 1445275 ... 0.001097 13245
6 1026749 ... 0.000742 13678
6 1109 ... 0.002749 1
6 746347 ... 0.000778 12742
6 1290331 ... 0.000683 13884
6 993742 ... 0.000730 13398
6 786419 ... 0.000807 13617
6 1578297 ... 0.001059 13072
6 5831 ... 0.006391 8
6 766 ... 0.002211 2
6 1094407 ... 0.000779 18304
6 1105329 ... 0.001309 22969
6 1015729 ... 0.001324 22497
6 1156027 ... 0.001310 23179
6 1296014 ... 0.001308 24923
6 1473444 ... 0.001316 24633
6 1361123 ... 0.000729 19172
6 869609 ... 0.001057 13640
6 1149871 ... 0.000742 21031
6 917864 ... 0.000732 19164
6 941067 ... 0.000755 18912
6 918681 ... 0.000851 18290
6 550914 ... 0.000925 16043
6 525460 ... 0.000926 15658
4 310 ... 0.006733 0
6 1183912 ... 0.001347 23240
6 17817 ... 0.002002 34
6 365030 ... 0.001003 13817
6 1418874 ... 0.001279 23866
6 2711845 ... 0.000944 22739
6 1259470 ... 0.001286 24115
6 1243623 ... 0.000695 18385
6 1683884 ... 0.000645 16205
6 1428838 ... 0.001288 24564
6 834130 ... 0.000818 15655
6 858064 ... 0.000823 15874
6 932110 ... 0.000811 16339
6 886053 ... 0.000864 16360
6 948157 ... 0.000805 16698
6 1073621 ... 0.000773 17196
6 1011888 ... 0.000839 16683
6 922077 ... 0.000765 17409
6 1176777 ... 0.000780 19721
6 1053066 ... 0.000800 17061
6 1729937 ... 0.000677 19632
6 1451497 ... 0.000733 19625
6 1063373 ... 0.001254 26385
6 802565 ... 0.001354 19033
6 1045684 ... 0.000664 11837
6 967107 ... 0.000736 13231
6 1732321 ... 0.000678 29760
6 2387291 ... 0.000724 31955
6 1876572 ... 0.000741 29990
6 1408856 ... 0.000762 29226
6 788874 ... 0.000820 22336
6 526374 ... 0.000819 11059
6 917317 ... 0.000767 23787
6 1477052 ... 0.000815 20841
6 1267207 ... 0.000849 21714
6 1019304 ... 0.000841 19794
6 1318178 ... 0.000820 21688

[140 rows x 8 columns]

Full stats files

step 1: ./VC4_fastqs/s1_demultiplex_stats.txt
step 2: ./dataset_1_edits/s2_rawedit_stats.txt
step 3: ./dataset_1_clust_0.85/s3_cluster_stats.txt
step 4: ./dataset_1_clust_0.85/s4_joint_estimate.txt
step 5: ./dataset_1_consens/s5_consens_stats.txt
step 6: ./dataset_1_across/s6_cluster_stats.txt
step 7: None

isaacovercast commented 5 years ago

Hmm, well, the first thing I see is that there are a bunch of samples in there that have essentially failed. Most samples have > 10k reads_consens, but a handful have < 100 reads_consens. I would branch and remove all the bad samples as a first pass. It could be that these failed samples are messing something up.
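Picking out the failed samples is easy to script. The sketch below is an illustration under stated assumptions rather than an ipyrad feature: the sample names are placeholders (the posted stats do not show them), the function name is invented, and the cutoff of 100 simply mirrors the "< 100 reads_consens" observation above. The counts themselves are taken from the stats posted in this thread.

```python
def split_by_consens(reads_consens, min_consens=100):
    """Split sample names into (keep, drop) lists by reads_consens count."""
    keep = sorted(n for n, c in reads_consens.items() if c >= min_consens)
    drop = sorted(n for n, c in reads_consens.items() if c < min_consens)
    return keep, drop


# Placeholder sample names paired with reads_consens values from the stats.
stats = {
    "sample_a": 14750,
    "sample_b": 22402,
    "sample_c": 3,
    "sample_d": 10,
    "sample_e": 0,
    "sample_f": 34,
}
keep, drop = split_by_consens(stats)
print("keep:", keep)
print("drop:", drop)
```

The `keep` list could then be used as the sample subset when branching the assembly. The exact branching syntax depends on the ipyrad version, so it is worth checking the documentation for the installed release.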

ghost commented 5 years ago

Yep! I ran some tests without them, but the problem remains. I will have a closer look, but...

isaacovercast commented 5 years ago

params file

isaacovercast commented 4 years ago

Current status? Can you please update to ipyrad v.0.9 and try again?

isaacovercast commented 4 years ago

Closing. Stale.