dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Exception in step 5: ValueError(All chunk dimensions must be positive (all chunk dimensions must be positive)) #369

Closed isaacovercast closed 4 years ago

isaacovercast commented 4 years ago

Reported by @nitishnarula and @richie_hodel in the gitter. Full ipyrad CLI output:

  Step 5: Consensus base/allele calling 
  Mean error  [0.01109 sd=0.00749]
  Mean hetero [0.05067 sd=0.01972]
  [####################] 100% 0:14:35 | calculating depths     
  [####################] 100% 0:14:59 | chunking clusters      
  [####################] 100% 19:27:18 | consens calling        
  [####################] 100% 1:35:12 | indexing alleles       
Exception in step 5: ValueError(All chunk dimensions must be positive (all chunk dimensions must be positive))

  Encountered an Error.
  Message: ValueError: All chunk dimensions must be positive (all chunk dimensions must be positive)

  Parallel connection closed.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<string> in <module>
~/miniconda3/lib/python3.7/site-packages/ipyrad/assemble/consens_se.py in concat_catgs(data, sample, isref)
    808             dtype=np.uint32,
    809             chunks=(optim, maxlen, 4),
--> 810             compression="gzip")
    811         dall = ioh5.create_dataset(
    812             name="nalleles",
~/miniconda3/lib/python3.7/site-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
    134 

I have debugged it to the point of figuring out the following:

It looks like one of these variables is getting set to 0: optim, maxlen, or nrows. Not sure how this would arise. Can you wetransfer me a couple of the clustS.gz files from the _clust* directory?

But can't get a grip on why any of those would ever be zero. WTF?!
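
To make the suspicion above concrete: the traceback shows `chunks=(optim, maxlen, 4)` being passed to `create_dataset`, so a zero in any of those dimensions would plausibly trigger this error. Below is a minimal sketch of a pre-check that names the offending variable before the HDF5 call. The helper `find_nonpositive_dims` and the values used are hypothetical, not ipyrad code.

```python
def find_nonpositive_dims(**dims):
    """Return the names of dimensions whose value is zero or negative."""
    return [name for name, value in dims.items() if value <= 0]


# Hypothetical values for illustration only: a sample that ends up
# with no rows while the other dimensions look normal.
bad = find_nonpositive_dims(optim=10, maxlen=150, nrows=0)
if bad:
    print("non-positive dimensions:", ", ".join(bad))
```

Running a check like this just before the dataset is created would at least say which of the three values is zero for the failing sample.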

eaton-lab commented 4 years ago

I haven't seen this in a long time... you did change the chunking code recently, could it be related?

isaacovercast commented 4 years ago

No, it's something else. I'm actively debugging this with 2 people right now actually. I'm on the trail, but it's a weird bug, hard to pin down.



isaacovercast commented 4 years ago

I added error handling that allows individual samples to fail step 5. For example, if a sample has no consensus sequences, nrows will be 0, and the previous code would crash. The new code passes the error message back up to the parallel function and reports the failure for that sample, rather than killing the entire run. Fixed in df4603d9eeb68bba4f9b204e9416e8e4946095d0
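
The general pattern is roughly the following. This is a simplified sketch and not the code in the commit: `run_samples`, `build_database`, and the `NROWS` table are hypothetical names, and the real code runs samples through the parallel client rather than a plain loop.

```python
# Hypothetical per-sample row counts; sampleB has no consensus sequences.
NROWS = {"sampleA": 1200, "sampleB": 0}


def build_database(sample):
    """Stand-in for the per-sample work; fails when there is nothing to store."""
    nrows = NROWS[sample]
    if nrows <= 0:
        raise ValueError("no consensus sequences in sample")
    return nrows


def run_samples(samples, process):
    """Run process(sample) on each sample and collect failures instead of raising."""
    passed, failed = [], {}
    for sample in samples:
        try:
            process(sample)
            passed.append(sample)
        except Exception as err:
            # Keep the message so it can be reported once the run finishes.
            failed[sample] = "{}: {}".format(type(err).__name__, err)
    return passed, failed


passed, failed = run_samples(["sampleA", "sampleB"], build_database)
for sample, message in failed.items():
    print("Sample {} failed step 5: {}".format(sample, message))
```

The point of the design is that one empty sample is reported as a failure while the remaining samples still finish.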