Fix of `gauss_2d_large(seed=63)` -> `NaN`

abukaj commented 5 years ago

kcsd.validation.csd_profile.gauss_2d_large(seed=63) does not return NaN anymore.

A repeatUntilValid() decorator has been defined for that purpose.

A simple test for the fix is provided in the __main__ section of the module.

Note: To preserve backward compatibility, the issue has deliberately not been solved by fixing the distribution of the zs variable.
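For readers who want to see the retry idea without opening the diff, here is a minimal sketch of a decorator in that spirit. It is not the repeatUntilValid() code from this pull request: the names `repeat_until_valid` and `toy_profile`, the `f(x, seed)` signature, and the "NaN for seeds divisible by 7" rule are all assumptions made up for illustration.

```python
import functools

import numpy as np


def repeat_until_valid(f):
    """Retry f(x, seed) with derived seeds until the result has no NaN.

    Hypothetical sketch, not the decorator defined in the pull request.
    """
    @functools.wraps(f)
    def wrapper(x, seed=0):
        while True:
            result = f(x, seed)
            if not np.isnan(result).any():
                return result
            # Derive the next candidate deterministically from the failed
            # seed, so the same initial seed always gives the same result.
            rstate = np.random.RandomState(seed)
            seed = int(rstate.randint(2**32, dtype=np.int64))
    return wrapper


@repeat_until_valid
def toy_profile(x, seed=0):
    """Stand-in profile: returns NaN when seed is divisible by 7 (made-up rule)."""
    if seed % 7 == 0:
        return np.full_like(x, np.nan)
    rstate = np.random.RandomState(seed)
    return rstate.rand() * np.exp(-x ** 2)


x = np.linspace(0.0, 1.0, 5)
print(toy_profile(x, seed=63))  # 63 is a "bad" seed here, so a derived seed is used
```

Seeds that already produce a valid result pass through unchanged, which is what keeps such a workaround backward-compatible for them.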

coveralls commented 5 years ago

Pull Request Test Coverage Report for Build 159

18 of 25 (72.0%) changed or added relevant lines in 1 file are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.09%) to 60.271%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| kcsd/validation/csd_profile.py | 18 | 25 | 72.0% |
| Total: | 18 | 25 | 72.0% |

Totals
Change from base Build 154:	0.09%
Covered Lines:	1693
Relevant Lines:	2809

💛 - Coveralls

ccluri commented 5 years ago

Nice workaround! This should apply to 1D and 3D, small and large as well - right?

I only tested seeds between 1 and 100, though; maybe seeds elsewhere have the same issue?

abukaj commented 5 years ago

@ccluri Thanks! I have not checked the other functions, but I will do so in a moment. The corresponding commit should arrive today.

ccluri commented 5 years ago

I would suggest using something like `seed = rstate.randint(2**32 - seed)`, just so it's not the same seed for all the failed candidates. Over-engineering? Yes.

abukaj commented 5 years ago

@ccluri The `2**32` argument of `rstate.randint()` is the upper bound (exclusive) on the random int. It is also the exclusive upper bound on the seed parameter of the `np.random.RandomState` constructor.

The actual seed for `rstate.randint()` is given when the `rstate` object is created. The idea is to have a 'seedchain': `seed[n+1] = randint(2**32, seed=seed[n])`.
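To make the 'seedchain' concrete, here is a small sketch of how such a chain could be generated. The helper names `next_seed` and `seed_chain` are hypothetical and do not come from the module; `dtype=np.int64` is passed only so that the `2**32` bound fits on platforms where NumPy's default integer is 32-bit.

```python
import numpy as np


def next_seed(seed):
    """Return the successor of `seed` in the chain (hypothetical helper)."""
    # The seed enters through the RandomState constructor; 2**32 is only
    # the exclusive upper bound of the draw.
    rstate = np.random.RandomState(seed)
    return int(rstate.randint(2**32, dtype=np.int64))


def seed_chain(seed, length):
    """Return [seed[0], seed[1], ...] with seed[n+1] = next_seed(seed[n])."""
    chain = [seed]
    for _ in range(length - 1):
        chain.append(next_seed(chain[-1]))
    return chain


# The chain depends only on its first element, so it is reproducible.
print(seed_chain(63, 4))
```

Because every element is drawn from [0, 2**32), each one is itself a valid argument for `np.random.RandomState`, so the chain can be extended as far as needed.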

BTW: I am testing the csd_profile functions with the following code (I hope):

```python
import collections

import numpy as np
from kcsd import csd_profile as CSD

# Evaluation grids for the 1D, 2D and 3D profiles (10 points per axis).
x1d = np.mgrid[0.:1.:10j]
x2d = np.mgrid[0.:1.:10j, 0.:1.:10j]
x3d = np.mgrid[0.:1.:10j, 0.:1.:10j, 0.:1.:10j]

seed = 0
fails = []
fs = [CSD.gauss_1d_mono, CSD.gauss_1d_dipole,
      CSD.gauss_2d_large, CSD.gauss_2d_small,
      CSD.gauss_3d_small, CSD.gauss_3d_large]

# Runs until interrupted, printing a summary every 10000 seeds.
while True:
    if seed % 10000 == 0:
        print(f'Fails for seed < {seed}:')
        for name, n in collections.Counter(a for a, _ in fails).items():
            print(f' {name} : {n}')

    for f in fs:
        # Fall back from the 3D grid to 2D and 1D when the call raises ValueError.
        try:
            res = f(x3d, seed)
        except ValueError:
            try:
                res = f(x2d, seed)
            except ValueError:
                res = f(x1d, seed)
        # Record a failure when the result contains no finite value at all.
        if not np.isfinite(res).any():
            fails.append((f.__name__, seed))
            #print(f"{f.__name__}({seed})")

    seed += 1
```

So far (for 0 <= seed < 600000) only gauss_2d_large() has failed (1158 times).

ccluri commented 5 years ago

Okay, let me know when you are content with the update - I am happy to merge when you are.

abukaj commented 5 years ago

@ccluri I am happy with the gauss_2d_large() implementation in f59c426c56b2aa597d4db030e698cbfc473e6e62. Did you say something about over-engineering? ;) But let's wait until tomorrow for the results for the other, non-decorated functions. So far, seeds < 4840000 have caused no problems.

abukaj commented 5 years ago

@ccluri I have tested 46600000 seeds. The other functions returned no NaNs.

Neuroinflab / kCSD-python

Fix of `gauss_2d_large(seed=63)` -> `NaN` #76
