Closed: githubpsyche closed this issue 4 years ago
Start time 6/22 20:32
********** START OF MESSAGES **********
0 trials are dropped since they are regarded as outliers
********** END OF MESSAGES **********
Betas: 0, 1
EM Iteration: 0
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-5-2bc4507d9022> in <module>
1 # run tests only when is main file!
2 if __name__ == '__main__':
----> 3 test_importance_sampler()
<ipython-input-2-060729857a3d> in test_importance_sampler()
18
19 # generate output
---> 20 importance_sampler(python_data, python_analysis_settings)
21 eng.importance_sampler(matlab_data, matlab_analysis_settings, nargout=0)
22
<ipython-input-1-e544593ccecb> in importance_sampler(***failed resolving arguments***)
71 # This is where we use the chunking trick II
72 for ptl_idx in range(np.shape(ana_opt['ptl_chunk_idx'])[0]):
---> 73 output_struct = family_of_curves(ana_opt['curve_type'], 'compute_likelihood', ana_opt['net_effect_clusters'], ana_opt['ptl_chunk_idx']['ptl_idx, 3'],
74 param[ana_opt['ptl_chunk_idx'][ptl_idx, 1]:ana_opt['ptl_chunk_idx'][ptl_idx, 2], :], hold_betas, preprocessed_data,
75 ana_opt['distribution'], ana_opt['dist_specific_params'], ana_opt['data_matrix_columns'])
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
This was the result of accidentally enclosing the paired indices in quotation marks: ['ptl_idx, 3'] instead of [ptl_idx, 3].
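To pin down the failure mode, here is a minimal reproduction with a stand-in array (names are illustrative): indexing a plain ndarray with the quoted string raises exactly this IndexError, while the unquoted pair works.

```python
import numpy as np

arr = np.zeros((2, 3))  # stand-in for ana_opt['ptl_chunk_idx']
ptl_idx = 0

# The bug: quoting the index pair turns it into a single string key,
# which NumPy rejects for a plain (non-structured) array.
err_name = None
try:
    arr['ptl_idx, 3']
except IndexError as err:
    err_name = type(err).__name__

# What was meant: a row index and a column index.
value = arr[ptl_idx, 2]
print(err_name, value)
```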
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-10-2bc4507d9022> in <module>
1 # run tests only when is main file!
2 if __name__ == '__main__':
----> 3 test_importance_sampler()
<ipython-input-7-060729857a3d> in test_importance_sampler()
18
19 # generate output
---> 20 importance_sampler(python_data, python_analysis_settings)
21 eng.importance_sampler(matlab_data, matlab_analysis_settings, nargout=0)
22
<ipython-input-6-f2cf72f6c3a8> in importance_sampler(***failed resolving arguments***)
71 # This is where we use the chunking trick II
72 for ptl_idx in range(np.shape(ana_opt['ptl_chunk_idx'])[0]):
---> 73 output_struct = family_of_curves(ana_opt['curve_type'], 'compute_likelihood', ana_opt['net_effect_clusters'], ana_opt['ptl_chunk_idx'][ptl_idx, 3],
74 param[ana_opt['ptl_chunk_idx'][ptl_idx, 1]:ana_opt['ptl_chunk_idx'][ptl_idx, 2], :], hold_betas, preprocessed_data,
75 ana_opt['distribution'], ana_opt['dist_specific_params'], ana_opt['data_matrix_columns'])
IndexError: index 3 is out of bounds for axis 1 with size 3
ana_opt['ptl_chunk_idx'] has dimensions 2x3, so column index 3 is out of bounds: I haven't shifted the MATLAB 1-based indices down to Python's 0-based indexing. The whole codebase could be a minefield!
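A minimal sketch of the off-by-one fix, assuming ptl_chunk_idx stores something like [start, stop, size] per chunk (the real layout may differ): MATLAB's column 3 becomes NumPy's column 2.

```python
import numpy as np

# Hypothetical 2x3 chunk table, one row per chunk.
# The actual contents of ana_opt['ptl_chunk_idx'] may differ.
ptl_chunk_idx = np.array([[0, 50000, 50000],
                          [50000, 100000, 50000]])

# MATLAB column k is NumPy column k - 1, so whatever MATLAB code
# read from column 3 lives at index 2 here.
chunk_sizes = ptl_chunk_idx[:, 2]
print(chunk_sizes.tolist())
```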
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-20-2bc4507d9022> in <module>
1 # run tests only when is main file!
2 if __name__ == '__main__':
----> 3 test_importance_sampler()
<ipython-input-17-060729857a3d> in test_importance_sampler()
18
19 # generate output
---> 20 importance_sampler(python_data, python_analysis_settings)
21 eng.importance_sampler(matlab_data, matlab_analysis_settings, nargout=0)
22
<ipython-input-16-dc0402fd2a11> in importance_sampler(***failed resolving arguments***)
74 for ptl_idx in range(np.shape(ana_opt['ptl_chunk_idx'])[0]):
75 output_struct = family_of_curves(ana_opt['curve_type'], 'compute_likelihood', ana_opt['net_effect_clusters'], ana_opt['ptl_chunk_idx'][ptl_idx, 2],
---> 76 param[ana_opt['ptl_chunk_idx'][ptl_idx, 0]:ana_opt['ptl_chunk_idx'][ptl_idx, 1], :], hold_betas, preprocessed_data,
77 ana_opt['distribution'], ana_opt['dist_specific_params'], ana_opt['data_matrix_columns'])
78
TypeError: slice indices must be integers or None or have an __index__ method
This is definitely an issue with how I'm slicing param. Is it a NumPy array? Yes. The trouble is that my slice indices are floats instead of integers; I can easily convert them.
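A minimal sketch of that conversion (toy shapes; the real param matrix is far larger): cast the float chunk bounds to int before slicing.

```python
import numpy as np

# Chunk bounds arrive as MATLAB-style doubles, which NumPy
# refuses to accept as slice indices.
start, stop = np.float64(0.0), np.float64(3.0)
param = np.arange(12.0).reshape(6, 2)  # toy stand-in for param

# Cast to int before slicing (or store ptl_chunk_idx with dtype=int upfront).
chunk = param[int(start):int(stop), :]
print(chunk.shape)
```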
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-45-2bc4507d9022> in <module>
1 # run tests only when is main file!
2 if __name__ == '__main__':
----> 3 test_importance_sampler()
<ipython-input-42-060729857a3d> in test_importance_sampler()
18
19 # generate output
---> 20 importance_sampler(python_data, python_analysis_settings)
21 eng.importance_sampler(matlab_data, matlab_analysis_settings, nargout=0)
22
<ipython-input-41-2046446b7791> in importance_sampler(***failed resolving arguments***)
49 prev_iter_curve_param = np.full((ana_opt['particles'], family_of_curves(ana_opt['curve_type'], 'get_nParams')), np.nan) # Matrix to hold the previous iteration curve parameters
50 w = np.full((ana_opt['particles']), np.nan) # Vector to hold normalized weights
---> 51 net_effects = np.full((len(ana_opt['net_effect_clusters']), ana_opt['particles']), np.nan) # Matrix to hold the predictor variables (taking net effects if relevant) over all particles
52 dependent_var = [] # This vector cannot be initialized in advance since we don't know the length of this vector (dropping outliers)
53
~\Miniconda3\lib\site-packages\numpy\core\numeric.py in full(shape, fill_value, dtype, order)
312 if dtype is None:
313 dtype = array(fill_value).dtype
--> 314 a = empty(shape, dtype, order)
315 multiarray.copyto(a, fill_value, casting='unsafe')
316 return a
MemoryError: Unable to allocate 1.36 GiB for an array with shape (1824, 100000) and data type float64
What a pain. Can the shadow handle it? Apparently it can, though that deserves some documentation. First time concretely justifying having the shadow!
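For the record, the arithmetic behind the 1.36 GiB figure, plus one possible mitigation (an assumption on my part, not something the codebase does): dropping to float32 halves the footprint.

```python
import numpy as np

# Footprint of the failing allocation: 1824 x 100000 float64s.
n_clusters, n_particles = 1824, 100000
gib = n_clusters * n_particles * 8 / 2**30
print(round(gib, 2))  # matches the 1.36 GiB in the MemoryError

# Possible mitigation: float32 halves the memory, at some precision cost.
net_effects = np.full((4, 5), np.nan, dtype=np.float32)  # tiny demo shape
print(net_effects.nbytes)
```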
Start time 6/26 1:42
********** START OF MESSAGES **********
0 trials are dropped since they are regarded as outliers
********** END OF MESSAGES **********
Betas: 0, 1
EM Iteration: 0
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-51-2bc4507d9022> in <module>
1 # run tests only when is main file!
2 if __name__ == '__main__':
----> 3 test_importance_sampler()
<ipython-input-48-060729857a3d> in test_importance_sampler()
18
19 # generate output
---> 20 importance_sampler(python_data, python_analysis_settings)
21 eng.importance_sampler(matlab_data, matlab_analysis_settings, nargout=0)
22
<ipython-input-47-1fa42ebdda05> in importance_sampler(***failed resolving arguments***)
97 optimizing_function = family_of_distributions(ana_opt['distribution'], 'fminunc_both_betas', w, net_effects, dependent_var, ana_opt['dist_specific_params'])
98
---> 99 result = optimize.minimize(optimizing_function, hold_betas, options={'disp':True})
100 hold_betas = result.x
101 fval = result.fun
~\AppData\Roaming\Python\Python37\site-packages\scipy\optimize\_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
602 return _minimize_cg(fun, x0, args, jac, callback, **options)
603 elif meth == 'bfgs':
--> 604 return _minimize_bfgs(fun, x0, args, jac, callback, **options)
605 elif meth == 'newton-cg':
606 return _minimize_newtoncg(fun, x0, args, jac, hess, hessp, callback,
~\AppData\Roaming\Python\Python37\site-packages\scipy\optimize\optimize.py in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, **unknown_options)
1007 else:
1008 grad_calls, myfprime = wrap_function(fprime, args)
-> 1009 gfk = myfprime(x0)
1010 k = 0
1011 N = len(x0)
~\AppData\Roaming\Python\Python37\site-packages\scipy\optimize\optimize.py in function_wrapper(*wrapper_args)
325 def function_wrapper(*wrapper_args):
326 ncalls[0] += 1
--> 327 return function(*(wrapper_args + args))
328
329 return ncalls, function_wrapper
~\AppData\Roaming\Python\Python37\site-packages\scipy\optimize\optimize.py in approx_fprime(xk, f, epsilon, *args)
763
764 """
--> 765 return _approx_fprime_helper(xk, f, epsilon, args=args)
766
767
~\AppData\Roaming\Python\Python37\site-packages\scipy\optimize\optimize.py in _approx_fprime_helper(xk, f, epsilon, args, f0)
695 ei[k] = 1.0
696 d = epsilon * ei
--> 697 df = (f(*((xk + d,) + args)) - f0) / d[k]
698 if not np.isscalar(df):
699 try:
TypeError: unsupported operand type(s) for -: 'tuple' and 'tuple'
Incredibly complex error!
First need to confirm that input functions are solid. Next must determine if optimization function is equivalent. Then the bug should be obvious.
It does seem equivalent for this case. Apparently the default minimization algorithm for fminunc is quasi-Newton. There's no quasi-Newton option for Python's optimize.minimize, but based on this page it looks like it corresponds to BFGS.
This Wikipedia article explicitly links quasi-Newton to a few implementations:
the SR1 formula (for "symmetric rank-one"), the BHHH method, the widespread BFGS method (suggested independently by Broyden, Fletcher, Goldfarb, and Shanno, in 1970), and its low-memory extension L-BFGS.
I probably want minimize(method='L-BFGS-B'), or plain BFGS, which is in fact SciPy's default for unconstrained problems (hence _minimize_bfgs in the traceback).
A gradient is in fact specified: it's the second output of fminunc_bernoulli_both. That also explains the TypeError above: minimize received a (value, gradient) tuple from the objective and tried to subtract tuples. SciPy can accept this shared-output style if told about it via jac=True; otherwise I need a separate function to generate the gradient.
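A minimal sketch of the jac=True route, using a toy quadratic in place of the real fminunc_bernoulli_both objective: the callable returns a (value, gradient) pair, and minimize is told so.

```python
import numpy as np
from scipy import optimize

# Toy objective standing in for fminunc_bernoulli_both: returns the
# function value and its gradient as one fminunc-style tuple.
def objective_with_grad(betas):
    value = float(np.sum(betas ** 2))
    grad = 2.0 * betas
    return value, grad

# jac=True tells SciPy the callable returns (value, gradient),
# so no separate gradient function is needed.
result = optimize.minimize(objective_with_grad, np.array([0.5, -1.0]), jac=True)
print(result.x)
```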
Going to close this and start fresh with #29
Must develop and pass a test for each option.