ctargon closed this issue 7 years ago
Some notes on what Colin and I did today:
- […] `icasig` in the MATLAB code because the C code didn't implement the whole formula; however, we did not see an increase in accuracy in the test run that we used (1-5 removed from each class, 27.5% in both cases).
- […] `remmean` twice: once in `run_ica`, in which it removes the mean image, and once again in `fastica`, in which it removes the mean "row" rather than the mean image because `mixedsig = X'`.
Some things that I plan to do once I have time:
- […] `ica.c` to the matrix library
- […] `temp_PCA` with the main PCA function

Once I have done my clean-up, we should run cross-validation so that we can see how the accuracy of the C code changes as the training set shrinks.
Meanwhile, it might be useful for some people to play around with the parameters in the MATLAB ICA code that were not included in the C code and document their findings so that we know which parameters are worth keeping.
Okay, after a few debug sessions throughout last night and today, I was able to fix the issue we had with NaNs in our matrices. A few points:
- The `PCA_alt` function in `ica.c` was returning an extra eigenvalue with value 0, because the N-by-N symmetric matrix it decomposes has at most N-1 eigenvalues greater than 0 (as expected when the matrix is built from N mean-subtracted samples, which gives it rank at most N-1). The eigenvalues are later placed in a diagonal matrix and inverted, which introduced some very large numbers. I changed `PCA_alt` to remove the extra eigenvalue, but I may modify `m_eigen` in the future to automatically remove these zero columns. Anyway, this change got rid of the NaNs.
- In `fpica`, the inner loop actually breaks when the `w` vector converges, but this `break` statement was never added to the C code because the MATLAB indentation is so bad. Anyway, I added the `break` and fixed up the convergence code, so our C code is now considerably faster.

In summary, with my changes the ICA C code seems to produce results very similar to those of the MATLAB code, including intermediate values. However, even with the `break` statement, the C code is still significantly slower. I would assume that many of the operations in the MATLAB code are multi-threaded, but there may be other differences. We'll need to add timing information to the ICA code so that we can identify any bottlenecks.
Converting the matrix library to single precision seems to have improved the performance of ICA somewhat.
Accuracy is on par with MATLAB.
The ICA C implementation is compiling, running, and producing accuracies between ~25% and ~70%. There are plenty of optimizations that could be incorporated, and perhaps a line or two from the MATLAB code is still missing. I could use another pair of eyes to go through it and see if I missed anything.