Closed ChristineFarrugia closed 3 years ago
Hi Lucas,
This has been a long time coming - apologies for that. Here is a list of the changes/additions I made to the code:
Cheers, Christine
On Tue, 8 Jun 2021 at 16:35, Lucas Campos @.***> wrote:
Just checked the code, and it looks good! Would be nice to have a proper summary of why the changes happened, though. In my understanding they were
- Creating the get_fiedler_eigenpair function
- Some better error handling in the parallel computations
- Numerous typo and clarification fixes
Altogether, I really like this new code! I suppose it has been tested already, right?
I have a problem/question with one specific part, though. In lines 71-75 of numerics.py, you changed the call type of the function, and now the system always tries to solve the general eigenproblem. It is possible that we will see a performance decrease with that, as I would expect LAPACK to have distinct algorithms for the usual and for the general eigenproblem. I am not sure if this is the case, and I would need to take a look at the code, which I will do soon.
Other than that, great work!
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/VBIndex/py_vb_toolbox/pull/35#issuecomment-856823286, or unsubscribe https://github.com/notifications/unsubscribe-auth/AT3ARURD5POE5QICOKIOD3DTRYTEHANCNFSM46JN6LPQ .
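For anyone reading along, Lucas's performance concern can be illustrated with a toy comparison. This is a sketch only: the matrix and size here are made up, not taken from `numerics.py`. SciPy dispatches to different LAPACK driver families depending on whether a second matrix is passed to `spl.eigh` (the `syev*` family for the standard problem, the `sygv*` family for the generalized one).

```python
import numpy as np
import scipy.linalg as spl

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n))
A = (A + A.T) / 2               # symmetric test matrix
B = np.eye(n)                   # trivial metric, purely for illustration

w_std, v_std = spl.eigh(A)      # standard problem A v = w v
w_gen, v_gen = spl.eigh(A, B)   # generalized problem A v = w B v

# With B = I the spectra coincide, but the generalized path does extra
# work internally (a Cholesky reduction of B) - hence the possible slowdown.
print(np.allclose(w_std, w_gen))   # prints True
```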
@LucasCampos - I am happy with this. Once you review I will merge.
Here is a well-formatted version of the comments above:
This has been a long time coming - apologies for that. Here is a list of the changes/additions I made to the code:

1. Introduced the `get_fiedler_eigenpair` function. This function computes the full set of eigenvalues and eigenvectors, but only normalizes the Fiedler vector, making computation time much shorter.
2. Modified the `solve_general_eigenproblem` function to make it more compact. All eigenvectors are normalized with respect to the metric induced by `D`, BUT normalization only has to be carried out explicitly if they have been computed by `spl.eig` (and even then, only in cases when `D` is not `None`). I included the line `eigenvalues = eigenvalues[sort_eigen]`. Eigenvalue normalization now takes place in `numerics.py` (this is necessary, since `get_fiedler_eigenpair` only passes the second smallest eigenvalue to `vb_index.py`).
3. Fixed a few typos.
4. Made changes to some of the comments to improve clarity. I briefly explained why the neighbourhood matrix is reshaped [`numerics.py`, lines 263/4 in my amended version].
5. Introduced exception handling for cases when the time series are too short. The code no longer reads `neighborhood_scaled = neighborhood` when `neighborhood.shape[1] < 3` (lines 215/6 in original code); an error is raised instead.
6. Implemented your suggestion regarding the way arguments are passed to the `spl.eigh` function. More specifically, the code no longer distinguishes between the cases with `D` being `None` or not; `D` is always passed to `spl.eigh` (or `spl.eig`), and defaults to `D = None`.
7. Modified `spectral_reorder` so that it returns the Fiedler eigenpair in place of the full set of eigenvalues and eigenvectors. `spectral_reorder` now makes use of `get_fiedler_eigenpair`, rather than `solve_general_eigenproblem`. In the case of the normalized Laplacian, this means that the code reads `eigenvector = spl.solve(T, eigenvector)` and not `eigenvectors = spl.solve(T, eigenvectors)`.
8. Normalized the Fiedler vector returned by passing the Random Walk Laplacian to `get_fiedler_eigenpair`.
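Putting the main items together, the new helper might look roughly like the sketch below. The name and signature follow the descriptions above, but the body is a guess, not the actual `py_vb_toolbox` code; the toy Laplacian at the end is made up for illustration.

```python
import numpy as np
import scipy.linalg as spl

def get_fiedler_eigenpair(Q, D=None):
    """Sketch: solve the (possibly generalized) symmetric eigenproblem
    Q v = w D v and return only the Fiedler pair - the second-smallest
    eigenvalue and its eigenvector, normalized w.r.t. the metric D."""
    # D is passed straight through; spl.eigh treats b=None as the
    # standard problem, so no if/else on D is needed.
    eigenvalues, eigenvectors = spl.eigh(Q, D)

    # Sort ascending (mirrors `eigenvalues = eigenvalues[sort_eigen]`).
    sort_eigen = np.argsort(eigenvalues)
    eigenvalues = eigenvalues[sort_eigen]
    eigenvectors = eigenvectors[:, sort_eigen]

    # Only the Fiedler vector is normalized, w.r.t. the metric induced
    # by D when D is given (identity otherwise).
    fiedler_value = eigenvalues[1]
    v = eigenvectors[:, 1]
    metric = np.eye(len(v)) if D is None else D
    v = v / np.sqrt(v @ metric @ v)
    return fiedler_value, v

# Toy check: Laplacian of a 4-node path graph; its Fiedler value is 2 - sqrt(2).
L = np.array([[ 1., -1.,  0.,  0.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [ 0.,  0., -1.,  1.]])
val, vec = get_fiedler_eigenpair(L)
print(abs(val - (2 - np.sqrt(2))) < 1e-8)   # prints True
```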
Hey!
I really like the changes. I am ready to accept it, as long as you tell me that the changes in the 7th item were tested and did not change the results. Could you confirm that so I can green-light the request?
Hi Lucas,
I'd tested this on two data sets, but I've spoken to Claude and we've agreed to run a few more tests just to be sure. Will let you know once I have the results.
Christine
Any news here?
Getting there, Lucas. Sorry for the delay... I have loads of tests to run currently!
Hi Claude, Lucas,
I am sending some images I obtained when testing the Mutiprocessing_Amendments pull request. In all cases, I ran the exact same command twice, once for each version of the toolbox, then subtracted the two outputs using `wb_command -metric-math` and generated an image of the result. In my opinion it is now quite safe to go ahead with the merge, but I'd be happy to run any more tests you think necessary.
The title of each image gives the specs of the data used to generate it (let me know if anything is not clear). Surface resolution is 32k in all cases.

- 101309.L.REST1_ribbon_difference_fb.png https://drive.google.com/file/d/1zZHI1nNZoulYQdOqwl4b4F7nPvZS1LOU/view?usp=drive_web
- 102311.R.REST1_enclosing_difference_searchlight... https://drive.google.com/file/d/1g61NUfioe7pEhGtL7D7gPSoC4TxrKSIZ/view?usp=drive_web
- 103111.R.REST2_cubic_difference_searchlight.png https://drive.google.com/file/d/18n054ZNJsu3gGNkCMbD4cHbfgJmNtBYa/view?usp=drive_web
- 106016.L.REST2_trilinear_difference_fb.png https://drive.google.com/file/d/1JXOa-ffJfpk5m2TSyDvawg19LpZQpi3o/view?usp=drive_web
- 124220.L.TASK_LR_difference_hybrid.png https://drive.google.com/file/d/18RvDbCQkZOt-l8h0hjq6eehQFiroanAN/view?usp=drive_web
- 150625.R.REST2_RL_difference_hybrid.png https://drive.google.com/file/d/146tLDPxOj6-3JfGKbiT1IbRiN9A3eIg2/view?usp=drive_web

Have a good weekend!
Christine
The two "full brain" differences seem to be non-zero with a distinct pattern. I think this may be numerical error, but to be sure could you send images of the two inputs prior to the difference calculation?
Hi Claude,
I will have to send them on Monday as the workstation is switched off this weekend due to the power cut.
Christine
The individual results look good to me. I am going to merge.
Fine by me as well.
@all-contributors please add @ChristineFarrugia for code
@claudebajada
I've put up a pull request to add @ChristineFarrugia! :tada:
Now that I think about it, though, I don't think we need the ifs (or the Y matrix) at all. As we have, by default, that `D=None`, if we just pass `D` to the function it should automatically go through the best path. So, with that in mind, could you test whether the following works?
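The snippet Lucas refers to is not preserved in the thread. A guess at what he means, assuming `spl` is `scipy.linalg`, is to drop the branch entirely and let SciPy pick the LAPACK path (the function name here is hypothetical):

```python
import numpy as np
import scipy.linalg as spl

def solve_eigenproblem(Q, D=None):
    # No if/else on D and no auxiliary Y matrix: spl.eigh(Q, None) is the
    # standard problem, spl.eigh(Q, D) the generalized one.
    return spl.eigh(Q, D)

# Quick check that both paths agree when D is the identity.
Q = np.array([[2.0, -1.0], [-1.0, 2.0]])
w_none, _ = solve_eigenproblem(Q)
w_id, _ = solve_eigenproblem(Q, np.eye(2))
print(np.allclose(w_none, w_id))   # prints True
```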