ComputationalCryoEM / ASPIRE-Python

Algorithms for Single Particle Reconstruction
http://spr.math.princeton.edu
GNU General Public License v3.0
46 stars 21 forks source link

Use class structures for non uniform fft calls #166

Closed garrettwrong closed 3 years ago

garrettwrong commented 4 years ago

We will not see this error in the default checkout when running on Tiger. However, because we are targeting a release for other people to use, including other developers, we need to take care about injecting dependencies deep in the code that are not consistent with high level structures, configuration and goals.

For instance, if user does not have one of our three nufft packages, it should not totally break the ability to even run our unit tests, no matter how strongly we feel about a particular package being "the one", or working for me. That is not the nature of usable software, which is what we are trying to work towards. This is particularly true when effort has been spent constructing working solutions to a problem. This point is also supported by the fact that the package it breaks on is not totally portable, and so we can not at this time assume it will work for users out of the box. This is why there are options and the class abstraction...

While I made this point during review process to no avail, I was very actually saddened to see such an error already within only two days.

For now I will workaround, but this change we smashed in the day before an event only adds to the complication of supporting users. I hope we can learn from it.

(aspire_scratch) ➜  ASPIRE-Python git:(hackathon_2020) pytest tests |& tee -a ../tests.log
============================= test session starts ==============================
platform linux -- Python 3.6.10, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
rootdir: /gpfs/wolf/gen137/scratch/gwright/ASPIRE-Python, inifile: pytest.ini
plugins: cov-2.9.0
collected 181 items / 1 error / 1 deselected / 179 selected

==================================== ERRORS ====================================
_________________ ERROR collecting tests/test_PolarBasis2D.py __________________
ImportError while importing test module '/gpfs/wolf/gen137/scratch/gwright/ASPIRE-Python/tests/test_PolarBasis2D.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/test_PolarBasis2D.py:4: in <module>
    from aspire.basis.polar_2d import PolarBasis2D
src/aspire/basis/polar_2d.py:3: in <module>
    import finufftpy
E   ModuleNotFoundError: No module named 'finufftpy'
=========================== short test summary info ============================
ERROR tests/test_PolarBasis2D.py
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
======================== 1 deselected, 1 error in 4.24s ========================
junchaoxia commented 4 years ago

This finufftpy module used to be installed automatically. We should have more details on installation if we decide to use it in the other way. If the new interface is the same thing I can fit it quickly.

garrettwrong commented 4 years ago

Yes, I agree, and that is still the case (installed automatically for master/develop). I think that is why the issue displayed itself in the hackathon branch. To be clear, I am sure you did not mean to introduce an incompatible change into the code, and force merged it in anyway after I pointed it out. I expect everything looked good to you; this is where our different experience and perspectives come into play, and why it was concerning to me. I am merely reporting this was the effect, mainly for consideration in the future, and also so this is in the fix list. I'm not worried it is complicated (we agree it wasn't/isn't). Its all good.

I guess one point, is that, I think this should make certain tests fail or skip, not break test collection wholesale. For a user they would just feel everything is broken (despite that not being the case). If we used the structure, I think just certain tests would fail. They might actually pass if another backend was available (python or cuda)...

Another point is that, such install details, very detailed, did exist for this particular process, and they were actually invalidated by this change. The entire install process of the hackathon branch (which we tested was working together last week) went fine up to attempting to run the unit test suite due to the recent merge. While this would not have impacted you, or my initial install, since we were already dually installed, it could effect others. Luckily I think everyone is installed at least once... I am just going through again on scratch this morning in case I need to help our team...

junchaoxia commented 4 years ago

The original Polar2D class was merged after the approving of review and passing all unit tests. I spend some time to downgrade the many framework of finufft to be consistent with FINufftPlan class.

garrettwrong commented 4 years ago

Yes, I see that, I am looking now, thanks. We should have many optimization argument working for all Plans very soon after the Hackathon. This will help to convert them all at once, the same way. This code looks better to me, I will check it out and test it first thing in the morning.

I am a little confused why the unit test reference had to change, but I left a comment there, easier talk about that in place.

garrettwrong commented 4 years ago

I know there are a few other places where this happened (some unrelated), so I will leave the issue open (and I am happy to fix the rest of them). If we can ensure any new code comes over using the class structures, that will be a big help.

garrettwrong commented 4 years ago

Not sure if the code is used atm, but the remaining cases of this (outside the ignored em_classavg folder) is src/aspire/aspire/abinitio.py. This should be easy to cleanup.

garrettwrong commented 3 years ago

Hrmm, I tried working this issue to close it (after 231 where I covered the other cases) but I'm having trouble testing it... The following command (constructed following the help menu) yields an error.

python -m aspire abinitio tutorials/data/clean70SRibosome_vol_65p.mrc
Traceback (most recent call last):
  File "/home/garrett/Dropbox/PU/aspire/ASPIRE-Python.col_major_1/src/aspire/exceptions.py", line 60, in handle_exception
    raise exc_value
  File "/home/garrett/anaconda3/envs/aspire_col_major_1/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/garrett/anaconda3/envs/aspire_col_major_1/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/garrett/Dropbox/PU/aspire/ASPIRE-Python.col_major_1/src/aspire/__main__.py", line 20, in <module>
    main.main(prog_name='aspire')
  File "/home/garrett/anaconda3/envs/aspire_col_major_1/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/garrett/anaconda3/envs/aspire_col_major_1/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/garrett/anaconda3/envs/aspire_col_major_1/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/garrett/anaconda3/envs/aspire_col_major_1/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/garrett/Dropbox/PU/aspire/ASPIRE-Python.col_major_1/src/aspire/commands/aspire.py", line 47, in abinitio_cmd
    output_stack = Abinitio.cryo_abinitio_c1_worker(stack)
  File "/home/garrett/Dropbox/PU/aspire/ASPIRE-Python.col_major_1/src/aspire/aspire/abinitio.py", line 89, in cryo_abinitio_c1_worker
    s = cryo_syncmatrix_vote(clstack, AbinitioConfig.n_theta)
  File "/home/garrett/Dropbox/PU/aspire/ASPIRE-Python.col_major_1/src/aspire/aspire/abinitio.py", line 792, in cryo_syncmatrix_vote
    stmp[:, :, j] = cryo_syncmatrix_ij_vote(clmatrix, i, j, np.arange(k), l, rots_ref, is_perturbed)
  File "/home/garrett/Dropbox/PU/aspire/ASPIRE-Python.col_major_1/src/aspire/aspire/abinitio.py", line 805, in cryo_syncmatrix_ij_vote
    good_k, _, _ = cryo_vote_ij(clmatrix, l, i, j, k, rots_ref, is_perturbed)
  File "/home/garrett/Dropbox/PU/aspire/ASPIRE-Python.col_major_1/src/aspire/aspire/abinitio.py", line 954, in cryo_vote_ij
    return good_k.astype('int'), peakh, alpha
AttributeError: 'list' object has no attribute 'astype'

Not sure if I am calling it wrong, or it was brought in broken.... (tried develop and master).

@janden , @junchaoxia would you know a way to invoke this abinitio code for testing purposes? Thanks

junchaoxia commented 3 years ago

As far as I know, all modules that were merged into new framework should use the class structures of Nufft. Those modules include preprocess steps, basis classes of image expansion, Cov2D denoising, orientation estimation and Co3D analysis. The many framework of passing data was added recently and it might be not used if PR# 231 did not go through all of them. The old Python code under aspire/aspire is not using the class structures but I think we probably will not change them since they will be removed in future. The test you tried to run is building an ab initio 3d volume from 2D classification results using old Python code. The input has to be 2D images instead of 3D volume. The whole procedure from Yoel is listed as below (using removed old Python repo):

  1. conda env create -n aspire -f /path/to/environment.yml Create an environment to run ASIPRE

  2. conda activate aspire Activate the environment

  3. .cd

  4. head -n 10000 /data/yoelsh/datasets/10028/data/Particles/shiny_2sets.star > particles.star This command needs to be adapted for a demo. I downloaded the EMPIAR-10028 dataset, and kept only the first 10,000 particles. We can create a "package" containing the required particles, as downloading the entire data set is not practical as part of a demo.

  5. ./aspire.py preprocess particles.star Preprocess the data

  6. ./aspire.py classify preprocessed_stack.mrcs Generate class averages

  7. ./aspire.py abinitio --num_images 500 ordered_averages.mrcs Generate abinitio model.

To rerun them using the aspire/aspire code of new repo, we need to replace ./aspire.py with python -m aspire. I went through some of them when I worked on the orientation estimation codes as below:

python -m aspire classify /scratch/gpfs/junchaox/cryoem-data/10028L65/whitened.mrcs python -m aspire abinitio whitened_classified_subset.mrcs

garrettwrong commented 3 years ago

Yes, recent code has been converted to use the classes, including the many transform, thanks, that is correct, I'm just trying to close out the last remaining hardcoded call to finufftpy (which is going away once their new wheels are pushed).

Thanks for the details. I will try to reproduce something similar. I was hoping one of the many datafiles we ship with our code might be useful for a one line sanity test, but I guess its okay jump through some hoops to find out if that code was actually working before I started with it. Will play it by ear I guess.

garrettwrong commented 3 years ago

Following discussion in our team meeting, holding off on this. (Probably will be deprecated).

garrettwrong commented 3 years ago

Code in question has been removed by #283 . Closing