SBC-Utrecht / PyTom

PyTom software for cryo-tomogram processing
GNU General Public License v2.0

template matching result filtering #62

Open alncat opened 3 months ago

alncat commented 3 months ago

Dear PyTom developers:

Thank you very much for developing such an amazing tomography processing framework! The template matching results from PyTom are very helpful. However, it is necessary to filter the template matching results, since they may contain many unrelated particles. I have recently developed a tool, OPUS-TOMO (link: https://github.com/alncat/opusTomo), which can perform this task. I was wondering if we can team up for a tutorial about filtering the template matching results from PyTom using OPUS-TOMO.

Best Regards, Zhenwei

sroet commented 3 months ago

Hey @alncat (or do you prefer Zhenwei?)

> However, it is necessary to filter the template matching results, since they may contain many unrelated particles.

Is this still the case for our improved and independent version, https://github.com/SBC-Utrecht/pytom-match-pick (specifically with the tophat filter)? Or was there some other reason you're still using the base PyTom version?

> I was wondering if we can team up for a tutorial about filtering the template matching results from PyTom using OPUS-TOMO

I will discuss internally if this is something we are interested in, but in any case, it should probably use pytom-match-pick instead of the regular PyTom.

Some questions that came up immediately when walking through the code/paper:

1) Where do you propose we host this tutorial?
2) What is the maintenance plan for OPUS-TOMO? (I assume, like with most academic software, that you're not permanent staff of your research group; please let me know if that assumption is incorrect.)
3) How much VRAM is needed to run OPUS-TOMO? (Considering you propose a V100 (16 GB of VRAM), while we normally use cards with about 12 GB of VRAM.)
4) How long did it take for OPUS-TOMO to run on your example systems? (You mention ~1 hour per epoch, but not how many epochs were needed to converge.)
5) Any particular reason why OPUS-TOMO uses such old dependencies? (relion-5 is in open beta, while you still use 3.0.8; the python 3.9 in the environments is outside of the SPEC0 window.)
6) Are you planning to do versioned releases?
6b) Are you planning on putting those releases on PyPI or conda-forge?
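
For readers unfamiliar with the SPEC0 remark in question 5: SPEC 0 recommends dropping support for a Python version 3 years after its initial release. The following is a minimal sketch of that check, not part of either package. The helper name `in_spec0_window` and the evaluation date are assumptions chosen for illustration, since the thread itself carries only relative dates.

```python
from datetime import date

# Initial release dates of CPython minor versions.
PYTHON_RELEASES = {
    "3.9": date(2020, 10, 5),
    "3.10": date(2021, 10, 4),
    "3.11": date(2022, 10, 24),
    "3.12": date(2023, 10, 2),
}

# SPEC 0 recommendation: drop a Python version 3 years after its release.
SUPPORT_YEARS = 3


def in_spec0_window(version: str, today: date) -> bool:
    """Return True if `version` is still inside the SPEC 0 support window.

    `in_spec0_window` is a hypothetical helper written for this example.
    """
    released = PYTHON_RELEASES[version]
    drop_date = released.replace(year=released.year + SUPPORT_YEARS)
    return released <= today < drop_date


# Hypothetical evaluation date, picked only to illustrate the check.
check_date = date(2024, 8, 1)
for version in PYTHON_RELEASES:
    print(version, in_spec0_window(version, check_date))
```

With any evaluation date after 5 October 2023, Python 3.9 falls outside the window, which is the point being raised.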

alncat commented 3 months ago

Thank you very much for providing this detailed checklist! I will draft answers and get back to you shortly!

sroet commented 3 months ago

Hey @alncat ,

> I will discuss internally if this is something we are interested in

In principle, we are interested in teaming up for a tutorial. Please do answer the questions above so we can estimate the time required from our side.

alncat commented 3 months ago

Hi @sroet ,

  1. where do you propose we host this tutorial?
     I think it would be better to host this tutorial under PyTom wiki pages since this function can be an enhancement to the PyTom template matching.
  2. what is the maintenance plan for OPUS-TOMO? (I assume, like with most academic software, that you're not permanent staff of your research group, please let me know if that assumption is incorrect)
     I will maintain this software on GitHub. The maintenance will not depend on my position :-).
  3. How much VRAM is needed to run OPUS-TOMO? (Considering you propose V100 (16 GB of VRAM), while we normally use cards with about 12 GB of VRAM)
     OPUS-TOMO usually takes 8-20 GB of VRAM, depending on the size of the input. You can use a smaller batch size, input size, and architecture if there is not enough VRAM.
  4. How long did it take for OPUS-TOMO to run on your example systems? (You mention ~1 hour per epoch, but not how many epochs were needed to converge)
     The runtime typically depends on the size of the input. For full-size 128^3 subtomograms, the training time for 26k subtomograms is about 1 hour per epoch (actually, the speed is mainly limited by I/O, since the subtomograms are not stored in memory). However, if you choose to downsample the subtomograms by a factor of 0.75, the training time is cut in half. Training usually converges within 30-50 epochs, but you can obtain decent results within 20 epochs.
  5. Any particular reason why OPUS-TOMO uses such old dependencies? (relion-5 is in open-beta, while you still use 3.0.8, python 3.9 in the environments is outside of the SPEC0 window)
     Actually, OPUS-TOMO doesn't depend on relion (I mainly used relion to extract subtomograms from tomograms). It works with pyTOM pretty well. For each subtomogram, it only needs the orientations from template matching, or the orientations plus translations from subtomogram averaging. The legacy relion 3.0.8 can do subtomogram averaging using the 3D subtomograms directly. In contrast, in Relion-5 you need to convert subtomograms to image series, which is not straightforward (sadly, I never succeeded in using Relion-5 :-(). I did subtomogram averaging after template matching in my preprint, but I found out that OPUS-TOMO can do filtering using the template matching result from pyTOM directly! I also found out that the subtomogram averaging in pyTOM works pretty well. I chose python 3.9 mainly because my development environment runs on a relatively old Red Hat system.
  6. Are you planning to do versioned releases? 6b) Are you planning on putting those releases on PyPI or conda-forge?
     I plan to do versioned releases later. I am not familiar with PyPI. Since OPUS-TOMO depends on pytorch, I think it might be hard to maintain the dependencies.
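
As a back-of-the-envelope illustration of answer 4, the sketch below multiplies the figures quoted there (about 1 hour per epoch for 26k full-size subtomograms, roughly half that when downsampling by 0.75, 20 epochs for decent results, 30-50 epochs to converge). The function `training_hours` is a hypothetical helper, not part of OPUS-TOMO, and it assumes the per-epoch time stays constant across epochs; actual runtimes depend on I/O and hardware.

```python
def training_hours(epochs: int, hours_per_epoch: float = 1.0, speedup: float = 1.0) -> float:
    """Estimate total training time as epochs * time per epoch.

    `speedup` is the factor by which an epoch gets faster
    (2.0 for the 0.75 downsampling case quoted in the thread).
    """
    return epochs * hours_per_epoch / speedup


# Epoch counts quoted in the thread: decent results, lower and upper convergence bound.
epoch_counts = (20, 30, 50)

full_size = [training_hours(e) for e in epoch_counts]
downsampled = [training_hours(e, speedup=2.0) for e in epoch_counts]

for e, full, down in zip(epoch_counts, full_size, downsampled):
    print(f"{e} epochs: ~{full:.0f} h full size, ~{down:.0f} h downsampled by 0.75")
```

Under these assumptions, a converged run on the quoted dataset would take on the order of 30-50 hours at full size, or 15-25 hours with downsampling.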

sroet commented 3 months ago

> I think it would be better to host this tutorial under PyTom wiki pages since this function can be an enhancement to the PyTom template matching.

That will make the requirements significantly stricter, and I don't think the code would pass our requirements in its current form.

Would you be open to hosting it on your own wiki and us linking to it under https://github.com/SBC-Utrecht/PyTom/wiki/Interfacing-Other-Packages? Or maybe another place you had in mind?

Also, before we do any linking:

1) The tutorial will have to be written.
2) There will have to be some versioned release (even if it is just a 0.1 version).
3) We will have to test that version to see if it works (well enough).
3a) We will probably compare it (internally) against a pytom-match-pick+tomoDRGN/cryoDRGN-ET workflow.

Please let me know if you would still be interested considering these points (no worries if you aren't).

> I am not familiar with PyPI.

PyPI is where pip install pulls its packages from.

alncat commented 3 months ago

Sounds great~ I will write a detailed tutorial on my own wiki first~ The current code in my GitHub repo can serve as version 0.1.

alncat commented 3 months ago

I have completed an initial draft of the wiki page. You can check it here: https://github.com/alncat/opusTomo/wiki/Filtering-Template-Matching-Results.

sroet commented 3 months ago

Hey @alncat, thanks for the draft! I will test it out; however, I am quite busy over the next few weeks (and have holidays coming up).

Please ping me again if I haven't responded before the 13th of September!

alncat commented 3 months ago

Hi @sroet, just take your time and enjoy your holidays! The tutorial might contain some errors, since I wrote several scripts for it. You can open an issue on my repo if you run into any trouble. I will also double-check it during this time and deposit some more intermediate results. I was mainly motivated by the PyTOM wiki. It is really helpful for a beginner in tomography!

alncat commented 2 months ago

Hi @sroet, I also found out that the 3DCTF computed by OPUS-TOMO can be saved and used for subtomogram alignment in PyTOM. I tested my current implementation on the M. pneumoniae 70S ribosome, and incorporating the 3DCTF volume for each subtomogram improved the resolution of STA from 18.17 Å to 14.40 Å.