tbepler / topaz

Pipeline for particle picking in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples. Also featuring micrograph and tomogram denoising with DNNs.
GNU General Public License v3.0
172 stars 64 forks

Quality of denoising #66

Closed dgvjay closed 4 years ago

dgvjay commented 4 years ago

Hello authors,

Many congratulations on producing this impressive denoising pipeline.

I am running into some issues with the quality of my denoising, which has not been nearly as good as I expected. I am attaching screenshots of a tomographic slice before and after denoising. There may be something I am doing wrong.

I am using the Topaz command suggested on this wonderfully written website (https://emgweb.nysbc.org/topaz.html). My command for denoising is as follows:

topaz denoise3d Tomogram_full.rec --model unet-3d-20a --device -2 --patch-size 96 --patch-padding 48 --output ./

where Tomogram_full.rec is the 4X binned tomogram in 16-bit. I am using unet-3d-20a because the tomogram is 4X binned. All other parameters are as suggested.

Please note that I am not applying a Gaussian filter after denoising. This filter, I noticed, is a new feature of Topaz; I still have an older version that does not seem to support the --gaussian flag. Initially, I thought that maybe your pre-trained model is not specific enough for our tomograms, so I trained my own model. The results with my own model are also not as good.

What do you think about my results? Is there something wrong with the way I am doing the denoising?

Thanks and cheers, Digvijay

[Screenshots: 4X binned tomogram after denoising with unet-3d-20a; 4X binned tomogram before denoising]

alexjnoble commented 4 years ago

Hi Digvijay,

Which screenshot is before denoising and which is after?

How did you train your own model? By splitting the tilt-series into even/odd frames, reconstructing, then using those halves as denoising training input?

What are you using to reconstruct? Have you applied dose weighting?

Best, -Alex

dgvjay commented 4 years ago

Hi Alex

Thanks for the quick response. The answers to your questions:

[1] The top screenshot is after denoising and the bottom one is before denoising. This denoising was done using the pre-trained unet-3d-20a, not my own model.

[2] Yes, to train my own model I motion-corrected the odd and even frames completely independently. This produced two motion-corrected stacks, _even.st and _odd.st. These stacks are not dose-weighted. (See the sketch after this list for the frame split itself.)

[3] The two half tomograms were then reconstructed independently from the _even.st and _odd.st stacks in IMOD using patch tracking, producing _even_full.rec and _odd_full.rec. These were placed in two different directories and given the same name, since Topaz expects the two half tomograms to share a file name.

[4] Self-training on the single tomogram, with 8 GPUs, was quite fast: it took only 1-2 minutes, and the training error was <1-2.

[5] The example I showed above is a tomogram of a bacterial lamella containing many ribosomes. Yes, the tomogram to which unet-3d-20a was applied had been generated from dose-weighted stacks. Do you think the dose weighting prevented good denoising?
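For reference, a minimal sketch of the even/odd frame split in step [2]. This is not part of Topaz: it assumes the mrcfile package and a hypothetical single movie file movie.mrc, and shows naive frame summation only for illustration (in practice the split is usually done inside the motion-correction program, so that both halves share the full-frame shifts):

import mrcfile
import numpy as np

# hypothetical input: one raw movie with frames stacked along the first axis
with mrcfile.open('movie.mrc') as mrc:
    frames = np.asarray(mrc.data)  # shape: (n_frames, height, width)

# sum alternating frames into two half-dose images
even = frames[0::2].sum(axis=0).astype(np.float32)
odd = frames[1::2].sum(axis=0).astype(np.float32)

with mrcfile.new('movie_even.mrc', overwrite=True) as out:
    out.set_data(even)
with mrcfile.new('movie_odd.mrc', overwrite=True) as out:
    out.set_data(odd)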

Thanks and cheers, Digvijay

alexjnoble commented 4 years ago

Hi Digvijay,

Looking at the two tomogram slice images you provided, that is about the expected improvement. If you zoom in, you can see that the speckles characteristic of noise are largely removed and the contrast is improved. Cells are very crowded environments, which means that most of what remains in the tomogram is real signal. For example, the features inside the vertical bilayer on the left become unambiguous after noise removal and contrast enhancement. Areas outside of the sample still show features because those features are present in the weighted back-projection, due to imperfect imaging conditions and imperfect alignment. So from what I can tell, the pre-trained 20a model works as expected here.

With regards to making data for a self-trained model: the only issue I see in your workflow is that you aligned each half-frame tilt-series independently. You should align the tilt-series built from all the frames, then independently make two tilt-series stacks from the even and odd frames, and then use the full-frame alignment to reconstruct the half-frame tilt-series used for training a model (a sketch follows below). You need to do it this way because Noise2Noise expects two independently collected tomograms of the same field of view; if you align each half-frame tilt-series independently, then the field of view is nominally different.
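To make the shared-alignment step concrete, here is a minimal sketch of applying one set of full-frame transforms to both half stacks and reconstructing them with identical parameters. This is not part of Topaz: it assumes IMOD's newstack and tilt programs are on the PATH, and the file names (even.st, odd.st, full.xf, full.tlt) and the thickness value are hypothetical placeholders:

import subprocess

XF = 'full.xf'      # transforms from aligning the all-frame tilt-series
TLT = 'full.tlt'    # tilt angles from that same alignment
THICKNESS = '1200'  # reconstruction thickness in pixels (example value)

for half in ('even', 'odd'):
    # apply the shared full-frame alignment to this half-frame stack
    subprocess.run(['newstack', '-input', half + '.st',
                    '-output', half + '.ali', '-xform', XF], check=True)
    # reconstruct both halves with identical parameters
    subprocess.run(['tilt', '-InputProjections', half + '.ali',
                    '-OutputFile', half + '_full.rec',
                    '-TILTFILE', TLT, '-THICKNESS', THICKNESS], check=True)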

To answer your last question: in my experience, dose weighting does not prevent good denoising. It does, however, increase the contrast of the tomogram before denoising, which limits the amount of additional contrast enhancement that subsequent denoising can provide.

Best, -Alex

dgvjay commented 4 years ago

Thanks for the feedback, Alex.

[1] So I will now create a transform file (e.g., the .xf file in IMOD) by aligning the stack created from all the frames. [2] This transform file (.xf) will then be used to reconstruct the two half tomograms from _even.st and _odd.st. These two tomograms will be used to generate a self-trained model for denoising the same tomogram.

I have some related questions.

[a] I sometimes generate 3D-CTF-corrected tomograms (using NovaCTF). Do you think it is good enough to use the self-trained model generated on non-CTF-corrected tomograms (from [2]) to denoise the 3D-CTF-corrected tomogram? Or would you recommend generating two halves of the 3D-CTF-corrected tomogram and training a new 3D-CTF-corrected model specifically for denoising it?

[b] Have you tried running the denoising on tomograms generated by Fourier inversion (e.g., EMAN2's) instead of weighted back-projection? I will train models on half tomograms generated with EMAN2's Fourier-inversion method, apply the model to the EMAN2 tomogram, see how the results look, and maybe post them here as well.

Thanks again and cheers, Digvijay

alexjnoble commented 4 years ago

Hi Digvijay,

[1] and [2] look right to me!

[a] We haven't tried training on CTF-corrected tomograms. I'm not sure whether it would behave differently; noise is noise, so as long as all frequencies are represented in the training set, I think the model will work the same. Try it and report back if you wish.

[b] No, we have only used Tomo3D WBP and SIRT for training and denoising. Try this and report back too if you wish.

Best, -Alex

dgvjay commented 4 years ago

Thanks, Alex for such quick replies.

I will try to post results of denoising the 3D-CTF corrected and Fourier-inversion tomograms.

One last question: I am hoping to do the SNR calculation (using the second independent method from your manuscript) for each tilt-series/tomogram and get statistics on how much the self-trained models improve the SNR. Did you use some IMOD/Appion-Protomo or EMAN2 family of functions to do the SNR calculations, or did you write the function from scratch? Do you think it is worth computing the SNR for each tilt-series/tomogram?

alexjnoble commented 4 years ago

Hi Digvijay,

@tbepler did the SNR calculations. I think he wrote his own scripts to do them.

I don't see any reason why you would need to do them, though. Denoising should be used for visualization and object identification; apply plenty of caution when using denoising in sub-tomogram processing. So your eyes should be the judge of which denoising model and reconstruction method works best for you.

Best, -Alex

dgvjay commented 4 years ago

Makes sense. Thanks, Alex

tbepler commented 4 years ago

@dgvjay Actually, the SNR calculations are very simple. Denoise each half-frame tomogram and calculate the correlation between the denoised half and the raw other half (call this CCC). The SNR is then CCC/(1-CCC). I don't know of any software that can do this calculation automatically; I wrote my own script, but this feature is not (yet) included in Topaz. A short Python snippet using topaz and numpy to calculate this:

import topaz.mrc as mrc
import numpy as np

path_to_denoised = ...
with open(path_to_denoised, 'rb') as f:
    content = f.read()
denoised, _, _ = mrc.parse(content)

path_to_raw = ...
with open(path_to_raw, 'rb') as f:
    content = f.read()
raw, _, _ = mrc.parse(content)

# calculate correlation between denoised half and raw other half (e.g. denoise evens and compare with raw odds) 
ccc = np.corrcoef(denoised.ravel(), raw.ravel()) # correlation between half tomogram pixel values
ccc = ccc[0,1] # corrcoef returns a 2x2 matrix, can also use scipy.stats.pearsonr

snr = ccc/(1-ccc)
snr_db = 10*np.log10(snr) # the SNR in dB is 10*log_10(SNR)

print(ccc, snr, snr_db)
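A possible follow-up, and an assumption on my part rather than part of the script above: since either pairing gives a valid estimate, one could compute the SNR in both directions (denoised evens vs. raw odds, and denoised odds vs. raw evens) and average the two. A hypothetical wrapper:

def snr_from_pair(denoised, raw):
    # correlation between a denoised half and the raw other half
    ccc = np.corrcoef(denoised.ravel(), raw.ravel())[0, 1]
    return ccc / (1 - ccc)

# hypothetical usage, assuming both halves have been denoised and loaded:
# snr = 0.5 * (snr_from_pair(denoised_even, raw_odd)
#              + snr_from_pair(denoised_odd, raw_even))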

dgvjay commented 4 years ago

Thanks, Tristan.

Indeed, the calculations and your implementation are pretty neat.