IsoNet-cryoET / IsoNet

Self-supervised learning for isotropic cryoET reconstruction
https://www.nature.com/articles/s41467-022-33957-8
MIT License

ValueError during extraction #13

Open kmradford opened 3 years ago

kmradford commented 3 years ago

Hi there! Thank you for developing IsoNet - it's really exciting! I've been reprocessing some data with it, but keep getting the following ValueError with each of my data sets when I reach the first extraction step (using the GUI):

08-30 14:08:44, INFO [prepare.py:63] Extract from deconvolved tomogram ./deconv/4-PB-NLGP-S_ts10.mrc
Traceback (most recent call last):
  File "/data/kate/IsoNetFolder/IsoNet/bin/isonet.py", line 287, in extract
    extract_subtomos(d_args)
  File "/data/kate/IsoNetFolder/IsoNet/preprocessing/prepare.py", line 79, in extract_subtomos
    seeds=create_cube_seeds(orig_data, it.rlnNumberSubtomo, settings.crop_size,mask=mask_data)
  File "/data/kate/IsoNetFolder/IsoNet/preprocessing/cubes.py", line 31, in create_cube_seeds
    sample_inds = np.random.choice(len(valid_inds[0]), nCubesPerImg, replace=len(valid_inds[0]) < nCubesPerImg)
  File "mtrand.pyx", line 902, in numpy.random.mtrand.RandomState.choice
ValueError: a must be greater than 0 unless no samples are taken

I've had a look through the scripts cited in the error message, but none of them reveals the origin of the ValueError, and I'm not sure how to open a .mrc file to see what in my data is set to 0.

Do you have any advice about what the issue could be, and how I can resolve it?

Thank you!

procyontao commented 3 years ago

Hi,

I guess the region of the map that is valid for extracting subtomograms is empty (zero voxels). Does this happen with the demo dataset? And does it happen when extracting without a mask?

cheers
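
To illustrate the guess above: the traceback ends in np.random.choice, which raises this ValueError when asked to sample from zero candidates, as happens when the mask has no non-zero voxels. Below is a minimal sketch of that failure and of a guard that reports it more clearly. The helper name pick_seed_indices is hypothetical and is not part of IsoNet; it only mirrors the call shown in the traceback.

```python
import numpy as np

def pick_seed_indices(mask, n_cubes, rng=None):
    """Hypothetical helper (not IsoNet code): sample n_cubes voxel
    positions from the non-zero region of mask."""
    rng = np.random.default_rng() if rng is None else rng
    valid = np.nonzero(mask)
    n_valid = len(valid[0])
    if n_valid == 0:
        # Fail early with a message that names the actual cause.
        raise ValueError("mask has no non-zero voxels; nothing to sample from")
    # Sample with replacement only if there are fewer voxels than cubes.
    chosen = rng.choice(n_valid, n_cubes, replace=n_valid < n_cubes)
    return tuple(axis[chosen] for axis in valid)

# Sampling from zero candidates is what triggers the error in the traceback.
try:
    np.random.choice(0, 5, replace=True)
except ValueError as err:
    print("numpy raised:", err)
```

If a check like this fails for a tomogram, the mask for that tomogram is the place to look rather than the tomogram itself.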

procyontao commented 2 years ago

This error could also occur because the tomogram is less than 96 pixels thick. An IMOD command such as "clip resize -oz 100 input.mrc output.mrc" is a temporary workaround. This needs to be handled in IsoNet.
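
For anyone who prefers to do the padding in Python, here is a rough sketch of the same idea on a NumPy array. It assumes the volume is already loaded as a (z, y, x) array. The function name pad_thickness and the choice to fill with the volume mean are assumptions made for illustration, not IsoNet or IMOD behaviour.

```python
import numpy as np

def pad_thickness(volume, min_z=100):
    """Hypothetical helper: pad a (z, y, x) volume along z so that it
    has at least min_z slices. Returns the input unchanged if it is
    already thick enough."""
    missing = min_z - volume.shape[0]
    if missing <= 0:
        return volume
    before = missing // 2
    after = missing - before
    # Assumption: fill the new slices with the mean density of the volume.
    fill = float(volume.mean())
    return np.pad(volume, ((before, after), (0, 0), (0, 0)),
                  mode="constant", constant_values=fill)

thin = np.ones((80, 16, 16), dtype=np.float32)
print(pad_thickness(thin).shape)
```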

jhennies commented 2 years ago

Hi!

I am having the same issue and indeed with the demo dataset. As far as I understand this happens due to the mask being all-zero for all micrographs except the first one in the list.

If I switch the order of the micrographs, still the first one in the list has a valid mask and the other masks are all-zero (so it's not data-dependent).

When I skip the mask generation step, the extraction works.

cheers
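
In case it helps others check for the same symptom, this is roughly how the all-zero masks can be spotted once they are loaded as NumPy arrays (reading the .mrc files is left out here). The function names mask_coverage and empty_masks are made up for this sketch.

```python
import numpy as np

def mask_coverage(mask):
    """Fraction of non-zero voxels in a mask array."""
    mask = np.asarray(mask)
    return np.count_nonzero(mask) / mask.size

def empty_masks(masks):
    """Given a dict of {name: mask array}, return the names whose mask
    is entirely zero, in insertion order."""
    return [name for name, mask in masks.items() if mask_coverage(mask) == 0.0]

# Toy stand-ins for two masks: one usable, one empty.
masks = {
    "tomo1_mask": np.ones((4, 4, 4), dtype=np.uint8),
    "tomo2_mask": np.zeros((4, 4, 4), dtype=np.uint8),
}
for name, mask in masks.items():
    print(name, mask_coverage(mask))
print("empty:", empty_masks(masks))
```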

procyontao commented 2 years ago

Hi Jhennies,

This is strange. I will look into it. Are you using GUI or command line?

jhennies commented 2 years ago

Hi @procyontao,

thanks for looking into it! Initially, I used the GUI.

I now also tested using the commands generated by "only print command" but the result stays the same.

The mask is only properly computed for the first entry of the tomograms.star file. I tested without the gui by changing the order in the star file by text editor.

I hope these observations help to narrow down the issue, if you need any further details let me know!

cheers, Julian

procyontao commented 2 years ago

Hi,

I could not find the problem.

I ran it again with the most recent IsoNet version, everything seems to be OK.

I used the following commands:
isonet.py prepare_star tomograms
isonet.py deconv tomograms.star
isonet.py make_mask tomograms.star

My tomograms.star looks like this: [image]

My deconvolved tomograms are here: [image]

My masks look like this: [image]

jhennies commented 2 years ago

Thanks for checking! I tried exactly what you posted and I still get the empty masks. I tried both the code from the master branch (pull from today) as well as your last release (v0.1).

I'll dive a bit into the code, maybe I can figure out where the empty masks come from. I'll let you know if I find anything.

jhennies commented 2 years ago

I located the issue and it is probably the weirdest behavior I've seen in a while:

The convolution in this line: https://github.com/Heng-Z/IsoNet/blob/7b8871f5e27744f4a2f24341beb088575845c388/util/filter.py#L21 returns an all-NaN array when the stdmask function is called the second, third, ... time. Even if supplied with EXACTLY the same input data as in the first call. The other convolve calls in the stdmask function below work properly.

When the method keyword argument is not set, the convolve calls use method='fft' on my system. If I set method='direct' for the first convolve call in stdmask, the problem simply shifts to the next convolve call that uses method='fft'. In other words, on my system I can solve the issue either by adding method='direct' to all convolve calls (which is obviously terribly slow) or by adding a dummy convolve call above the first convolve in the stdmask function:

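import numpy as np

def stdmask(tomo,side=10,threshold=60):
    from scipy.signal import convolve
    # Mask from the local standard deviation in a (2*side+1)^3 box.
    tomosq = tomo**2
    ones = np.ones(tomo.shape)
    eps = 0.01
    kernel = np.ones((2*side+1, 2*side+1, 2*side+1))

    # Workaround: throwaway FFT convolution before the real calls.
    dummy = convolve(np.zeros((10, 10)), [[0, 0, 0]], mode="same", method='fft')
    s = convolve(tomo, kernel, mode="same")      # box sum of values
    s2 = convolve(tomosq, kernel, mode="same")   # box sum of squared values
    ns = convolve(ones, kernel, mode="same")     # voxels per box (fewer at the edges)

    # Local standard deviation; eps guards against slightly negative values from rounding.
    out = np.sqrt((s2 - s**2 / ns) / ns + eps)
    # out = out>np.std(tomo)*threshold
    # Keep the top `threshold` percent of voxels by local standard deviation.
    out  = out>np.percentile(out, 100-threshold)

    return out.astype(np.uint8)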

I don't think this is a particularly good solution, but it does the trick for me. Also, this doesn't seem to be an IsoNet issue; it is more likely due to my system and/or the scipy version. Out of curiosity, @procyontao, what scipy version do you have on your system?
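
If anyone wants to check whether their setup shows the same behaviour, here is a small diagnostic sketch. It repeats an FFT convolution several times and compares each result against method='direct' on a small random volume. The function name fft_convolve_is_sane and its parameters are made up for this test, and I can't promise it triggers the problem on every affected system.

```python
import numpy as np
from scipy.signal import convolve

def fft_convolve_is_sane(n_calls=3, size=12, side=2, seed=0):
    """Hypothetical check: return True if repeated FFT convolutions
    stay finite and agree with the direct method."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((size, size, size))
    kernel = np.ones((2 * side + 1,) * 3)
    # The direct method is slow on big volumes but fine on this tiny one.
    reference = convolve(data, kernel, mode="same", method="direct")
    for _ in range(n_calls):
        result = convolve(data, kernel, mode="same", method="fft")
        if not np.all(np.isfinite(result)):
            return False  # NaN or inf, as seen from the second call onwards
        if not np.allclose(result, reference, atol=1e-8):
            return False
    return True

print("FFT convolve OK:", fft_convolve_is_sane())
```

A False result here would point at the scipy/system combination rather than at IsoNet.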

procyontao commented 2 years ago

Glad to hear you worked out the problem. We will test it and may consider incorporating your dummy convolution in IsoNet.

My scipy version is 1.7.1