juglab / cryoCARE_pip

PIP package of cryoCARE
BSD 3-Clause "New" or "Revised" License

Latest version leads to crash during random coordinate sampling #38

Closed: thorstenwagner closed this issue 1 year ago

thorstenwagner commented 1 year ago

This is the header of my tomogram:

 Number of columns, rows, sections .....     512     720     250
 Map mode ..............................    2   (32-bit real)              
 Start cols, rows, sects, grid x,y,z ...    0     0     0     512    720    250
 Pixel spacing (Angstroms)..............   9.432      9.432      9.432    
 Cell angles ...........................   90.000   90.000   90.000
 Fast, medium, slow axes ...............    X    Y    Z
 Origin on x,y,z ..(inverted_in_file_)..    2.358       0.000       1179.    
 Minimum density ....................... -0.53415    
 Maximum density .......................  0.51181    
 Mean density .......................... -0.23894E-04
 tilt angles (original,current) ........  90.0   0.0   0.0   0.0   0.0   0.0
 Space group,# extra bytes,idtype,lens .        1        0        0        0

I run into the following exception during data extraction:

Traceback (most recent call last):
  File "/opt/user_software/miniconda3/envs/cryocare/bin/cryoCARE_extract_train_data.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/mnt/data/twagner/Projects/cryocare/src/cryoCARE_pip/cryocare/scripts/cryoCARE_extract_train_data.py", line 45, in <module>
    main()
  File "/mnt/data/twagner/Projects/cryocare/src/cryoCARE_pip/cryocare/scripts/cryoCARE_extract_train_data.py", line 27, in main
    dm.setup(config['odd'], config['even'], n_samples_per_tomo=config['num_slices'],
  File "/mnt/data/twagner/Projects/cryocare/src/cryoCARE_pip/cryocare/internals/CryoCAREDataModule.py", line 215, in setup
    self.val_dataset = CryoCARE_Dataset(tomo_paths_odd=tomo_paths_odd,
  File "/mnt/data/twagner/Projects/cryocare/src/cryoCARE_pip/cryocare/internals/CryoCAREDataModule.py", line 40, in __init__
    self.create_coordinate_lists()
  File "/mnt/data/twagner/Projects/cryocare/src/cryoCARE_pip/cryocare/internals/CryoCAREDataModule.py", line 107, in create_coordinate_lists
    self.coords.append(self.__create_coords_for_tomo__(even, odd, es))
  File "/mnt/data/twagner/Projects/cryocare/src/cryoCARE_pip/cryocare/internals/CryoCAREDataModule.py", line 121, in __create_coords_for_tomo__
    coords = self.create_random_coords(extraction_shape[0],
  File "/mnt/data/twagner/Projects/cryocare/src/cryoCARE_pip/cryocare/internals/CryoCAREDataModule.py", line 137, in create_random_coords
    y_coords = np.random.randint(y[0], y[1] - self.sample_shape[0], size=n_samples)
  File "mtrand.pyx", line 746, in numpy.random.mtrand.RandomState.randint
  File "_bounded_integers.pyx", line 1254, in numpy.random._bounded_integers._rand_int64
ValueError: low >= high

I've added a couple of prints in the create_random_coords method:

    def create_random_coords(self, z, y, x, n_samples):
        print(z)
        print(y)
        print(x)
        print(self.sample_shape)
        z_coords = np.random.randint(z[0], z[1] - self.sample_shape[0], size=n_samples)
        y_coords = np.random.randint(y[0], y[1] - self.sample_shape[0], size=n_samples)
        x_coords = np.random.randint(x[0], x[1] - self.sample_shape[0], size=n_samples)

        return np.stack([z_coords, y_coords, x_coords], -1)

This is the output:

[0, 250]
[0, 648]
[0, 512]
[72 72 72]
Computing normalization parameters:
100%|██████████| 500/500 [00:01<00:00, 452.69it/s]
[0, 250]
[648, 720]
[0, 512]
[72 72 72]

exception... see above
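
The failing call can be reproduced in isolation. This is a minimal sketch that assumes only the values printed above; the variable names are illustrative and not taken from the cryoCARE code:

```python
import numpy as np

# Values from the second block of printed output (the validation volume).
sample_shape = (72, 72, 72)
y = [648, 720]

# Same bounds as in create_random_coords: the upper bound is exclusive.
low = y[0]
high = y[1] - sample_shape[0]  # 720 - 72 = 648, equal to low

try:
    np.random.randint(low, high, size=10)
    error_message = None
except ValueError as err:
    # randint needs low < high; an empty range raises ValueError.
    error_message = str(err)

print(low, high, error_message)
```

The y range is exactly as wide as one sample, so the range of valid start coordinates passed to randint is empty.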

I will look into it. This did not happen with 0.2.0. Maybe @tibuch knows what the issue might be, since he made most of the changes for 0.2.1.

Best, Thorsten

thorstenwagner commented 1 year ago

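I think I discovered an edge case in which the validation shape is calculated wrongly. With a split of 0.9, the validation range along y is [648, 720], which is exactly as wide as the 72-voxel sample shape, so np.random.randint is called with low equal to high (648). Changing the split from 0.9 to 0.8 fixed the problem. We should make sure this does not lead to a crash; instead, we should change the validation split to the closest possible value to the one the user selected.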
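
One possible way to do this is sketched below. It is not the actual cryoCARE code: `closest_valid_split` is a hypothetical helper, and it assumes the split index along the axis is computed as int(axis_length * split), which matches the [0, 648] / [648, 720] ranges printed above.

```python
def closest_valid_split(axis_length, split, sample_size):
    """Clamp the split index so both parts can hold a random sample.

    Hypothetical helper. randint(start, end - sample_size) needs a
    non-empty range, so each part must be wider than sample_size.
    """
    min_extent = sample_size + 1
    if axis_length < 2 * min_extent:
        raise ValueError(
            f"Axis of length {axis_length} is too short to split for "
            f"samples of size {sample_size}."
        )
    requested = int(axis_length * split)
    # Move the requested index into the valid interval, as close as possible.
    split_index = min(max(requested, min_extent), axis_length - min_extent)
    return split_index, split_index / axis_length


# Tomogram from this issue: 720 voxels along y, 72-voxel samples.
index_09, effective_09 = closest_valid_split(720, 0.9, 72)
index_08, effective_08 = closest_valid_split(720, 0.8, 72)
print(index_09, effective_09)
print(index_08, effective_08)
```

With a split of 0.9 the index moves from 648 to 647, so the validation part becomes 73 voxels wide; with 0.8 the requested index of 576 is already valid and is left unchanged. A warning that reports the effective split would tell the user that the value was adjusted.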

thorstenwagner commented 1 year ago

I will close this unless it happens again.