bytedance / cryostar

Leveraging Structural Prior and Constraints for Cryo-EM Heterogeneous Reconstruction
https://bytedance.github.io/cryostar/
Apache License 2.0
35 stars · 1 fork

Blurry pred_gmm_image and minimal structural variation in CryoStar reconstruction #2

Closed · kamzero closed this issue 6 months ago

kamzero commented 6 months ago

Dear CryoStar authors,

Thank you for your outstanding work and open-source spirit. I followed the documentation at https://byte-research.gitbook.io/cryostar/a-minimal-case to reconstruct atomic structures on both the minimal synthetic case and the real case EMPIAR-10180.

To adapt to my RTX 4090, I slightly modified the scripts, reducing the number of devices to 1 while increasing the batch size (I assumed this should not have a significant impact). However, after training, the pred_gmm_image in the work_dirs directory is always blurry, and visualizing pca-1.pdb in ChimeraX shows minimal structural variation among the conformations sampled along the first principal component. The input_image and pred_gmm_image are attached.

I am wondering if I have encountered any issues during my run. Could you please provide any suggestions?

Thank you for your time and assistance.

kamzero commented 6 months ago

The input_image and pred_gmm_image of the real case EMPIAR-10180 (screenshots attached): 10180_input_image, 10180_pred_gmm_image

dugu9sword commented 6 months ago

Hi kamzero, thanks for your attention! I am sorry that I cannot pinpoint what went wrong with your run from this alone.

Could you please provide a zipped output containing:

  • the log
  • the config
  • the pdb at the last epoch
  • anything else which you think can be helpful for our analysis

Thank you!

dugu9sword commented 6 months ago

BTW, the blurred GMM output is expected, since the GMM is a coarse-grained model.

yilaili commented 6 months ago

I'd also like to add that the blurry "rings" in the pred_gmm_image are caused by corruption from the CTF (contrast transfer function), which is normal in cryo-EM.
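For intuition, here is a minimal 1-D sketch of the standard weak-phase CTF model (with generic, assumed microscope parameters rather than values from this dataset), showing why the CTF's sign oscillations appear as rings:

import numpy as np

# Generic, assumed microscope parameters (not taken from this dataset).
wavelength = 0.0197    # electron wavelength in angstrom (~300 kV)
defocus = 10000.0      # defocus in angstrom (1 um)
cs = 2.7e7             # spherical aberration in angstrom (2.7 mm)
amp_contrast = 0.1     # amplitude contrast ratio

k = np.linspace(0.0, 0.35, 500)  # spatial frequency in 1/angstrom
chi = np.pi * wavelength * defocus * k**2 - 0.5 * np.pi * cs * wavelength**3 * k**4
ctf = -np.sqrt(1 - amp_contrast**2) * np.sin(chi) - amp_contrast * np.cos(chi)
# ctf oscillates between -1 and +1; each sign flip corresponds to one of the
# blurry "rings" visible in CTF-modulated projection images.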

yilaili commented 6 months ago

From the images you posted, it seems the input images and the GMM model are off-centered. It may be that the pdb model was not fitted to the consensus map. You can check this by following this guide: https://byte-research.gitbook.io/cryostar/a-real-case-empiar-10180#reference-structure-preparation, Option 2: Prepare from scratch.
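As a rough sanity check for the off-centering (a sketch only; the file paths are hypothetical and it assumes the mrcfile package, which is not part of the guide above), you can compare the model's center of mass with the center of the consensus map box:

import numpy as np
import mrcfile  # pip install mrcfile

pdb_path = "reference.pdb"       # hypothetical path to the reference model
map_path = "consensus_map.mrc"   # hypothetical path to the consensus map

# Unweighted center of mass of the atomic model, from ATOM/HETATM records.
coords = []
with open(pdb_path) as f:
    for line in f:
        if line.startswith(("ATOM", "HETATM")):
            coords.append([float(line[30:38]), float(line[38:46]), float(line[46:54])])
coords = np.array(coords)

# Geometric center of the map box in angstrom (assumes the header origin is in angstrom).
with mrcfile.open(map_path, permissive=True) as mrc:
    nxyz = np.array([mrc.header.nx, mrc.header.ny, mrc.header.nz], dtype=float)
    voxel = np.array([mrc.voxel_size.x, mrc.voxel_size.y, mrc.voxel_size.z], dtype=float)
    origin = np.array([mrc.header.origin.x, mrc.header.origin.y, mrc.header.origin.z], dtype=float)

print("model center of mass:", coords.mean(axis=0))
print("map box center:      ", origin + nxyz * voxel / 2.0)
# A large discrepancy suggests the pdb was not fitted/centered to the consensus map.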

kamzero commented 6 months ago

> Hi kamzero, thanks for your attention! I am sorry that I cannot pinpoint what went wrong with your run from this alone.
>
> Could you please provide a zipped output containing:
>
> • the log
> • the config
> • the pdb at the last epoch
> • anything else which you think can be helpful for our analysis
>
> Thank you!

Thank you for your kind response. I understand now that my pred_gmm_image is normal, and the pca-1.pdb visualization of the synthetic case 1ake looks fine. Here is my zip file: atom_1ake.zip

kamzero commented 6 months ago

> From the images you posted, it seems the input images and the GMM model are off-centered. It may be that the pdb model was not fitted to the consensus map. You can check this by following this guide: https://byte-research.gitbook.io/cryostar/a-real-case-empiar-10180#reference-structure-preparation, Option 2: Prepare from scratch.

Thank you for the reminder. The pred_gmm_image of the real case EMPIAR-10180 does indeed suggest that the GMM model is off-centered. I will try again :D

dugu9sword commented 6 months ago

> Thank you for your kind response. I understand now that my pred_gmm_image is normal, and the pca-1.pdb visualization of the synthetic case 1ake looks fine. Here is my zip file: atom_1ake.zip

I have checked your config and found that the training batch size is too large: 2048. Since the data only contains 50000 particles, a training epoch contains only about 25 update steps. A large batch size is not recommended for small datasets in deep learning. Maybe you can change it to 256 for better results.

data_loader = dict(
    train_batch_per_gpu=2048,
    train_ratio=0.975,
    val_batch_per_gpu=2048,
    workers_per_gpu=256)
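For reference, a sketch of the suggested change; lowering val_batch_per_gpu and workers_per_gpu as well is an extra assumption, not part of the suggestion above:

data_loader = dict(
    train_batch_per_gpu=256,   # reduced from 2048 as suggested above
    train_ratio=0.975,
    val_batch_per_gpu=256,     # assumption: keep the validation batch in line with training
    workers_per_gpu=4)         # assumption: 256 dataloader workers per GPU is likely excessive
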
kamzero commented 6 months ago

> I have checked your config and found that the training batch size is too large: 2048. Since the data only contains 50000 particles, a training epoch contains only about 25 update steps. A large batch size is not recommended for small datasets in deep learning. Maybe you can change it to 256 for better results.

Thanks for your kind response, I will give it a try.

kamzero commented 6 months ago

Dear authors,

I followed your suggestion and successfully ran the train_atom.py script, which gave me a nice set of conformer models. However, I am having issues with the train_density.py script. When I set extra_input_data_attr.given_z to None, it runs normally; but when I set extra_input_data_attr.given_z to the z.npy obtained from the train_atom stage, I get an out-of-bounds error for batch["idx"] at https://github.com/bytedance/cryostar/blob/d5c4d798858b0f136f1fd6f5417ce8e2c1c450a8/projects/star/train_density.py#L123. The shape of self.given_z for 1ake is (1250, 8), while batch["idx"] can be several tens of thousands. May I ask for your suggestion on this?

Part of the terminal output is as follows:

../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [31,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.

Any help would be greatly appreciated. Thank you!
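To make the failure mode concrete, here is a small illustrative sketch (not the actual train_density.py code): given_z is indexed by the per-particle dataset index, so its first dimension has to cover the whole dataset rather than a subset:

import numpy as np
import torch

given_z = torch.from_numpy(np.load("z.npy"))   # shape (1250, 8) in this report
idx = torch.tensor([40000, 40001])             # dataset indices can go up to 49999
# given_z[idx] fails here, since 40000 >= given_z.shape[0]; on GPU this surfaces
# as the CUDA "index out of bounds" assertion shown in the terminal output above.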

dugu9sword commented 6 months ago

Looks strange. Could you please provide the config files for both stages and the z.npy file?

kamzero commented 6 months ago

> Looks strange. Could you please provide the config files for both stages and the z.npy file?

Here are my config files and z.npy file: debug_train_density.zip

eugenejyuan commented 6 months ago

The z.npy file contains the latent codes for each image in the dataset, with the dimensionality of the latent code being 8. Given that the size of the dataset is 50,000, the correct shape for z.npy should be (50000,8). The shape (1250,8) is quite weird; I'm not sure where the problem lies.
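A quick sanity check along these lines (a sketch with a hypothetical path, assuming the 1ake case with 50000 particles) is to compare the first dimension of z.npy with the particle count before launching train_density.py:

import numpy as np

z = np.load("z.npy")    # latent codes saved by the train_atom stage
n_particles = 50000     # total particle count of the dataset (1ake synthetic case)
print(z.shape)          # expected: (50000, 8)
assert z.shape[0] == n_particles, "given_z must contain one latent code per particle"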

kamzero commented 6 months ago

> The z.npy file contains the latent codes for each image in the dataset, with the dimensionality of the latent code being 8. Given that the size of the dataset is 50,000, the correct shape for z.npy should be (50000,8). The shape (1250,8) is quite weird; I'm not sure where the problem lies.

Thank you for your response. I realized that I might have mixed in an older version of the code, which manually set train_ratio to 0.975 and split the train/validation set based on this ratio; that would explain the (1250, 8) shape, since 50000 * (1 - 0.975) = 1250.