Closed kamzero closed 6 months ago
input_image and pred_gmm_image of the real case EMPIAR-10180:
Hi kamzero, thanks for your attention! I am sorry that I cannot point out what is wrong with your trial.
Could you please provide a zipped output containing:
- the log
- the config
- the pdb at the last epoch
- anything else which you think can be helpful for our analysis
Thank you!
BTW, the blurred GMM output is expected, since it is a coarse-grained model.
I'd also like to add that the blurry "rings" in pred_gmm_image are caused by corruption from the CTF, which is normal in cryo-EM.
From the images you posted, it seems the input images and the GMM models were off-centered. This may be because the pdb model was not fitted to the consensus map. You can check this by following this guide: https://byte-research.gitbook.io/cryostar/a-real-case-empiar-10180#reference-structure-preparation, Option 2: Prepare from scratch.
Thank you for your kind response. I understand that my pred_gmm_image is normal, and the pca-1.pdb visualization of the synthetic case 1ake looks fine. Here is my zip file: atom_1ake.zip
Thank you for the reminder. The pred_gmm_image of the real case EMPIAR-10180 does indeed suggest that the GMM models were off-centered. I will try again :D
I have checked your config and found that the training batch size is too large: 2048. Since the dataset contains only 50,000 particles, a training epoch consists of only about 25 update steps. Such a large batch size is not recommended for small datasets in deep learning. You could change it to 256 for better results.
data_loader = dict(
    train_batch_per_gpu=2048,
    train_ratio=0.975,
    val_batch_per_gpu=2048,
    workers_per_gpu=256)
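As a quick sanity check of that reasoning, the number of update steps per epoch can be estimated directly from the numbers in this thread (the dataset size and batch sizes come from the messages above; nothing here is cryoSTAR-specific):

```python
import math

n_particles = 50_000  # dataset size mentioned in the thread
large_batch = 2048    # train_batch_per_gpu from the config above
small_batch = 256     # the suggested replacement

# Update steps per epoch = number of batches needed to cover the dataset once.
print(math.ceil(n_particles / large_batch))  # 25 steps per epoch
print(math.ceil(n_particles / small_batch))  # 196 steps per epoch
```

With only ~25 gradient updates per epoch, the model sees far fewer parameter updates per pass over the data, which is the concern raised above.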
Thanks for your kind response, I will give it a try.
Dear authors,
I followed your suggestion and successfully ran the train_atom.py script, which gave me a nice set of conformer models. However, I am having issues running the train_density.py script. When I set extra_input_data_attr.given_z to None, it runs normally; however, when I set extra_input_data_attr.given_z to the z.npy obtained from the train_atom stage, I get batch["idx"] out of bounds at https://github.com/bytedance/cryostar/blob/d5c4d798858b0f136f1fd6f5417ce8e2c1c450a8/projects/star/train_density.py#L123
My self.given_z's shape here for 1ake is (1250, 8), while batch["idx"] can be several tens of thousands. May I ask for your suggestion on this?
Part of the terminal output is as follows:
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [31,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
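That CUDA assertion is the device-side equivalent of a plain out-of-bounds index. A minimal sketch (illustrative only, not cryoSTAR code) of what presumably happens when given_z has fewer rows than the particle indices it is indexed with:

```python
import numpy as np

# given_z is assumed to hold one latent code per particle; the shape below
# is the one reported in this issue.
given_z = np.zeros((1250, 8))

# batch["idx"] draws particle indices from the full 50,000-image dataset,
# so it can easily exceed the 1250 rows available.
batch_idx = np.array([30_000, 45_000])

try:
    z = given_z[batch_idx]  # fails: indices exceed the 1250 rows
except IndexError as err:
    print("IndexError:", err)
```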
Any help would be greatly appreciated. Thank you!
Looks strange, could you please provide the config files for both stages and the z.npy file?
Here are my config files and z.npy file: debug_train_density.zip
The z.npy file contains the latent codes for each image in the dataset, with the dimensionality of the latent code being 8. Given that the dataset has 50,000 images, the correct shape for z.npy should be (50000, 8). The shape (1250, 8) is quite weird; I'm not sure where the problem lies.
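A small pre-flight check along these lines could catch the mismatch before launching train_density.py (the function name and the check itself are illustrative, not part of the cryoSTAR API):

```python
import numpy as np

def check_given_z(z: np.ndarray, n_particles: int, z_dim: int = 8) -> None:
    """Raise early if the latent array cannot index every particle."""
    if z.shape != (n_particles, z_dim):
        raise ValueError(
            f"given_z has shape {z.shape}, expected ({n_particles}, {z_dim}); "
            "a mismatch like (1250, 8) vs 50,000 particles leads to "
            "out-of-bounds indexing during training")

check_given_z(np.zeros((50_000, 8)), 50_000)  # OK: one latent code per image
# Typical usage: check_given_z(np.load("z.npy"), n_particles)
```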
Thank you for your response. I realized that I might have mixed in an older version of the code, which manually set train_ratio to 0.975 and split the train/validation set based on that ratio (50,000 × (1 − 0.975) = 1,250, which matches the shape I observed).
Dear CryoStar authors,
Thank you for your outstanding work and open-source spirit. I followed the documentation at https://byte-research.gitbook.io/cryostar/a-minimal-case to reconstruct atomic structures on both the minimal synthetic case and the real case EMPIAR-10180.
To adapt to my RTX 4090, I slightly modified the scripts by reducing the number of devices to 1 while increasing the batch size (I speculated that this should not have a significant impact). However, after training, the pred_gmm_image in the work_dirs directory is always blurry, and visualization of pca-1.pdb in ChimeraX reveals minimal structural variation among the conformations sampled along the first principal component. Here are the input_image and pred_gmm_image:
I am wondering if I have encountered any issues during my run. Could you please provide any suggestions?
Thank you for your time and assistance.