ml-struct-bio / cryodrgn

Neural networks for cryo-EM reconstruction
http://cryodrgn.cs.princeton.edu
GNU General Public License v3.0

AssertionError #69

Closed Vijaya-cryoem closed 3 months ago

Vijaya-cryoem commented 3 years ago

Hi, I am running cryoDRGN on my data, which are the output files from a cryoSPARC homogeneous refinement job.

When I run the command below, I end up with "AssertionError: Input rotations have shape (276737, 3, 3) but expected (480,3,3)".

Can you give some suggestions?

With regards Vijaya

cryodrgn train_vae particles.128.mrcs --ctf ctf.pkl --poses poses.pkl --zdim 8 -n 50 --uninvert-data -o tutorial/00_vae128 --multigpu > tutorial_00.log

Traceback (most recent call last):
  File "/programs/x86_64-linux//cryodrgn/0.3.2/cryodrgn/bin/cryodrgn", line 33, in <module>
    sys.exit(load_entry_point('cryodrgn==0.3.2', 'console_scripts', 'cryodrgn')())
  File "/programs/x86_64-linux/cryodrgn/0.3.2/cryodrgn_extlib/miniconda3-4.9.2_py37-rpp3/lib/python3.7/site-packages/cryodrgn-0.3.2-py3.7.egg/cryodrgn/main.py", line 52, in main
    args.func(args)
  File "/programs/x86_64-linux/cryodrgn/0.3.2/cryodrgn_extlib/miniconda3-4.9.2_py37-rpp3/lib/python3.7/site-packages/cryodrgn-0.3.2-py3.7.egg/cryodrgn/commands/train_vae.py", line 325, in main
    posetracker = PoseTracker.load(args.poses, Nimg, D, 's2s2' if do_pose_sgd else None, ind)
  File "/programs/x86_64-linux/cryodrgn/0.3.2/cryodrgn_extlib/miniconda3-4.9.2_py37-rpp3/lib/python3.7/site-packages/cryodrgn-0.3.2-py3.7.egg/cryodrgn/pose.py", line 66, in load
    assert rots.shape == (Nimg,3,3), f"Input rotations have shape {rots.shape} but expected ({Nimg},3,3)"
AssertionError: Input rotations have shape (276737, 3, 3) but expected (480,3,3)

ArmandoPach commented 3 years ago

Hi,

Are your input particles from the same consensus refinement that produced your CTF and poses files?

There seem to be some discrepancies between the files.

Armando

Vijaya-cryoem commented 3 years ago

Hi Armando,

Thanks for your email. I have sorted out the error.

I am now working on Step 7) CryoDRGN high resolution training, and I get a new error message:

$ cryodrgn train_vae particles.256.txt --ctf ctf.pkl --poses data/poses.pkl --zdim 8 -n 50 --uninvert-data --enc-dim 1024 --enc-layers 3 --dec-dim 1024 --dec-layers 3 --multigpu --ind tutorial/00_vae128/ind_keep.276737_particles.pkl --amp -o tutorial/01_vae256 > tutorial_01.log

/programs/x86_64-linux/cryodrgn/.manifest/capsules/cryodrgn/0.3.2/cryodrgn: line 2: 132043 Killed "${SB_BASE}"/x86_64-linux//cryodrgn/0.3.2/cryodrgn/bin/cryodrgn "$@"

Could you share some suggestions on this?

Vijaya


ArmandoPach commented 3 years ago

Hi Vijaya,

Is that the whole error message you get?

Armando

Vijaya-cryoem commented 3 years ago

Hi Armando,

Yes. This is the full error message.

Vijaya



zhonge commented 3 years ago

There should be additional error messages, perhaps in a different log file, that say why the process was "Killed".

If I had to guess, it would be that you ran over some memory limit (if you're training on 276k D=256 particles). I would try the new cryodrgn preprocess pipeline, which reduces memory usage by ~3x. My back-of-the-envelope calculation is that you'll need ~215 GB of RAM for this dataset.
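A rough version of that calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how cryoDRGN itself accounts for memory: it assumes 4 bytes per pixel (float32) and a working-memory overhead of about 3x the raw image data. The function names `raw_stack_bytes` and `estimate_ram_gb` are made up for this sketch.

```python
# Back-of-the-envelope RAM estimate for a particle stack held in memory.
# Assumptions (not taken from cryoDRGN): float32 pixels (4 bytes each) and
# a rough 3x overhead on top of the raw image data.


def raw_stack_bytes(n_particles: int, box_size: int, bytes_per_pixel: int = 4) -> int:
    """Size in bytes of n_particles square images with side length box_size."""
    return n_particles * box_size * box_size * bytes_per_pixel


def estimate_ram_gb(n_particles: int, box_size: int, overhead: float = 3.0) -> float:
    """Raw stack size times an assumed overhead factor, in decimal GB."""
    return raw_stack_bytes(n_particles, box_size) * overhead / 1e9


if __name__ == "__main__":
    n, d = 276_737, 256  # particle count and box size from this thread
    print(f"raw stack: {raw_stack_bytes(n, d) / 1e9:.1f} GB")
    print(f"with assumed 3x overhead: {estimate_ram_gb(n, d):.0f} GB")
```

For 276,737 particles at D=256 the raw data alone comes to about 72.5 GB, and the assumed 3x factor puts the total in the same range as the estimate quoted above.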

childek2210 commented 4 months ago

I am having a similar issue as Vijaya-cryoem and would like advice.

Traceback (most recent call last):
  File "/cluster/apps/cryodrgn/2.0.0-beta/bin/cryodrgn", line 8, in <module>
    sys.exit(main())
  File "/cluster/apps/cryodrgn/2.0.0-beta/lib/python3.9/site-packages/cryodrgn/main.py", line 72, in main
    args.func(args)
  File "/cluster/apps/cryodrgn/2.0.0-beta/lib/python3.9/site-packages/cryodrgn/commands/train_vae.py", line 671, in main
    posetracker = PoseTracker.load(
  File "/cluster/apps/cryodrgn/2.0.0-beta/lib/python3.9/site-packages/cryodrgn/pose.py", line 93, in load
    assert rots.shape == (
AssertionError: Input rotations have shape (124822, 3, 3) but expected (270,3,3)

I agree that there is most likely a disagreement between my CTF and poses files, but I am unsure how to correct this. Any advice would be greatly appreciated.

michal-g commented 4 months ago

Just to confirm, since this is an older issue, can you check which version of cryoDRGN you are running with the command cryodrgn --version?

I would then try loading the data in your CTF and pose files within a Python session to try and figure out why they are different:

import pickle
with open("your_ctf_file.pkl", 'rb') as f:
    ctf_data = pickle.load(f)

# inspect the contents and dimensions of the CTF data
ctf_data.shape
ctf_data[0, :]
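The same kind of inspection can be extended to compare the CTF and poses files directly. The sketch below is a hedged illustration, not part of cryoDRGN: the helper names `load_pickle`, `leading_dim`, and `compare_entry_counts` are hypothetical, and it assumes the poses file unpickles either to a single array or to a tuple whose first element holds the rotations.

```python
import pickle


def load_pickle(path):
    """Unpickle a file; only use this on files you created yourself."""
    with open(path, "rb") as f:
        return pickle.load(f)


def leading_dim(obj):
    """Number of entries along the first axis of an array-like object.

    Assumption: a poses file may unpickle to a tuple such as
    (rotations, translations); in that case the first element is measured.
    """
    if isinstance(obj, tuple):
        obj = obj[0]
    shape = getattr(obj, "shape", None)
    return shape[0] if shape is not None else len(obj)


def compare_entry_counts(ctf_path, poses_path):
    """Return (n_ctf, n_poses) so that a mismatch is visible at a glance."""
    return leading_dim(load_pickle(ctf_path)), leading_dim(load_pickle(poses_path))
```

Calling `compare_entry_counts("your_ctf_file.pkl", "your_poses_file.pkl")` (placeholder file names) should return two equal numbers when both files describe the same set of particles.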

However, the most likely source of the problem is in some process upstream of cryoDRGN that was used to create the original CTFs and poses — what did you use for this?

childek2210 commented 4 months ago

This is using cryodrgn/2.0.0-beta. It's through one of the national cryoEM centers so I don't have access to some of the main cryoDRGN directories. We're also undergoing maintenance so I won't be able to access these files for a few days, but I will try your suggested analysis once it's back online.

FWIW I have since determined that this may be due to an accidentally deleted patch motion correction job upstream of the particles, as you hinted at. I have since rerun patch motion and re-extracted my particles, but I continue to encounter the same error when running cryoDRGN train. People have suggested that cryoDRGN may not like these "old" particles and that I should start fresh, which I have. I'll keep you posted and update this when I can.

amadan24 commented 3 months ago

I am getting a similar assertion error. This is the full command and error message:

cryodrgn train_vae /home/earlab/Desktop/jdgf/recent_Ctf/particles.128.mrcs --ctf /home/earlab/Desktop/jdgf/recent_Ctf/ctf.copy.pkl --poses /home/earlab/Desktop/jdgf/recent_Ctf/poses.pkl --zdim 8 -n 50 --enc-dim 256 --enc-layers 3 --dec-dim 256 --dec-layers 3 -o /home/earlab/Desktop/jdgf/recent_Ctf/ > train.log

Traceback (most recent call last):
  File "/programs/x86_64-linux//cryodrgn/1.1.2/cryodrgn/bin/cryodrgn", line 8, in <module>
    sys.exit(main())
  File "/programs/x86_64-linux/cryodrgn/1.1.2/miniconda/lib/python3.9/site-packages/cryodrgn/main.py", line 72, in main
    args.func(args)
  File "/programs/x86_64-linux/cryodrgn/1.1.2/miniconda/lib/python3.9/site-packages/cryodrgn/commands/train_vae.py", line 663, in main
    posetracker = PoseTracker.load(
  File "/programs/x86_64-linux/cryodrgn/1.1.2/miniconda/lib/python3.9/site-packages/cryodrgn/pose.py", line 74, in load
    assert rots.shape == (
AssertionError: Input rotations have shape (96980, 3, 3) but expected (240,3,3)

How do I fix this?

michal-g commented 3 months ago

These issues could also be caused by out-of-memory errors as well as problems with the format of the particle stacks and CTF/poses — for both of these cases (@amadan24 and @childek2210), you can try rerunning your cryodrgn training command using the --lazy flag for more memory-efficient data processing (or preprocess as Ellen mentioned above if you are using cryoDRGN v2).

amadan24 commented 3 months ago

Hey Michal,

I was wondering if my team and I could meet with you on zoom sometime to sort out these issues. Every time we resolve one, there are a bunch more that come up. Would that be okay with you and if so what times would work? Thanks!

michal-g commented 3 months ago

For sure! Let's figure out the meeting time over email — I can be reached at mgrzad@princeton.edu

michal-g commented 3 months ago

After reviewing this issue together, we discovered that this was caused by using the wrong input particle stack which only had 480 entries.
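One way to catch this kind of mix-up before training is to read the image count straight from the particle stack's header. The sketch below relies on the MRC format storing NX, NY and NZ as the first three 32-bit integers of the header, with NZ giving the number of images in a stack. The function name `read_mrc_dims` is hypothetical, and little-endian byte order is an assumption that holds for most files but not all.

```python
import struct


def read_mrc_dims(path, byte_order="<"):
    """Return (nx, ny, nz) from an MRC header; nz is the image count of a stack.

    Assumption: the file is little-endian (byte_order="<"); pass ">" for
    big-endian files.
    """
    with open(path, "rb") as f:
        header = f.read(12)  # first three 32-bit integers: NX, NY, NZ
    if len(header) < 12:
        raise ValueError(f"{path} is too short to be an MRC file")
    return struct.unpack(f"{byte_order}3i", header)
```

If the third value does not match the number of rotations reported in the error message, the stack and the poses file come from different particle sets. This only covers a single .mrcs file; an input that lists several stacks would need the counts summed.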

The number of people running into similar mishaps across our issue threads suggests that the error message cryoDRGN raises in this situation (from cryodrgn/pose.py) isn't quite adequate:

assert rots.shape == (
    Nimg,
    3,
    3,
), f"Input rotations have shape {rots.shape} but expected ({Nimg},3,3)"

Thus I will be updating it to something more like this for future releases:

if rots.shape[0] != Nimg:
    raise ValueError(
        f"Input # of pose rotations {rots.shape[0]} "
        f"does not match number of particle images {Nimg} "
        f"— double-check input files!"
    )
if rots.shape[1:] != (3, 3):
    raise ValueError(
        f"Wrong format for input rotations; "
        f"expected an array of dimensions {Nimg=}x3x3 !"
    )
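To see how a check of this kind would behave on the numbers from the original report, here is a standalone sketch. It is an illustration only: the function name `check_rotation_shape` is hypothetical, it takes a plain shape tuple instead of an array so that it runs without NumPy, and the message wording is adapted, not copied from any release.

```python
def check_rotation_shape(rots_shape, n_img):
    """Raise ValueError with a targeted message unless shape is (n_img, 3, 3)."""
    if rots_shape[0] != n_img:
        raise ValueError(
            f"Input # of pose rotations {rots_shape[0]} "
            f"does not match number of particle images {n_img}; "
            "double-check input files!"
        )
    if tuple(rots_shape[1:]) != (3, 3):
        raise ValueError(
            "Wrong format for input rotations; "
            f"expected an array of dimensions {n_img}x3x3!"
        )


if __name__ == "__main__":
    try:
        # Shapes from the first report in this thread.
        check_rotation_shape((276737, 3, 3), 480)
    except ValueError as err:
        print(err)
```

Splitting the check in two lets the message say whether the particle count or the matrix layout is at fault, which the single assertion could not do.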

Thanks for your patience everyone!