Failure during running - GitHub Issues

campbellsam77 commented 7 months ago

Hello,

I am fairly new to coding (I've been coding for about 6-9 months, so pardon any incorrect grammar or terms I may use). I am trying to use your Distance-AF. I tried generating the AlphaFold embedding data using local ColabFold, but I ran into an issue: your embedding data (the .npz file) is an array of shape [, 384] and mine is [, 256]. This is the error I am encountering:

Checkpoints (model and optimizer) loaded from ./model_dir
----------------- Starting Training ---------------
  Num examples = 1
  Num Epochs = 10000
  Batch Size = 1
Traceback (most recent call last):
  File "/home/sc2550/Distance-AF/main.py", line 8, in <module>
    train(args)
  File "/home/sc2550/Distance-AF/Train/train.py", line 105, in train
    translation, outputs, pred_frames = ckpt(run_ckpt,model,embedding, single_repr_batch, aatype_batch, batch_gt_frames,dummy)
  File "/home/sc2550/anaconda3/envs/dist-af/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 211, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/sc2550/anaconda3/envs/dist-af/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 90, in forward
    outputs = run_function(*args)
  File "/home/sc2550/Distance-AF/Train/train.py", line 104, in run_ckpt
    return model(embedding, single_repr_batch, aatype_batch, batch_gt_frames)
  File "/home/sc2550/anaconda3/envs/dist-af/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sc2550/Distance-AF/Model/Dist_AF.py", line 18, in forward
    output_bb, translation, outputs = self.structure_module(single_repr, embedding, f=aatype, mask=batch_gt_frames['seq_mask'])
  File "/home/sc2550/anaconda3/envs/dist-af/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sc2550/Distance-AF/Model/ipa_openfold.py", line 620, in forward
    s = self.layer_norm_s(s)
  File "/home/sc2550/anaconda3/envs/dist-af/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sc2550/anaconda3/envs/dist-af/lib/python3.9/site-packages/torch/nn/modules/normalization.py", line 173, in forward
    return F.layer_norm(
  File "/home/sc2550/anaconda3/envs/dist-af/lib/python3.9/site-packages/torch/nn/functional.py", line 2346, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[384], expected input with shape [*, 384], but got input of size[1, 33, 256]

I then tried running your Google Colab script instead and encountered this error:

Solving environment: ...working... warning  libmamba Added empty dependency for problem type SOLVER_RULE_UPDATE
failed

LibMambaUnsatisfiableError: Encountered problems while solving:
  - package conda-23.5.2-py310h06a4308_0 requires python >=3.10,<3.11.0a0, but none of the providers can be installed

Could not solve for environment specs
The following packages are incompatible
├─ conda 23.5.2  is installable with the potential options
│  ├─ conda 23.5.2 would require
│  │  └─ python >=3.10,<3.11.0a0 , which can be installed;
│  ├─ conda 23.5.2 would require
│  │  └─ python >=3.11,<3.12.0a0 , which can be installed;
│  ├─ conda 23.5.2 would require
│  │  └─ python >=3.8,<3.9.0a0 , which can be installed;
│  └─ conda 23.5.2 would require
│     └─ python >=3.9,<3.10.0a0 , which can be installed;
└─ pin-1 is not installable because it requires
   └─ python 3.12.* , which conflicts with any installable versions previously reported.

Pins seem to be involved in the conflict. Currently pinned specs:
 - python 3.12.* (labeled as 'pin-1')

I have tried changing the Python version to each of the ones listed in the error, with no luck. Could you please help?
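For readers following the solver output above: every listed build of conda 23.5.2 needs a Python between 3.8 and 3.11, while the environment pins python 3.12.*, so no build can satisfy both. A minimal sketch of that reasoning in Python follows. The helper names are hypothetical, and the version comparison is a simplification that drops the `.0a0` pre-release suffix from the upper bounds shown in the log; it is not how conda or libmamba actually resolve specs.

```python
def parse(version):
    """Turn '3.10.4' into (3, 10, 4) for tuple comparison (simplified)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """True if lower <= version < upper, mirroring specs like '>=3.10,<3.11'."""
    return parse(lower) <= parse(version) < parse(upper)


# Python ranges required by the conda 23.5.2 builds listed in the error,
# with the '.0a0' suffix on the upper bounds dropped (assumption).
CONDA_PYTHON_RANGES = [
    ("3.8", "3.9"),
    ("3.9", "3.10"),
    ("3.10", "3.11"),
    ("3.11", "3.12"),
]


def compatible_ranges(python_version, ranges=CONDA_PYTHON_RANGES):
    """Return the build ranges that a given Python version satisfies."""
    return [r for r in ranges if in_range(python_version, *r)]


if __name__ == "__main__":
    for candidate in ("3.9.18", "3.12.0"):
        matches = compatible_ranges(candidate)
        print(candidate, matches if matches else "no listed conda 23.5.2 build fits")
```

Under these assumptions, any 3.12.x version falls outside all four ranges, which matches the "pin-1 is not installable" line in the log.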

Zhang038 commented 7 months ago

Hello, thanks for your interest in our work. I think the issue is that you used local ColabFold, which by default saves only the MSA representation. That is why your hidden dimension is 256 rather than the 384 we need, which is the dimension of the single representation. Could you try our Google Colab script to get the single representation? It should run and output an embedding with dimension 384.
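To check which kind of embedding a given .npz file holds before starting training, the array shapes can be inspected directly. The sketch below reads only the .npy headers inside the archive, using the Python standard library, so it needs neither NumPy nor a GPU. The function name and the file name in the usage lines are hypothetical, and the array names stored in a Distance-AF embedding file are not stated in this thread, so the sketch simply reports every array it finds.

```python
import ast
import struct
import zipfile


def npz_shapes(path):
    """Return {array_name: shape} for an .npz file by reading .npy headers only."""
    shapes = {}
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            if not member.endswith(".npy"):
                continue
            with archive.open(member) as fh:
                if fh.read(6) != b"\x93NUMPY":
                    raise ValueError(f"{member} is not a valid .npy entry")
                major, _minor = fh.read(2)
                # Format 1.x stores the header length in 2 bytes, later formats in 4.
                if major == 1:
                    (header_len,) = struct.unpack("<H", fh.read(2))
                else:
                    (header_len,) = struct.unpack("<I", fh.read(4))
                # The header is a Python dict literal with 'descr', 'fortran_order', 'shape'.
                header = ast.literal_eval(fh.read(header_len).decode("latin1").strip())
            shapes[member[: -len(".npy")]] = tuple(header["shape"])
    return shapes


def last_dims(path):
    """Return {array_name: size of the last axis} (None for 0-d arrays)."""
    return {name: (shape[-1] if shape else None)
            for name, shape in npz_shapes(path).items()}


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python check_npz.py my_target.npz
    for name, dim in last_dims(sys.argv[1]).items():
        print(name, dim)
```

An array whose last dimension is 256 would trigger the `normalized_shape=[384]` error from the traceback above; the single representation should end in 384.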

campbellsam77 commented 7 months ago

Yes, I tried running that, and both Google Colab scripts failed to run as well. Please refer to the second half of the post above.

Thank you!

Zhang038 commented 7 months ago

Sorry to hear that. Could you provide more details about the errors you got while running on Google Colab? If there are bugs in the script, I should fix them.

campbellsam77 commented 7 months ago

Yes, I have provided the error I received from FullMSA in the second half of the post above. I can also provide the error I received for MMseq, but it was similar in that the environment failed to load properly.

Zhang038 commented 7 months ago

Hi, sorry again for the issue. I found that this is likely caused by a conda version update. I have updated the conda version, and the first step now runs without error, but step 4 then fails with an error about the haiku package. So the conda update introduced some dependency issues. I went through the issues in the DeepMind repo, and the same problem occurs in the official AF2 Colab, reported in issue 925. I can't fix the bug on Google Colab immediately, as I have to wait for the DeepMind team to resolve it first. On the other hand, you are correct that running a local AF2 repo is feasible, but it needs some modification. If you are interested, please drop me an email at zhang038@purdue.edu and we can discuss how to modify it to output single embeddings.

Zhang038 commented 7 months ago

Update at 11:56 am on Apr. 11: By the way, I tried to resolve the issue in the MMseq Jupyter notebook. In my view, the error is probably caused by incompatibilities among related packages such as jax. I used ColabFold's env file as a reference and updated the requirements file. In my test, the MMseq version of the Google Colab notebook now runs and generates embeddings. Please give it a try if MMseq is acceptable for your target, and if further problems occur, feel free to let me know.

campbellsam77 commented 7 months ago

I will try the MMseq! Thank you!

kiharalab / Distance-AF

Failure during running #3