mir-group / allegro

Allegro is an open-source code for building highly scalable and accurate equivariant deep learning interatomic potentials
https://www.nature.com/articles/s41467-023-36329-y
MIT License

RuntimeError: The size of tensor a (18294) must match the size of tensor b (18293) at non-singleton dimension 0 #110

Status: Open · gcassone-cnr opened this issue 1 week ago

gcassone-cnr commented 1 week ago

Dear Developers,

I'm a new Allegro user, and I'm just trying to run the simple input shown below:


# general
root: results/water-tutorial
run_name: water
seed: 42
dataset_seed: 42
append: true
default_dtype: float32

# -- network --
model_builders:

# cutoffs
r_max: 4.5
avg_num_neighbors: auto

# radial basis
BesselBasis_trainable: true
PolynomialCutoff_p: 48

# symmetry
l_max: 2
parity: o3_full

# Allegro layers:
num_layers: 2
env_embed_multiplicity: 8
embed_initial_edge: true

two_body_latent_mlp_latent_dimensions: [32, 64, 128]
two_body_latent_mlp_nonlinearity: silu
two_body_latent_mlp_initialization: uniform

latent_mlp_latent_dimensions: [128]
latent_mlp_nonlinearity: silu
latent_mlp_initialization: uniform
latent_resnet: true

env_embed_mlp_latent_dimensions: []
env_embed_mlp_nonlinearity: null
env_embed_mlp_initialization: uniform

# - end allegro layers -

# Final MLP to go from Allegro latent space to edge energies:
edge_eng_mlp_latent_dimensions: [32]
edge_eng_mlp_nonlinearity: null
edge_eng_mlp_initialization: uniform

include_keys:

# -- data --
dataset: ase
dataset_file_name: /content/cp2k/colab/AIMD_data/conc_wat_pos_frc.extxyz  # path to data set file
ase_args:
  format: extxyz

# A mapping of chemical species to type indexes is necessary if the dataset
# is provided with atomic numbers instead of type indexes.
chemical_symbols:

# logging
wandb: false
wandb_project: allegro-water-tutorial
verbose: info
log_batch_freq: 10

# training
n_train: 1000
n_val: 100
batch_size: 5
max_epochs: 100
learning_rate: 0.002
train_val_split: random
shuffle: true
metrics_key: validation_loss

# use an exponential moving average of the weights
use_ema: true
ema_decay: 0.99
ema_use_num_updates: true

# loss function
loss_coeffs:
  forces: 1.
  total_energy:

# optimizer
optimizer_name: Adam
optimizer_params:
  amsgrad: false
  betas: !!python/tuple

metrics_components:

# lr scheduler, drop lr if no improvement for 50 epochs
lr_scheduler_name: ReduceLROnPlateau
lr_scheduler_patience: 50
lr_scheduler_factor: 0.5

early_stopping_lower_bounds:
  LR: 1.0e-5

early_stopping_patiences:
  validation_loss: 100
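For completeness, this is roughly how the dataset can be sanity-checked before training. This is a minimal sketch, assuming only that ASE is installed and that the extxyz path from the config is reachable; it is not part of the Allegro/NequIP tooling:

from ase.io import read

# Path taken from dataset_file_name in the config above; adjust as needed.
path = "/content/cp2k/colab/AIMD_data/conc_wat_pos_frc.extxyz"
frames = read(path, index=":", format="extxyz")

# n_train + n_val = 1100 frames must be available for the split above.
print("number of frames:", len(frames))
print("atoms per frame:", sorted({len(atoms) for atoms in frames}))

# Check that every frame carries forces (stored either as a per-atom array
# or on the attached single-point calculator, depending on the ASE version).
def has_forces(atoms):
    if "forces" in atoms.arrays:
        return True
    results = getattr(atoms.calc, "results", {}) if atoms.calc is not None else {}
    return "forces" in results

missing = [i for i, atoms in enumerate(frames) if not has_forces(atoms)]
print("frames without forces:", missing)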


However, at the 10th epoch I get the following error:


Traceback (most recent call last):
  File "/home/user/anaconda3/bin/nequip-train", line 8, in <module>
    sys.exit(main())
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/scripts/train.py", line 115, in main
    trainer.train()
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/train/trainer.py", line 784, in train
    self.epoch_step()
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/train/trainer.py", line 919, in epoch_step
    self.batch_step(
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/train/trainer.py", line 814, in batch_step
    out = self.model(data_for_loss)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_graph_model.py", line 112, in forward
    data = self.model(new_data)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_rescale.py", line 144, in forward
    data = self.model(data)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_grad_output.py", line 85, in forward
    data = self.func(data)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_graph_mixin.py", line 366, in forward
    input = module(input)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/anaconda3/lib/python3.12/site-packages/allegro/nn/_allegro.py", line 612, in forward
    new_latents = cutoff_coeffs[active_edges].unsqueeze(-1) * new_latents


RuntimeError: The size of tensor a (18294) must match the size of tensor b (18293) at non-singleton dimension 0
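For context, the message itself is a plain PyTorch broadcasting failure: the elementwise product on line 612 of allegro/nn/_allegro.py receives per-edge tensors whose leading (edge) dimensions differ by one (18294 vs 18293). The following minimal, self-contained sketch reproduces the same class of error; the shapes are hypothetical, taken only from the numbers in the message:

import torch

# Hypothetical shapes mirroring the message: one tensor has one more
# "edge" entry than the other, so broadcasting fails at dimension 0.
cutoff_coeffs = torch.rand(18294)      # one coefficient per edge
new_latents = torch.rand(18293, 8)     # latent features for one fewer edge

try:
    out = cutoff_coeffs.unsqueeze(-1) * new_latents
except RuntimeError as err:
    print(err)
    # -> The size of tensor a (18294) must match the size of tensor b (18293)
    #    at non-singleton dimension 0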

Could you please suggest what's wrong with my installation and how to fix this issue?

Many thanks in advance and best wishes,
Giuseppe Cassone
gcassone-cnr commented 2 days ago

Dear developers, is this forum still active?