mir-group / nequip

NequIP is a code for building E(3)-equivariant interatomic potentials
https://www.nature.com/articles/s41467-022-29939-5
MIT License

🐛 [BUG] Cannot run nequip-train with provided example #370

Closed Quicken90905 closed 9 months ago

Quicken90905 commented 9 months ago

I tried to run the example for a minimal training with energies from ASEDataset:

Example: Given atomic data stored in "H2.extxyz" that looks like the following:

```
2
Properties=species:S:1:pos:R:3 energy=-10 user_label=2.0 pbc="F F F"
 H       0.00000000       0.00000000       0.00000000
 H       0.00000000       0.00000000       1.02000000
```

The yaml input should be

```
dataset: ase
dataset_file_name: H2.extxyz
ase_args:
  format: extxyz
include_keys:
  - user_label
key_mapping:
  user_label: label0
chemical_symbols:
  - H
```
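
For readers unfamiliar with the extxyz comment line, here is a minimal sketch of how the per-frame keys in the example could be read and renamed. It uses only the Python standard library and is not the parser that NequIP or ASE actually use. The helper names `parse_comment_line` and `apply_key_mapping` are hypothetical, and the renaming semantics of `key_mapping` (key in the file on the left, new name on the right) are an assumption based on the example config.

```python
import shlex


def parse_comment_line(line):
    """Split an extxyz comment line into a {key: raw string value} dict."""
    fields = {}
    # shlex keeps quoted values such as pbc="F F F" together as one token.
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def apply_key_mapping(fields, include_keys, key_mapping):
    """Keep only the requested keys and rename them (assumed semantics)."""
    return {key_mapping.get(k, k): fields[k] for k in include_keys if k in fields}


comment = 'Properties=species:S:1:pos:R:3 energy=-10 user_label=2.0 pbc="F F F"'
fields = parse_comment_line(comment)
extra = apply_key_mapping(fields, ["user_label"], {"user_label": "label0"})
print(fields["energy"], fields["pbc"], extra)
```

Values are left as raw strings here; a real reader would also convert them to numeric types.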

But when I run `nequip-train test.yaml` (with `test.yaml` having the provided contents), I get ``RuntimeError: Failed to build object with prefix `dataset` using builder `NpzDataset` ``.

heyfavour commented 9 months ago

Me too. It works normally on Linux, but when I run it on Windows it raises `RuntimeError: Failed to build object with prefix dataset using builder NpzDataset`.

Linux-cpp-lisp commented 9 months ago

This error could come from many different root causes. Can you please post the rest of the error message for future reference, even if Windows turns out to be the cause?
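
Since a top-level `RuntimeError` like this is only a wrapper, the informative part is usually the exception it was raised from. The following is a minimal sketch of that pattern and not NequIP code: `build_dataset` and `root_cause` are hypothetical names, and the `ValueError` and its message merely stand in for whatever actually fails.

```python
def build_dataset(config):
    """Hypothetical builder that wraps any failure in a generic RuntimeError."""
    try:
        if "r_max" not in config:
            # Stand-in failure; the real root cause may be anything.
            raise ValueError("missing required key: r_max")
        return dict(config)
    except Exception as e:
        raise RuntimeError("Failed to build object with prefix `dataset`") from e


def root_cause(exc):
    """Follow the __cause__ chain down to the original exception."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


try:
    build_dataset({"dataset": "ase"})
except RuntimeError as err:
    cause = root_cause(err)
    print(type(cause).__name__, cause)
```

In a full traceback the same information appears above each "The above exception was the direct cause of the following exception" separator, which is why the complete output is needed to diagnose the problem.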

Quicken90905 commented 9 months ago

I am using Ubuntu 20.04. On Windows 10 I wasn't able to get this far, and after seeing this issue I decided not to continue with Windows. At first, the error was caused by the missing `r_max` variable in the yaml file, but after adding it as `r_max: 4` the error still persists:

nequip-train test.yaml
Torch device: cpu
Processing dataset...
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/dataset.py", line 790, in _ase_dataset_reader
    if global_index in include_frames
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/AtomicData.py", line 449, in from_ase
    **add_fields,
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/AtomicData.py", line 330, in from_points
    return cls(edge_index=edge_index, pos=torch.as_tensor(pos), **kwargs)
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/AtomicData.py", line 225, in __init__
    _process_dict(kwargs)
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/AtomicData.py", line 155, in _process_dict
    kwargs[k] = v.unsqueeze(-1)
AttributeError: 'numpy.int64' object has no attribute 'unsqueeze'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/utils/auto_init.py", line 232, in instantiate
    instance = builder(**positional_args, **final_optional_args)
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/dataset.py", line 887, in __init__
    type_mapper=type_mapper,
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/dataset.py", line 166, in __init__
    super().__init__(root=root, type_mapper=type_mapper)
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/dataset.py", line 50, in __init__
    super().__init__(root=root, transform=type_mapper)
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/utils/torch_geometric/dataset.py", line 91, in __init__
    self._process()
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/utils/torch_geometric/dataset.py", line 176, in _process
    self.process()
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/dataset.py", line 218, in process
    data = self.get_data()
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/dataset.py", line 964, in get_data
    datas = p.map(reader, list(range(n_proc)))
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
AttributeError: 'numpy.int64' object has no attribute 'unsqueeze'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lukas/anaconda3/envs/soc/bin/nequip-train", line 10, in <module>
    sys.exit(main())
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/scripts/train.py", line 72, in main
    trainer = fresh_start(config)
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/scripts/train.py", line 148, in fresh_start
    dataset = dataset_from_config(config, prefix="dataset")
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/data/_build.py", line 82, in dataset_from_config
    optional_args=config,
  File "/home/lukas/anaconda3/envs/soc/lib/python3.7/site-packages/nequip/utils/auto_init.py", line 236, in instantiate
    ) from e
RuntimeError: Failed to build object with prefix `dataset` using builder `ASEDataset`

heyfavour commented 9 months ago

My error is caused by the handle of a temporary file not being released. Your code renames the temporary file, but this behaves differently on Windows, so it raises an error there, while it works normally on Linux.

To reproduce the error: on Windows 11, install Miniconda, install Python 3.8, run `pip install nequip`, and then run your demo.
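
The failure mode described here, renaming a temporary file whose handle is still open, can be sketched as follows. This is an illustration under that assumption and not the NequIP code path; `write_then_rename` is a hypothetical helper. On Windows, renaming a file that is still open typically raises `PermissionError`, whereas Linux allows it, so closing the handle before the rename keeps the behavior the same on both systems.

```python
import os
import tempfile


def write_then_rename(final_path, text):
    """Write to a temp file, close it, then move it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # delete=False so the file survives the close; we move it ourselves.
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        tmp.write(text)
        tmp_path = tmp.name
    # The handle is closed at this point, so the rename is safe on Windows too.
    os.replace(tmp_path, final_path)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "processed.txt")
    write_then_rename(target, "ok")
    with open(target) as fh:
        result = fh.read()
print(result)
```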