❓ [QUESTION] About the data class AtomicData

mir-group / nequip

NequIP is a code for building E(3)-equivariant interatomic potentials

https://www.nature.com/articles/s41467-022-29939-5

MIT License

❓ [QUESTION] About the data class AtomicData #383

Closed by floatingCatty 7 months ago

floatingCatty commented 8 months ago

Hello!

I have recently been reading the NequIP code and found something confusing about how the data class AtomicData is used.

I want to use a batch to process multiple structures at a time, and then split it back into subgraphs after the model computation. However, the model works with AtomicDataDict rather than the original Data object. If I take the batched dict output by the model and use Batch.from_dict to convert it into a Batch, I cannot use the get_example() method to reconstruct the subgraphs.

So I am curious why the model needs a dict type rather than an AtomicData or Batch object.
And, if possible, is there any way to split a dict of batched data back into its per-structure subgraph data?

Thanks a lot for your time!

Linux-cpp-lisp commented 8 months ago

Hi @floatingCatty,

Thanks for your interest in our code!

Good question: we use the dict type in the model for compatibility with TorchScript and compilation for deployment. I think the most straightforward way to achieve what you want is to keep the original Batch object that you pass to the model as input, copy the fields you need from the model's output dict back into that Batch object, and then call get_example on it. (This assumes that you are only predicting new per-node, per-edge, or per-graph quantities and not changing the graph structure in the model. If the model does change the graph structure, that is a very different scenario.)
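
For per-edge outputs, the same idea can also be sketched without going through get_example at all, by splitting the batched tensor by hand. The snippet below is only an illustration in plain PyTorch, not NequIP's own API: the helper split_edge_field is hypothetical, and the key names "batch", "edge_index", and "edge_features" are assumptions about what the output dict contains. It relies on the graph structure being unchanged by the model, so that every edge belongs to the structure of its first atom.

```python
from typing import Dict, List

import torch


def split_edge_field(
    out: Dict[str, torch.Tensor],
    field: str,
    num_graphs: int,
) -> List[torch.Tensor]:
    """Split a batched per-edge tensor into one tensor per structure.

    Hypothetical helper; the dict keys used here are assumed, not
    taken from the NequIP source.
    """
    edge_index = out["edge_index"]  # shape [2, n_edges], batched atom indices
    batch = out["batch"]            # shape [n_atoms], structure id of each atom
    # An edge belongs to the structure that contains its first atom.
    edge_graph = batch[edge_index[0]]
    return [out[field][edge_graph == g] for g in range(num_graphs)]


# Toy batch: structure 0 holds atoms 0-1, structure 1 holds atoms 2-4.
out = {
    "batch": torch.tensor([0, 0, 1, 1, 1]),
    "edge_index": torch.tensor([[0, 1, 2, 3, 3, 4],
                                [1, 0, 3, 2, 4, 3]]),
    # Stand-in for a model prediction: one 2-component feature per edge.
    "edge_features": torch.arange(12, dtype=torch.float32).reshape(6, 2),
}

per_structure = split_edge_field(out, "edge_features", num_graphs=2)
for g, feats in enumerate(per_structure):
    print(f"structure {g}: {feats.shape[0]} edges")
```

Note that the edge indices in each piece would still be the batched ones; if per-structure atom numbering is needed, the offset of each structure's first atom has to be subtracted separately.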

floatingCatty commented 7 months ago

Thanks!

I will try to implement what you described. I only want to predict some edge features, so your suggestion applies to my case.

Linux-cpp-lisp commented 7 months ago

Sounds good, @floatingCatty! For that, it may also be easiest to look at the --output and --output-fields options of nequip-evaluate.