IntelLabs / matsciml

Open MatSci ML Toolkit is a framework for prototyping and scaling out deep learning models for materials discovery. It supports widely used materials science datasets and is built on top of PyTorch Lightning, the Deep Graph Library, and PyTorch Geometric.
MIT License

add missing data in format atoms #276

lory-w closed 3 months ago

lory-w commented 3 months ago

Related to #275. Part of the problem is that ptr and batch are missing when ASE atoms are converted to a dictionary.
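
For readers unfamiliar with the two keys: in the PyTorch Geometric batching convention, batch assigns every node (atom) to the index of the graph it belongs to, and ptr holds the cumulative node offsets at which each graph starts and ends. The sketch below illustrates that convention with plain Python lists. The helper name batch_and_ptr is hypothetical and is not part of matsciml or PyTorch Geometric, and the real pipeline presumably stores these values as tensors.

```python
from itertools import accumulate


def batch_and_ptr(num_atoms_per_graph):
    """Hypothetical helper: build PyG-style `batch` and `ptr` index lists."""
    # one graph index per atom, repeated for every atom in that graph
    batch = [g for g, n in enumerate(num_atoms_per_graph) for _ in range(n)]
    # cumulative atom counts, starting at zero
    ptr = [0, *accumulate(num_atoms_per_graph)]
    return batch, ptr


# a single structure with three atoms, as an ASE calculator would see it
single_batch, single_ptr = batch_and_ptr([3])
# two structures with two and three atoms, as a training batch would see it
multi_batch, multi_ptr = batch_and_ptr([2, 3])
```

Even a lone structure needs both keys, since a model that indexes by graph would otherwise fail when it looks them up.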

melo-gonzo commented 3 months ago

A note for the future, in case any other errors pop up around this: we can also use concatenate_keys from matsciml.datasets.utils to add these attributes in a more robust way.

laserkelvin commented 3 months ago

@melo-gonzo can you elaborate?

Just checking to make sure I understand what you mean

melo-gonzo commented 3 months ago

I patched this in my mp-tests branch by using that function as a final step before passing the data into .predict in the calculate function. Just wanted to note it in case we see any issues again.

    def calculate(
        self,
        atoms=None,
        properties: list[Literal["energy", "forces"]] = ["energy", "forces"],
        system_changes=...,
    ) -> None:
        # retrieve atoms even if not passed
        Calculator.calculate(self, atoms)
        # get into format ready for matsciml model
        data_dict = self._format_pipeline(atoms)
        # collate the single sample so the batch-level keys (ptr, batch) are added
        data_dict = concatenate_keys([data_dict])
        # run the data structure through the model
        output = self.task_module.predict(data_dict)
        ...
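
The pattern in the excerpt, collating a single sample right before prediction, can be paired with an explicit check so that a missing key fails with a clear message instead of deep inside the model. The following is a minimal sketch under stated assumptions: collate_single_sample is a toy stand-in and not matsciml's concatenate_keys, check_batch_keys is a hypothetical guard, and the atomic_numbers key is assumed here only for illustration.

```python
def collate_single_sample(sample: dict) -> dict:
    """Toy stand-in (NOT matsciml's concatenate_keys): add batch keys to one sample."""
    num_atoms = len(sample["atomic_numbers"])  # assumed key name
    batched = dict(sample)  # shallow copy so the input dict is left untouched
    # only fill in the keys if an earlier pipeline step has not set them already
    batched.setdefault("batch", [0] * num_atoms)
    batched.setdefault("ptr", [0, num_atoms])
    return batched


def check_batch_keys(data: dict, required=("ptr", "batch")) -> None:
    """Hypothetical guard: raise early if a batch-level key is absent."""
    missing = [key for key in required if key not in data]
    if missing:
        raise KeyError(f"data is missing batch-level keys: {missing}")


# a water molecule as a bare sample, before any collation
raw = {"atomic_numbers": [8, 1, 1]}
batched = collate_single_sample(raw)
check_batch_keys(batched)  # passes silently once the keys are present
```

Calling check_batch_keys on the uncollated raw dictionary raises a KeyError that names the missing keys, which would make a regression of this issue easier to spot.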