txie-93 / cgcnn

Crystal graph convolutional neural networks for predicting material properties.
MIT License
651 stars 309 forks

Question: How to make dataset? #15

Closed motonuko closed 4 years ago

motonuko commented 4 years ago

Hi

I have questions about datasets.

The Materials Project database and API have changed a lot since this repository was started, and I can’t retrieve the whole dataset listed in the “cgcnn/data/material/mp-ids-○○.csv” file.

The CGCNN paper (https://arxiv.org/abs/1710.10324) states:

After removing ill-converged crystals, the full database has 46744 materials covering 87 elements, ...

My questions are as follows.

Thanks,

txie-93 commented 4 years ago

Hi, thanks for your question!

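from pymatgen.ext.matproj import MPRester

# Replace "API_KEY" with your own Materials Project API key.
with MPRester("API_KEY") as m:
    # query() returns a list of dicts, one per matching material.
    results = m.query(criteria={'material_id': 'mp-1234'},
                      properties=['formation_energy_per_atom', 'structure'])
    entry = results[0]
    formation_energy = entry['formation_energy_per_atom']
    structure = entry['structure']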

You can get the "API_KEY" by registering an account at the Materials Project website.
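To turn the single query above into a full dataset, one option is to loop over the IDs and write one CIF file per material plus an id_prop.csv file of (ID, target) rows, which is the layout the cgcnn README describes. The following is only a rough sketch: `build_dataset` and `make_mp_fetch` are hypothetical helper names, not part of cgcnn or pymatgen, and the adapter assumes the legacy `MPRester.query` interface shown above is still reachable.

```python
import csv
from pathlib import Path


def build_dataset(material_ids, fetch, out_dir):
    """Write <id>.cif files and id_prop.csv into out_dir.

    fetch(material_id) must return (cif_text, target), or None when the
    material cannot be retrieved. Returns the list of IDs that were skipped.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    rows, missing = [], []
    for material_id in material_ids:
        record = fetch(material_id)
        if record is None:
            missing.append(material_id)
            continue
        cif_text, target = record
        (out / f"{material_id}.cif").write_text(cif_text)
        rows.append((material_id, target))
    # id_prop.csv: first column is the ID, second column is the target value.
    with open(out / "id_prop.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return missing


def make_mp_fetch(api_key):
    """Hypothetical adapter around the legacy MPRester.query call (assumption)."""
    from pymatgen.ext.matproj import MPRester  # imported lazily; needs pymatgen

    def fetch(material_id):
        with MPRester(api_key) as m:
            results = m.query(
                criteria={"material_id": material_id},
                properties=["formation_energy_per_atom", "structure"],
            )
        if not results:
            return None
        entry = results[0]
        return entry["structure"].to(fmt="cif"), entry["formation_energy_per_atom"]

    return fetch
```

A call such as `build_dataset(ids, make_mp_fetch("API_KEY"), "my_data")` would then fill the directory; the returned list shows which IDs no longer resolve. The cgcnn data directory also needs an atom_init.json file, which this sketch does not create.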

Hope that is helpful!

motonuko commented 4 years ago

Thanks for your explanation! It is very helpful for me.

I found a change in the Materials Project (https://discuss.matsci.org/t/change-in-materials-project-ids/1268): some 'material_id' values have changed to 'task_id'. Taking this change into account, I got a similar dataset.
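In case it helps others, the ID handling can be sketched roughly as below. `resolve_ids` and `make_mp_lookup` are hypothetical names I made up for illustration. The adapter assumes the legacy pymatgen client's `get_materials_id_from_task_id` method is available and returns None for an unknown ID; I have not checked this against the current API.

```python
def resolve_ids(old_ids, lookup):
    """Map each old ID to its current material ID.

    lookup(old_id) must return the current ID, or None when nothing matches.
    Returns (mapping, unresolved): a dict old -> current and a list of old
    IDs that could not be resolved.
    """
    mapping, unresolved = {}, []
    for old_id in old_ids:
        current = lookup(old_id)
        if current is None:
            unresolved.append(old_id)
        else:
            mapping[old_id] = current
    return mapping, unresolved


def make_mp_lookup(api_key):
    """Hypothetical adapter around the legacy MPRester client (assumption)."""
    from pymatgen.ext.matproj import MPRester  # imported lazily; needs pymatgen

    def lookup(old_id):
        with MPRester(api_key) as m:
            # Treats the old material ID as a task ID and asks for the
            # material it now belongs to.
            return m.get_materials_id_from_task_id(old_id)

    return lookup
```

Calling `resolve_ids(ids_from_csv, make_mp_lookup("API_KEY"))` gives a mapping to query with and a list of IDs to inspect by hand.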

Thanks,