google-deepmind / materials_discovery

Apache License 2.0

The column 'Dimensionality Cheon' in stable_materials_summary.csv is empty #3

Open burubaxair opened 7 months ago

burubaxair commented 7 months ago

The column 'Dimensionality Cheon' in stable_materials_summary.csv is empty

amilmerchant commented 7 months ago

Apologies! Will upload an updated summary shortly.

AntObi commented 7 months ago

Hi @amilmerchant, the 'Is Train' and 'Decomposition Energy Per Atom Relative' columns are also empty.

The band gap column also has missing values: only 288,215 of 384,938 entries are non-null. Among the remaining entries, some are np.nan (missing values) and some are infinite.
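A check like the one described above could be sketched with the standard library alone. This is a minimal illustration, not the script used to produce the counts in this thread: the inline sample data, the column names "Reduced Formula" and "Bandgap", and the helpers `classify` and `column_report` are all assumptions made for the example, so check the real header of stable_materials_summary.csv before reusing it.

```python
import csv
import io
import math
from collections import Counter

# Hypothetical sample standing in for stable_materials_summary.csv.
# "Reduced Formula" and "Bandgap" are assumed column names.
SAMPLE = """\
Reduced Formula,Bandgap,Is Train
Ag2O,1.25,
NaCl,,
LiF,nan,
MgO,inf,
"""


def classify(cell):
    """Label one raw CSV cell as empty, nan, inf, finite, or text."""
    text = cell.strip()
    if text == "" or text == "None":
        return "empty"
    try:
        value = float(text)
    except ValueError:
        return "text"
    if math.isnan(value):
        return "nan"
    if math.isinf(value):
        return "inf"
    return "finite"


def column_report(handle, columns):
    """Count the kinds of cells found in each requested column."""
    report = {name: Counter() for name in columns}
    for row in csv.DictReader(handle):
        for name in columns:
            # A short row yields None for missing fields; treat it as empty.
            report[name][classify(row.get(name) or "")] += 1
    return report


report = column_report(io.StringIO(SAMPLE), ["Bandgap", "Is Train"])
for name, counts in report.items():
    print(name, dict(counts))
```

To run it against the real file, pass an open file handle instead of the in-memory sample, for example `open("stable_materials_summary.csv", newline="")`.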

amilmerchant commented 7 months ago

Working on this! Sorry, there were more issues related to data migrations for the missing columns. I'll add more documentation about the CSV as well.

The columns 'Dimensionality Cheon' and 'Is Train' will be populated shortly and 'Decomposition Energy Per Atom Relative' slightly thereafter.

For data such as the band gap, we have released the calculations that we have. We have not run calculations for all materials, so None likely corresponds to missing calculations. np.nan often arises from parsing VASP outputs, similar to this catch: https://github.com/materialsproject/pymatgen/blob/ebd776900edfc45bd3b9f045e1e04db1e2d2752b/pymatgen/io/vasp/outputs.py#L133. I'll try to find another way to parse these files or rerun the calculations. If there is interest in band gaps for a specific family that we missed, let's chat!
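For readers unfamiliar with how a parser can end up emitting np.nan, the general pattern can be sketched as follows. This is an illustration only, not the pymatgen code behind the link: `parse_float_or_nan` is a hypothetical helper, and it assumes that a number too large for its fixed-width output field is written as a run of asterisks.

```python
import math


def parse_float_or_nan(field):
    """Parse a numeric field, mapping an all-asterisk overflow marker to NaN.

    Hypothetical helper for illustration; not the pymatgen function.
    """
    try:
        return float(field)
    except ValueError:
        stripped = field.strip()
        # Assumption: an overflowed fixed-width field is printed as "*****".
        if stripped and set(stripped) == {"*"}:
            return math.nan
        # Anything else is a genuine parse error, so re-raise it.
        raise


values = [parse_float_or_nan(f) for f in ["1.25", "********", " 0.0 "]]
print(values)
```

Under this pattern a NaN in the summary means the calculation ran but the value could not be read back, which is a different situation from a calculation that was never run.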

amilmerchant commented 7 months ago

I'll reply in this thread as there is progress!

amilmerchant commented 7 months ago

Added both the 'Dimensionality Cheon' and 'Is Train' columns, and added dataset versioning documentation to the corresponding markdown file.

Sxmourai commented 7 months ago

> We have not run calculations for all materials, so None likely corresponds to missing calculations

If you need help with computing power, maybe we could build an API that hands out materials to test and accepts results for the ones that turn out to be stable. I think there are a lot of people out there who would like to help (including me)! The only problem is that we would need to be sure that the submitted materials were really tested and that nobody is sending false information.

amilmerchant commented 7 months ago

Hi, apologies for the delayed response. It seems there is more interest in the bandgap and additional measurements than expected. As these were not part of the results of the original paper, we did not emphasize completeness here.

However, we'll put more effort into completing this column and providing additional details about these materials in a future upgrade. Please stay tuned :)

I'll leave this issue open for now and close when we have more details.