jlaw9 / crystal-gnn

Work in progress on graph neural networks for crystal energy prediction

More informative decoration labels #2

Open jlaw9 opened 2 years ago

jlaw9 commented 2 years ago

@prashungorai, @pstjohn, while we're on the subject, I was thinking about a more informative decoration label to remove this potential confusion. Rather than a decoration index, the label could list the elemental substitutions directly. For example, in Zn4Ti1N4_sg62_icsd_095649_1, Ti replaces Ge, N replaces S, and Zn replaces Li, so we could use the label Zn4Ti1N4_sg62_icsd_095649_N-S_Ti-Ge_Zn-Li (with the substitution string sorted by battery element).
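To make the proposed format concrete, here is a minimal sketch of how such a label could be built. The function name `substitution_label` and the dict layout (battery element mapped to the prototype element it replaces) are assumptions for illustration, not existing code in this repo.

```python
def substitution_label(prefix, substitutions):
    """Build a decoration label from a prototype prefix and its substitutions.

    prefix: composition/space group/ICSD part, e.g. 'Zn4Ti1N4_sg62_icsd_095649'
    substitutions: dict mapping the new (battery) element to the prototype
        element it replaces (hypothetical layout, assumed for this sketch)
    """
    # Sort by the battery element so the same decoration always gives the same label
    pairs = [f"{new}-{old}" for new, old in sorted(substitutions.items())]
    return "_".join([prefix] + pairs)


label = substitution_label(
    "Zn4Ti1N4_sg62_icsd_095649",
    {"Ti": "Ge", "N": "S", "Zn": "Li"},
)
print(label)  # Zn4Ti1N4_sg62_icsd_095649_N-S_Ti-Ge_Zn-Li
```

Sorting on the battery element keeps the label independent of the order in which the substitutions were recorded.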

Once we agree on a nomenclature, I can go back and create the mappings for the battery dataset and update the MCTS action space to use the new labels.

jlaw9 commented 2 years ago

We will now match prototypes across composition types. To distinguish how a prototype was decorated, an informative label would need to specify which element was placed at which site in the structure. To ensure a unique ordering of the sites, we would sort them by their X, then Y, then Z coordinates. Since our prototype structures have up to 50 atoms, the maximum label size would be ~100 characters for the site indices (50*2), up to 100 for the elements (50*2), and up to 49 for delimiters (if we use any), for a total of roughly 200 to 250 characters.
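A rough sketch of the site-based label, assuming each site is given as an `(element, (x, y, z))` tuple. The function name `site_label`, the rounding tolerance, and the `_` delimiter are all placeholders rather than anything we have settled on.

```python
def site_label(sites, ndigits=4, delimiter="_"):
    """Return a label listing which element sits at which site.

    sites: iterable of (element, (x, y, z)) tuples (assumed input format)
    ndigits: coordinates are rounded before sorting so that tiny numerical
        noise does not change the site order (the tolerance is an assumption)
    """
    # Sort by X, then Y, then Z to get a reproducible site ordering
    ordered = sorted(sites, key=lambda s: tuple(round(c, ndigits) for c in s[1]))
    return delimiter.join(f"{i}{element}" for i, (element, _) in enumerate(ordered))


sites = [
    ("N", (0.5, 0.25, 0.1)),
    ("Zn", (0.0, 0.5, 0.5)),
    ("Ti", (0.0, 0.25, 0.75)),
]
label = site_label(sites)
print(label)  # 0Ti_1Zn_2N

# Upper bound on label length for a 50-atom prototype:
# two characters per site index, two per element symbol, 49 delimiters
max_length = 50 * 2 + 50 * 2 + 49
print(max_length)  # 249
```

Coordinates that sit very close to a rounding boundary could still sort differently between equivalent structures, so the tolerance would need some care.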

@pstjohn Should we just use the hash of that string for the label?
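If we go the hash route, something like the following could work. This is only a sketch: the choice of SHA-256, the 12-character truncation, and the name `hashed_label` are assumptions, and the example site string is made up.

```python
import hashlib


def hashed_label(site_string, length=12):
    """Return a short, deterministic label derived from the full site string.

    length: number of hex characters to keep (an assumed value; shorter
        labels are easier to read but raise the chance of a collision)
    """
    digest = hashlib.sha256(site_string.encode("utf-8")).hexdigest()
    return digest[:length]


# Made-up site string, for illustration only
full_label = "0Ti_1Zn_2N"
short_label = hashed_label(full_label)
print(short_label, len(short_label))
```

The hash is not reversible, so we would still need to store the mapping from each hashed label back to the full site string somewhere.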