Open n0w0f opened 6 months ago
so you would like to have one vector per atom in a structure?
I would like to get a vector for an atom on its own, not in the context of the atom being in any particular structure, e.g. (Na -> model -> vector),
so that I can see whether all the alkali elements are similar for models trained with different representations.
Can I use the learned token embedding? Or do I even need to pass it through the model if it is not in the context of a structure?
Ah, for this, people have used the learned embeddings of the individual tokens. Some existing techniques are collected here: https://github.com/kjappelbaum/element-coder
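As a rough illustration of the token-embedding idea, here is a minimal sketch that looks up one row per element in an embedding table and compares the rows by cosine similarity. Everything named here is an assumption for illustration: `vocab`, `embedding_matrix`, `element_vector`, and `similarity_table` are hypothetical, and the matrix is filled with random numbers. With a real model you would copy the rows from its learned input-embedding matrix (for Hugging Face models, typically `model.get_input_embeddings().weight`) at the token ids your tokenizer assigns to each element.

```python
import numpy as np

# Hypothetical toy vocabulary and embedding table. In practice, each row would
# be the learned input embedding of the token that represents the element.
vocab = {"Li": 0, "Na": 1, "K": 2, "Cl": 3, "Br": 4}
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 8))  # random stand-in values


def element_vector(symbol):
    """Return the standalone embedding row for one element symbol."""
    return embedding_matrix[vocab[symbol]]


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def similarity_table(symbols):
    """Pairwise cosine similarities for a list of element symbols."""
    vecs = np.stack([element_vector(s) for s in symbols])
    unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    return unit @ unit.T


alkali = ["Li", "Na", "K"]
table = similarity_table(alkali)
```

Running the same comparison on the embedding tables of models trained on different representations would show whether, say, the alkali metals end up close to each other in each of them. No forward pass is involved here, so the vectors carry no structure context.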
@n0w0f did you ever give this a look? Do you still plan to look into it?
I have not yet, but I think there could be a lot of hidden insights there, and I would love to follow up.
@kjappelbaum, in order to check the similarity between atoms, or to do an analogy analysis of the kind
King - Queen = Man - Woman
I would like to embed individual atoms with models trained on different representations. This is a follow-up to see whether composition or atoms mean anything for smaller models.
For slice and composition, maybe I can keep the atom as the first token and pad all the other tokens. But for crystal-llm or cif_rep, atoms usually come in the later part of the representation. Would keeping the atom at the beginning work for these representations?
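For the analogy-style check, one possible sketch is to compare the offset between one pair of element vectors with the offset between another pair. The vectors below are made up by hand purely to show the arithmetic, and `analogy_offset_similarity` is a hypothetical helper; real vectors would come from whichever model and representation is being probed.

```python
import numpy as np


def analogy_offset_similarity(emb, a, b, c, d):
    """Cosine similarity between the offsets (a - b) and (c - d).

    A value near 1 means the relation a:b resembles the relation c:d,
    in the sense of King - Queen = Man - Woman.
    """
    u = emb[a] - emb[b]
    v = emb[c] - emb[d]
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Made-up toy vectors (not from any trained model), chosen so that the
# Na -> Cl offset matches the K -> Br offset exactly.
emb = {
    "Na": np.array([1.0, 0.0, 1.0]),
    "K": np.array([1.0, 0.0, 2.0]),
    "Cl": np.array([0.0, 1.0, 1.0]),
    "Br": np.array([0.0, 1.0, 2.0]),
}

score = analogy_offset_similarity(emb, "Na", "Cl", "K", "Br")
```

This only covers the arithmetic once a vector per atom is available. It does not settle whether putting the atom first and padding the rest is a sensible input for crystal-llm or cif_rep.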