facebookresearch / crystal-text-llm

Large language models to generate stable crystals.

Missing data #9

Open Leezekun opened 5 months ago

Leezekun commented 5 months ago

Hi,

Thank you for sharing the code and for the great work.

However, when I ran the code, I found that line 117 of e_above_hull.py references a file that is not included in the repository: "/private/home/ngruver/ocp-modeling-dev/llm/2023-07-13-mp-computed-structure-entries.json.gz". Could you please provide this data file or instructions on how to obtain it?

Additionally, could you provide detailed instructions on how to reproduce your experiments for conditioned generation and infilling? Currently, there seem to be instructions only for the unconditional generation experiment.

Thank you in advance for your help.

ngruver commented 3 months ago

Hello,

Apologies for the delay and for the missing file. As I no longer have access to the Meta file system, I can't locate the exact file, but here are very similar files that should facilitate a reasonable comparison: https://figshare.com/articles/dataset/Matbench_Discovery_v1_0_0/22715158. These files correspond to convex hulls for nearby dates of the Materials Project database.
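For anyone substituting one of these files, here is a minimal sketch for opening a `.json.gz` file and checking its top-level layout before pointing e_above_hull.py at it. The helper names `load_gzipped_json` and `describe` are hypothetical, and the sketch makes no assumption about how the figshare files are structured internally, so inspect the result before relying on it.

```python
import gzip
import json


def load_gzipped_json(path):
    """Read a gzip-compressed JSON file and return the decoded Python object."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def describe(obj, max_keys=5):
    """Summarise the top-level shape of a decoded JSON object (hypothetical helper)."""
    if isinstance(obj, dict):
        return f"dict with {len(obj)} keys, e.g. {list(obj)[:max_keys]}"
    if isinstance(obj, list):
        return f"list with {len(obj)} items"
    return type(obj).__name__
```

Calling `describe(load_gzipped_json(path))` on the downloaded file should show whether its layout matches what the script expects, which may differ from the original 2023-07-13 file.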

To reproduce the conditioned generation and infilling results, you can use the files in https://github.com/facebookresearch/crystal-text-llm/tree/main/data/with_tags in tandem with the following two functions: (1) https://github.com/facebookresearch/crystal-text-llm/blob/main/llama_sample.py#L177 (2) https://github.com/facebookresearch/crystal-text-llm/blob/main/llama_sample.py#L245. The hyperparameters used for sampling should be available in the paper. If a value is not listed there, temperature in {0.7, 1.0} and nucleus size (top-p) in {0.7, 1.0} are reasonable values to try.
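To make the two sampling knobs concrete, here is a small pure-Python sketch of how temperature and nucleus size act on a next-token distribution, together with the four suggested combinations. It is an illustration only, not the code in llama_sample.py; the names `softmax`, `nucleus_filter`, and `SAMPLING_GRID` are made up for this example.

```python
import itertools
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


# The four (temperature, top_p) settings suggested above.
SAMPLING_GRID = list(itertools.product((0.7, 1.0), (0.7, 1.0)))

for temperature, top_p in SAMPLING_GRID:
    probs = nucleus_filter(softmax([2.0, 1.0, 0.1], temperature), top_p)
    print(temperature, top_p, [round(p, 3) for p in probs])
```

With a nucleus size of 1.0 no tokens are dropped, so that setting reduces to plain temperature sampling.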

If you are particularly interested in one application (for example text-conditional generation or infilling in isolation), you can also turn off the stochastic prompt sampling and just train a model specifically for a single task, which might lead to faster convergence.

Nate

dqgdqg commented 2 months ago

Were you able to run the e_above_hull.py script successfully?

I encountered several issues when attempting to run it. I have detailed these problems in a new issue here: https://github.com/facebookresearch/crystal-text-llm/issues/12#issue-2407536949.

I would greatly appreciate any hints or guidance you can provide.

Thank you!