Open dqgdqg opened 5 months ago
Hello,
Is this slow rate the result of many samples from the model being rejected by the filtering step (which checks that the sampled string can be parsed into a crystal structure)?
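For context, that kind of rejection filter is usually just a try/except loop over the generated strings. The sketch below is a minimal illustration of the pattern, not the repository's actual code; `parse_structure` is a hypothetical stand-in for whatever parser the pipeline uses (e.g. pymatgen's CIF parser).

```python
# Hedged sketch of a post-sampling validity filter: keep only generated
# strings that parse into a crystal structure. `parse_structure` is a
# hypothetical placeholder for the real parser (e.g. pymatgen's CifParser).
def filter_valid(samples, parse_structure):
    valid = []
    for s in samples:
        try:
            parse_structure(s)  # expected to raise on unparseable strings
        except Exception:
            continue            # sample rejected by the filter
        valid.append(s)
    return valid
```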
Beyond improving the rate of valid strings, batched sampling and parallelizing over multiple GPUs can also be used to accelerate the process. The default batch size in the code is 1 and some changes actually need to be made to support larger batch sizes (namely left padding of the prompt input ids).
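For anyone adapting the code, left padding for a decoder-only model can be sketched as below. This is a minimal illustration of the idea under stated assumptions, not the repository's implementation.

```python
# Minimal sketch of left padding for batched generation with a
# decoder-only model: pads go on the LEFT so every prompt ends at the
# same position and generation continues from real tokens, not padding.
def left_pad(token_id_seqs, pad_id):
    max_len = max(len(s) for s in token_id_seqs)
    input_ids, attention_mask = [], []
    for s in token_id_seqs:
        n_pad = max_len - len(s)
        input_ids.append([pad_id] * n_pad + list(s))
        attention_mask.append([0] * n_pad + [1] * len(s))
    return input_ids, attention_mask
```

With Hugging Face tokenizers the same effect is obtained by setting `tokenizer.padding_side = "left"` before tokenizing the prompt batch with `padding=True`.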
Thanks for your reply! I'm tuning the parameters to make it faster.
Meanwhile, I'd like to know the typical sampling speed in your experiments. How long does it take to generate all ~9000 samples, and on what hardware (GPUs, memory, etc.)?
Thanks!
I am able to train a 7B model on the MP-20 dataset. On a single A100 40GB GPU with batch size = 40, I can generate 10000 structures in ~160 minutes. Of the 10000 generated CIF strings, 9671 can be successfully parsed into crystal structures.
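Putting those numbers next to the ~4 min/sample figure reported elsewhere in this thread (both values taken from the thread itself):

```python
# Back-of-envelope comparison of the two sampling rates in this thread.
unbatched_min_per_sample = 4.0           # batch size 1, as reported
batched_min_per_sample = 160.0 / 10000   # batch size 40 on one A100 40GB
speedup = unbatched_min_per_sample / batched_min_per_sample
print(f"batched: {batched_min_per_sample * 60:.2f} s/structure")  # 0.96 s
print(f"speedup: ~{speedup:.0f}x")                                # ~250x
```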
Thanks for sharing the data points!
Dear authors,
Thank you for your excellent work and released code.
I've already successfully fine-tuned the 7B model with your `llama_finetune.py`. However, when I tried to generate samples using `llama_sample.py`, I found the speed extremely slow: around 4 minutes per sample. We have ~9000 samples in the test set, which means it would take 9000 × 4 / 60 = 600 hours to generate them all.
Is this normal? Or are there tricks to speed up the sampling?
Thanks again!