Open dqgdqg opened 5 months ago
Hello,
Is this slow rate the result of many samples from the model being rejected by the filtering step (which checks that the sampled string can be parsed into a crystal structure)?
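For context, that kind of rejection filter is usually just a try/except loop over the generated strings. The sketch below is a minimal illustration of the pattern, not the repository's actual code; `parse_structure` is a hypothetical stand-in for whatever parser the pipeline uses (e.g. pymatgen's CIF parser).

```python
# Hedged sketch of a post-sampling validity filter: keep only generated
# strings that parse into a crystal structure. `parse_structure` is a
# hypothetical placeholder for the real parser (e.g. pymatgen's CifParser).
def filter_valid(samples, parse_structure):
    valid = []
    for s in samples:
        try:
            parse_structure(s)  # expected to raise on unparseable strings
        except Exception:
            continue            # sample rejected by the filter
        valid.append(s)
    return valid
```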
Beyond improving the rate of valid strings, batched sampling and parallelizing over multiple GPUs can also be used to accelerate the process. The default batch size in the code is 1 and some changes actually need to be made to support larger batch sizes (namely left padding of the prompt input ids).
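For anyone adapting the code, left padding for a decoder-only model can be sketched as below. This is a minimal illustration of the idea under stated assumptions, not the repository's implementation.

```python
# Minimal sketch of left padding for batched generation with a
# decoder-only model: pads go on the LEFT so every prompt ends at the
# same position and generation continues from real tokens, not padding.
def left_pad(token_id_seqs, pad_id):
    max_len = max(len(s) for s in token_id_seqs)
    input_ids, attention_mask = [], []
    for s in token_id_seqs:
        n_pad = max_len - len(s)
        input_ids.append([pad_id] * n_pad + list(s))
        attention_mask.append([0] * n_pad + [1] * len(s))
    return input_ids, attention_mask
```

With Hugging Face tokenizers the same effect is obtained by setting `tokenizer.padding_side = "left"` before tokenizing the prompt batch with `padding=True`.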
Thanks for your reply! I'm tuning the parameters to make it faster.
Meanwhile, I'd like to know the typical sampling speed in your experiments. How long does it take to generate all ~9000 samples, and on what hardware (GPUs, memory, etc.)?
Thanks!
I am able to train a 7B model on the MP-20 dataset. On a single A100 40GB GPU with batch size = 40, I can generate 10000 structures in ~160 minutes. Of the 10000 generated CIF strings, 9671 can be successfully parsed into crystal structures.
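Putting those numbers next to the ~4 min/sample figure reported elsewhere in this thread (both values taken from the thread itself):

```python
# Back-of-envelope comparison of the two sampling rates in this thread.
unbatched_min_per_sample = 4.0           # batch size 1, as reported
batched_min_per_sample = 160.0 / 10000   # batch size 40 on one A100 40GB
speedup = unbatched_min_per_sample / batched_min_per_sample
print(f"batched: {batched_min_per_sample * 60:.2f} s/structure")  # 0.96 s
print(f"speedup: ~{speedup:.0f}x")                                # ~250x
```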
Thanks for sharing the data points!
Dear authors,
Thank you for your excellent work and released code.
I've already successfully fine-tuned the 7B model with your `llama_finetune.py`. However, when I tried to generate samples using `llama_sample.py`, I found the speed extremely slow: around 4 minutes per sample. We have ~9000 samples in the test set, which means it would take 9000 × 4 / 60 = 600 hours to generate them all.
Is this normal? Or are there tricks to speed up the sampling?
Thanks again!