Closed yinkaaiwu closed 1 year ago
I made some modifications to your code so that it can use the GPU as the device. Below is the part I changed in 'train_agent.py'; this is the only place I made changes.
def move_data_to_device(data, device):
    if data is None:
        return None
    for key, value in data.items():
        if isinstance(value, torch.Tensor):
            data[key] = value.to(device)
    return data

class Agent(object):
    def __init__(self, train_data, valid_data, scale_const, model_path, test_data=None, layer_nodes=[10, 10],
                 activation=['tanh', 'tanh'], lr=1, max_iter=20, history_size=100, device=torch.device('cuda:0')):
        """
        scale_const: energy scaling factor
        layer_nodes: list of int, # of nodes in the each layer
        activation: str, "tanh", "Sigmoid" or "relu"
        lr, max_iter and history_size: float, int, int, parameters for LBFGS optimization method in pytorch
        device: torch.device, cpu or cuda
        """
        n_element = train_data['b_e_mask'].size(2)
        n_fp = train_data['b_fp'].size(2)
        self.train_data = move_data_to_device(train_data, device)
        self.valid_data = move_data_to_device(valid_data, device)
        self.test_data = test_data
        self.scale_const = scale_const
        self.model = BPNN(n_fp, layer_nodes, activation, n_element).to(device)
        self.optimizer = torch.optim.LBFGS(self.model.parameters(), lr=lr, max_iter=max_iter,
                                           history_size=history_size, line_search_fn='strong_wolfe')
        self.model_path = model_path
I am not sure why it would take longer with CUDA. We have moved on to using OCP (https://github.com/Open-Catalyst-Project/ocp) models instead of this. See Musielewicz, J., Wang, X., Tian, T., & Ulissi, Z. (2022). Finetuna: fine-tuning accelerated molecular simulations. Machine Learning: Science and Technology, 3(3), 03–01. http://dx.doi.org/10.1088/2632-2153/ac8fe0 for the approach we recommend instead.
Thank you for your reply. I think using SGD or Adam is better than LBFGS on CUDA. I don't know why, but my tests show this.
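One way to check whether the slowdown comes from per-call GPU overhead rather than from the model itself is to time the same LBFGS settings on both devices with a tiny stand-in network. The sketch below is not the repository's BPNN: the random data, the layer sizes, and the function name `time_lbfgs` are assumptions made for illustration, and only the LBFGS arguments are taken from the snippet above. It calls `torch.cuda.synchronize` before reading the clock, because CUDA kernels run asynchronously and unsynchronized timings can be misleading.

```python
import time
import torch


def time_lbfgs(device, n_samples=256, n_features=32, hidden=10, n_steps=3):
    """Return wall-clock seconds for n_steps LBFGS steps on a tiny MLP.

    The network and data are hypothetical stand-ins, not the BPNN model.
    """
    torch.manual_seed(0)
    x = torch.randn(n_samples, n_features, device=device)
    y = torch.randn(n_samples, 1, device=device)
    model = torch.nn.Sequential(
        torch.nn.Linear(n_features, hidden), torch.nn.Tanh(),
        torch.nn.Linear(hidden, hidden), torch.nn.Tanh(),
        torch.nn.Linear(hidden, 1),
    ).to(device)
    # Same optimizer settings as the defaults in the Agent snippet.
    optimizer = torch.optim.LBFGS(model.parameters(), lr=1, max_iter=20,
                                  history_size=100, line_search_fn='strong_wolfe')

    def closure():
        # LBFGS re-evaluates the loss and gradients through this closure,
        # possibly several times per optimizer step.
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        return loss

    if device.type == 'cuda':
        torch.cuda.synchronize(device)  # finish pending GPU work before timing
    start = time.perf_counter()
    for _ in range(n_steps):
        optimizer.step(closure)
    if device.type == 'cuda':
        torch.cuda.synchronize(device)  # wait for queued kernels to complete
    return time.perf_counter() - start


devices = [torch.device('cpu')]
if torch.cuda.is_available():
    devices.append(torch.device('cuda:0'))

timings = {str(d): time_lbfgs(d) for d in devices}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f} s")
```

If the GPU is also slower here, a plausible explanation (not verified for this code base) is that with layers of about 10 nodes each kernel launch costs more than the arithmetic it performs, and the line search evaluates the closure sequentially, so the GPU has little parallel work to hide that overhead. Increasing `n_samples` or `hidden` in the sketch is a quick way to see whether the gap narrows with larger workloads.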
Here is the relax_log.txt from running demonstration.ipynb.
Although the number of steps is different, it's clear that when device=CPU, both training time and NN relaxation time are much lower than when device=cuda. I'd like to ask for your insights on the possible reasons for this. Do you have any optimization ideas or relevant demos? Thank you.
CPU:
Step 0: get groud truth data
Step 0: groud truth data calculation done
[2.411, 2.71, 2.411, 2.673, 2.817, 2.713, 2.583, 2.455, 2.356, 2.481]
Step 0: start training
Step 0: training done, time: 65.63558578491211 s
Step 0: start NN relaxation
Step 0: NN relaxation done, time: 31.031465530395508 s
Step 1: get groud truth data
Step 1: groud truth data calculation done
max force for each configuration: [0.82, 0.876, 0.821, 0.884, 1.108, 0.701, 1.04, 0.65, 0.682, 1.071]
Step 1: start training
Step 1: training done, time: 146.79284381866455 s
Step 1: start NN relaxation
Step 1: NN relaxation done, time: 24.376449584960938 s
Step 2: get groud truth data
Step 2: groud truth data calculation done
max force for each configuration: [0.049, 0.265, 0.033, 0.257, 0.313, 0.029, 0.271, 0.034, 0.03, 0.316]
Step 2: start training
Step 2: training done, time: 126.58962988853455 s
Step 2: start NN relaxation
Step 2: NN relaxation done, time: 13.029428720474243 s
Step 3: get groud truth data
Step 3: groud truth data calculation done
max force for each configuration: [0.037, 0.031, 0.032, 0.033, 0.029]

GPU:
Step 0: get groud truth data
Step 0: groud truth data calculation done
max force for each configuration: [2.411, 2.71, 2.411, 2.673, 2.817, 2.713, 2.583, 2.455, 2.356, 2.481]
Step 0: start training
Step 0: training done, time: 191.37967467308044 s
Step 0: start NN relaxation
Step 0: NN relaxation done, time: 47.26152324676514 s
Step 1: get groud truth data
Step 1: groud truth data calculation done
max force for each configuration: [0.419, 0.509, 0.659, 0.638, 0.925, 0.406, 0.962, 0.493, 0.336, 0.711]
Step 1: start training
Step 1: training done, time: 345.57629013061523 s
Step 1: start NN relaxation
Step 1: NN relaxation done, time: 60.44816279411316 s
Step 2: get groud truth data
Step 2: groud truth data calculation done
max force for each configuration: [0.038, 0.031, 0.029, 0.034, 0.033, 0.044, 0.034, 0.033, 0.027, 0.041]
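Because the two runs take a different number of steps, per-step averages are easier to compare than the raw lines. The following is a minimal sketch that pulls the timing lines out of a log in this format and summarizes them per device. The function name `summarize` is hypothetical, and the embedded `LOG` string contains only the timing lines copied from the log above, with the `CPU:` and `GPU:` labels placed on their own lines (an assumption about layout, not necessarily how relax_log.txt is written).

```python
import re
from collections import defaultdict

# Timing lines copied from the relax_log.txt excerpts in this thread.
LOG = """\
CPU:
Step 0: training done, time: 65.63558578491211 s
Step 0: NN relaxation done, time: 31.031465530395508 s
Step 1: training done, time: 146.79284381866455 s
Step 1: NN relaxation done, time: 24.376449584960938 s
Step 2: training done, time: 126.58962988853455 s
Step 2: NN relaxation done, time: 13.029428720474243 s
GPU:
Step 0: training done, time: 191.37967467308044 s
Step 0: NN relaxation done, time: 47.26152324676514 s
Step 1: training done, time: 345.57629013061523 s
Step 1: NN relaxation done, time: 60.44816279411316 s
"""

PATTERN = re.compile(r"Step (\d+): (training|NN relaxation) done, time: ([\d.]+) s")


def summarize(log_text):
    """Return {device: {phase: [seconds, ...]}} for a relax_log-style text."""
    times = defaultdict(lambda: defaultdict(list))
    device = None
    for line in log_text.splitlines():
        line = line.strip()
        if line in ("CPU:", "GPU:"):
            device = line[:-1]  # drop the trailing colon
            continue
        match = PATTERN.search(line)
        if match and device is not None:
            times[device][match.group(2)].append(float(match.group(3)))
    return times


summary = summarize(LOG)
for device, phases in summary.items():
    for phase, values in phases.items():
        mean = sum(values) / len(values)
        print(f"{device} {phase}: {len(values)} steps, "
              f"total {sum(values):.2f} s, mean {mean:.2f} s")
```

On these numbers the mean training time per step is about 113 s on the CPU and about 268 s on the GPU, roughly a factor of 2.4. The two runs follow different relaxation trajectories after step 0, so this is an indication rather than a controlled benchmark.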