lihan97 / KPGT

codes for KPGT (Knowledge-guided Pre-training of Graph Transformer)
Apache License 2.0
84 stars 14 forks

how to get ../dataset/scaffold_0 or scaffold_1 or scaffold_2? #5

Closed iwanthappy closed 8 months ago

iwanthappy commented 8 months ago

I want to fine-tune the model on my own datasets. Do you have code to generate the scaffold_0, scaffold_1, and scaffold_2 splits?

lihan97 commented 8 months ago

We used the scaffold_split function defined in this file: https://github.com/tencent-ailab/grover/blob/main/grover/util/utils.py
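
As a rough illustration of what a scaffold split does, here is a minimal sketch. It is not the GROVER implementation: the names `murcko_scaffold` and `scaffold_split_indices` are hypothetical, the greedy largest-group-first assignment is an assumption about the general technique, and the RDKit call is only needed when the default scaffold function is used. The thread does not say how the resulting indices are stored as scaffold_0, scaffold_1, or scaffold_2, so that step is left out.

```python
from collections import defaultdict
from typing import Callable, List, Sequence, Tuple


def murcko_scaffold(smiles: str) -> str:
    """Return the Bemis-Murcko scaffold SMILES of a molecule (requires RDKit)."""
    # Imported lazily so the splitting logic below can be used without RDKit.
    from rdkit.Chem.Scaffolds import MurckoScaffold
    return MurckoScaffold.MurckoScaffoldSmiles(smiles=smiles, includeChirality=False)


def scaffold_split_indices(
    smiles_list: Sequence[str],
    sizes: Tuple[float, float, float] = (0.8, 0.1, 0.1),
    scaffold_fn: Callable[[str], str] = murcko_scaffold,
) -> Tuple[List[int], List[int], List[int]]:
    """Split molecule indices so that no scaffold is shared between subsets."""
    # Group molecule indices by their scaffold.
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        groups[scaffold_fn(smi)].append(idx)

    # Largest scaffold groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    train_cap = sizes[0] * len(smiles_list)
    val_cap = sizes[1] * len(smiles_list)
    train, val, test = [], [], []
    for group in ordered:
        # Each whole group goes to the first subset that still has room.
        if len(train) + len(group) <= train_cap:
            train.extend(group)
        elif len(val) + len(group) <= val_cap:
            val.extend(group)
        else:
            test.extend(group)
    return train, val, test
```

Calling `scaffold_split_indices(list_of_smiles)` returns three lists of row indices. They would still need to be saved in whatever format the fine-tuning script expects.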

iwanthappy commented 8 months ago

Hello, I would like to know how to save the fine-tuned model.

iwanthappy commented 8 months ago

One last question: how did you make predictions for the molecules in the FDA dataset? There doesn't seem to be code for that in the repository. Could you share it?

lihan97 commented 8 months ago

You can modify the fit function in KPGT/src/trainer/finetune_trainer.py as follows to save the fine-tuned model:

def fit(self, model, train_loader, val_loader, test_loader):
    best_val_result, best_test_result, best_train_result = self.result_tracker.init(), self.result_tracker.init(), self.result_tracker.init()
    best_epoch = 0
    for epoch in range(1, self.args.n_epochs+1):
        if self.ddp:
            train_loader.sampler.set_epoch(epoch)
        self.train_epoch(model, train_loader, epoch)
        if self.local_rank == 0:
            val_result = self.eval(model, val_loader)
            test_result = self.eval(model, test_loader)
            train_result = self.eval(model, train_loader)
            if self.result_tracker.update(np.mean(best_val_result), np.mean(val_result)):
                best_val_result = val_result
                best_test_result = test_result
                best_train_result = train_result
                best_epoch = epoch
                # added line: save the weights each time the tracked validation result is updated
                torch.save(model.state_dict(), "your path")
            # stop early after 20 epochs without a new best epoch
            if epoch - best_epoch >= 20:
                break
    return best_train_result, best_val_result, best_test_result
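
When the saved weights are loaded again later, one detail may matter. The fit function checks self.ddp, and if the model was wrapped in PyTorch's DistributedDataParallel when model.state_dict() was saved, every key carries a "module." prefix that a plain, unwrapped model will not accept. The helper below is a minimal sketch for that case; `strip_ddp_prefix` is a hypothetical name and is not part of KPGT.

```python
from collections import OrderedDict
from typing import Any, Mapping


def strip_ddp_prefix(state_dict: Mapping[str, Any], prefix: str = "module.") -> "OrderedDict[str, Any]":
    """Return a copy of state_dict with a leading DDP prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only a leading prefix is removed; keys without it are kept unchanged.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned
```

With a model built exactly as in finetune.py, loading would then look like `model.load_state_dict(strip_ddp_prefix(torch.load("your path", map_location="cpu")))`. If the model was not wrapped when saved, the helper leaves the keys untouched.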

lihan97 commented 8 months ago

To make predictions for the FDA dataset, you can simply add this predict function to KPGT/src/trainer/finetune_trainer.py:

def predict(self, model, dataloader):
    # model: your finetuned model
    # dataloader: FDA dataloader
    model.eval()
    predictions_all = []
    for batched_data in dataloader:
        predictions, _ = self._forward_epoch(model, batched_data)
        predictions_all.append(predictions.detach().cpu())
    return torch.cat(predictions_all)

iwanthappy commented 8 months ago

Thank you for your suggestions. However, I don't know how to call this prediction code. Should it be called from evaluation.py or finetune.py? Or should I instantiate an object and make predictions directly?

lihan97 commented 8 months ago

You can modify finetune.py as follows:

trainer = Trainer(args, optimizer, lr_scheduler, loss_fn, evaluator, result_tracker, summary_writer, device=device, model_name='LiGhT',
                  label_mean=train_dataset.mean.to(device) if train_dataset.mean is not None else None,
                  label_std=train_dataset.std.to(device) if train_dataset.std is not None else None)
best_train, best_val, best_test = trainer.fit(model, train_loader, val_loader, test_loader)
predictions = trainer.predict(model, fda_dataloader)
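
For a classification or virtual-screening use case, the returned predictions still have to be turned into scores and labels. The sketch below assumes a single binary task fine-tuned with a logits-based loss, so that each raw output is a logit; that is an assumption about the setup, not something stated in this thread. The function names and the 0.5 threshold are hypothetical. For regression tasks the outputs may additionally need rescaling with the label_mean and label_std passed to Trainer, which this sketch does not handle.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rank_by_probability(
    smiles: Sequence[str],
    logits: Sequence[float],
    threshold: float = 0.5,
) -> List[Tuple[str, float, bool]]:
    """Convert logits to probabilities and rank molecules from most to least likely active."""
    scored = [(smi, sigmoid(float(z))) for smi, z in zip(smiles, logits)]
    # Highest predicted probability first, as is usual for a screening shortlist.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # The boolean marks molecules whose probability reaches the threshold.
    return [(smi, prob, prob >= threshold) for smi, prob in scored]
```

Assuming `predictions` is a tensor with one column, `rank_by_probability(fda_smiles, predictions.squeeze(-1).tolist())` would give a ranked list, where `fda_smiles` is the list of SMILES in the same order as the dataloader, which must therefore not shuffle.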

iwanthappy commented 7 months ago

I have tried many times, but it still doesn't work; maybe my coding skills are too weak. What I want to do is fine-tune the model on my own data and then use the fine-tuned model for a classification task, predicting the activity of unknown molecules. However, my understanding is that finetune.py can only save the fine-tuned model. Do I need to write additional code, beyond fine-tuning, to perform the classification task? Did you write such code for the classification task in your follow-up work?

iwanthappy commented 7 months ago

Happy New Year! I would like to use KPGT for a virtual screening against the mPGES-1 target, similar to the discovery of potential inhibitors with KPGT described in the article. I have fine-tuned the model on my data and saved it, but I am stuck on the next step, the classification task. I used the code you gave me earlier, but it still fails. Could you give me some more advice, or is there another way to contact you for further discussion?
