Closed gtamer2 closed 7 months ago
I have studied the pretrain/downstream scripts' way of initializing a dataset here: https://github.com/DeepGraphLearning/GearNet/blob/780809836c87c1028b312241215e856d9b0634b2/script/pretrain.py#L65, but from studying the TorchDrug source code, it appears this method is specific to TorchDrug-registered datasets.
Is the solution to load the PDB files as HDF5 files, as gearnet/dataset.py does here: https://github.com/DeepGraphLearning/GearNet/blob/780809836c87c1028b312241215e856d9b0634b2/gearnet/dataset.py#L44, and to pass in the GearNet graph transformation as a parameter here: https://github.com/DeepGraphLearning/GearNet/blob/780809836c87c1028b312241215e856d9b0634b2/gearnet/dataset.py#L94 ?
When I try this, I get the error `OSError: Unable to open file (file signature not found)`, and I'm not sure how to convert a PDB file to HDF5 format.
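Incidentally, that `OSError` usually just means the input is not an HDF5 file at all: HDF5 files start with a fixed 8-byte signature, while a PDB file is plain text, so h5py's signature check fails immediately. A minimal stdlib sketch to verify this (`looks_like_hdf5` is a hypothetical helper, not part of h5py):

```python
# HDF5 files begin with this fixed 8-byte signature; a plain-text PDB
# file (which typically starts with "HEADER") does not, so trying to
# open one with h5py fails with "file signature not found".
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```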
I have tried with

```python
_protein = data.Protein.pack([protein])
protein_ = graph_construction_model(_protein)
```

as described in the tutorials and had no issues at all.
This fixed it for me. Not sure why I missed that option. Thanks!
Hello,
I am getting errors that are blocking me from running GearNet inference on an input PDB file.
First, I loaded a PDB file into a `torchdrug.data.Protein` structure. Second, I followed the GearNet graph construction laid out in TorchProtein tutorial 3: Structure-based Protein Property Prediction. I encapsulated the graph construction logic in a function.
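For context, the graph construction model in that tutorial is configured roughly as below. This is a sketch from memory of the tutorial's API, so the exact layer names and parameter values (radius, k, max_distance) should be checked against the TorchProtein documentation rather than taken as definitive:

```python
from torchdrug import layers
from torchdrug.layers import geometry

# Sketch of the tutorial-style graph construction: alpha carbons as
# nodes, with sequential, spatial, and KNN edges, and GearNet-style
# edge features. Parameter values here are illustrative assumptions.
graph_construction_model = layers.GraphConstruction(
    node_layers=[geometry.AlphaCarbonNode()],
    edge_layers=[
        geometry.SequentialEdge(max_distance=2),
        geometry.SpatialEdge(radius=10.0, min_distance=5),
        geometry.KNNEdge(k=10, min_distance=5),
    ],
    edge_feature="gearnet",
)
```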
However, when running these two steps:
I get the following error:
I studied the source code and found that `num_cum_residues` is a property of `torchdrug.data.PackedProtein` but not of `torchdrug.data.Protein`. So, third, I attempted to convert the Protein into a PackedProtein, with resulting code:
However, now I get the error:

```
ValueError: Expect node attribute atom_type to have shape (16344, *), but found torch.Size([16448])
```

(16448 is, I assume, the number of nodes derived from the edge_list.) Is this the right approach to run inference with GearNet? I downloaded the PDB files directly from https://www.wwpdb.org, so I'd like to think the issue is not in the input data. Thank you in advance for any guidance here.
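That ValueError is the packed graph validating that every node-level attribute has exactly one entry per node. A minimal pure-Python sketch of that kind of check (`check_node_attribute` is a hypothetical stand-in, not TorchDrug's actual implementation):

```python
def check_node_attribute(name, values, num_node):
    # A packed graph expects each node-level attribute to have one
    # entry per node; in the error above, atom_type has 16448 entries
    # while the graph reports 16344 nodes, so validation fails.
    if len(values) != num_node:
        raise ValueError(
            "Expect node attribute %s to have shape (%d, *), "
            "but found (%d,)" % (name, num_node, len(values)))
    return True
```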
Example PDB files that can't be processed: