Closed (biochunan closed this 1 week ago)
Hi @biochunan, I think you need to check that `batch.batch` is set correctly prior to computing the edges.
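One way to run that check is to compare the graph assignment of both endpoints of every edge. The snippet below is only a minimal sketch in plain Python lists rather than tensors; the helper name `cross_graph_edges` and the toy edge list are hypothetical and not part of ProteinWorkshop or PyTorch Geometric.

```python
def cross_graph_edges(edge_index, batch_vector):
    """Return (src, dst) pairs whose endpoints belong to different graphs.

    edge_index:   two equally long lists, [sources, targets]
    batch_vector: batch_vector[i] is the graph id of node i
    """
    src, dst = edge_index
    return [(s, d) for s, d in zip(src, dst) if batch_vector[s] != batch_vector[d]]


# Two proteins of 600 residues each, as in the report.
batch_vector = [0] * 600 + [1] * 600

# Toy edges: two within a single protein, one crossing proteins.
edge_index = [[0, 595, 600], [1, 1199, 601]]

bad = cross_graph_edges(edge_index, batch_vector)
print(bad)  # [(595, 1199)]
```

An empty result means every edge stays inside one graph, provided the batch vector itself is correct. With tensors, the same comparison is `batch.batch[edge_index[0]] != batch.batch[edge_index[1]]`.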
Hi @a-r-j, thanks for your reply! That's indeed the cause of the problem. I manually added an attribute from which the collate function can infer the number of nodes (see `can_infer_num_nodes`), and that solved the issue.
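To illustrate why the node count matters: a batch assignment vector can only be built when the number of nodes in each graph is known. The following is a conceptual sketch in plain Python, not PyTorch Geometric's actual collate code; the function name `build_batch_vector` is hypothetical.

```python
def build_batch_vector(num_nodes_per_graph):
    """Assign graph id g to each of the n nodes of graph g, in order."""
    return [g for g, n in enumerate(num_nodes_per_graph) for _ in range(n)]


# Two proteins with 600 residues each.
batch_vector = build_batch_vector([600, 600])

print(len(batch_vector))                    # 1200
print(batch_vector[599], batch_vector[600])  # 0 1
```

If the per-graph node counts cannot be determined, no such vector can be derived, and downstream edge construction has no way of telling the graphs apart.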
Hello, I really like the `proteinworkshop` codebase. It's really handy for loading and featurizing structures. I tried to featurize my protein batches but found that `batch.edge_index` contains edges across multiple different `Protein` data objects in my batch. In my use case, the edges should be within each Data object. I've included a simplified example of my current usage. Could you give me some guidance on how to avoid this?
https://github.com/a-r-j/ProteinWorkshop/blob/61294d4bafab7779121cf4eaa4435742b61b709a/proteinworkshop/features/factory.py#L112
Each data object, in this case `1a14`, has only 600 residues. However, in `batch.edge_index` there are edges between residues from different data objects. For example, the edge `[595, 1199]` connects the 596th residue of the first `Protein` to node index 1199, i.e., the 600th residue of the second `Protein`. I guess this may not be the correct way of using the featurizer, but I would appreciate it a lot if you could give an example of its usage in this case.
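The index arithmetic above can be checked with a small helper that maps a global node index in the batch back to its graph and local residue position. This is a sketch under the assumption that nodes are concatenated graph by graph in order; the name `locate` is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(node_index, num_nodes_per_graph):
    """Map a global node index to (graph id, 0-based index within that graph)."""
    offsets = list(accumulate(num_nodes_per_graph))  # e.g. [600, 1200]
    if not 0 <= node_index < offsets[-1]:
        raise IndexError(f"node index {node_index} is outside the batch")
    graph = bisect_right(offsets, node_index)
    start = offsets[graph - 1] if graph > 0 else 0
    return graph, node_index - start


sizes = [600, 600]
print(locate(595, sizes))   # (0, 595): 596th residue of the first protein
print(locate(1199, sizes))  # (1, 599): 600th residue of the second protein
```

Since the two endpoints land in graphs 0 and 1, the edge `[595, 1199]` does cross data objects.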