Open hsinghuan opened 3 months ago
Thank you for your patience. Sorry for the late response.
Your understanding is correct. When node_feat is not present in data.keys(), our code indeed pads node_raw_features twice. Normally, we only need to pad one additional node (for sequence or neighbor completion). However, this redundant padding does not affect model performance, because the padded features carry no meaning and are all zeros. Additionally, the node indices used to index node_raw_features will not cause an out-of-bounds issue.
If you still wish to address this issue, you can change the line node_raw_features = np.zeros((num_nodes + 1, 1)) to node_raw_features = np.zeros((num_nodes, 1)).
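As a rough illustration of why the extra row is harmless, here is a minimal sketch comparing the current and the modified allocation. The helper build_node_features, its extra_row flag, and the assumption that node ids run from 1 to num_nodes (with 0 reserved for padding) are hypothetical and are not taken from utils/DataLoader.py.

```python
import numpy as np


def build_node_features(num_nodes: int, extra_row: bool) -> np.ndarray:
    """Hypothetical helper mirroring the allocate-then-pad steps discussed above."""
    rows = num_nodes + 1 if extra_row else num_nodes
    node_raw_features = np.zeros((rows, 1))
    # Prepend a single all-zero padding row at index 0.
    return np.vstack([np.zeros(node_raw_features.shape[1])[np.newaxis, :], node_raw_features])


num_nodes = 5  # illustrative value
current = build_node_features(num_nodes, extra_row=True)   # num_nodes + 2 rows
fixed = build_node_features(num_nodes, extra_row=False)    # num_nodes + 1 rows

# Assumption: node ids are 1..num_nodes, so both arrays are indexed in bounds
# and return the same all-zero features.
node_ids = np.arange(1, num_nodes + 1)
assert current[node_ids].shape == fixed[node_ids].shape == (num_nodes, 1)
```

Under these assumptions the only difference is one unused trailing row of zeros in the current version.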
Thank you for pointing this out!
Thank you for the effort in creating DyGLib and adapting it to TGB! I have a question regarding lines 151 to 161 in utils/DataLoader.py.
It seems like node_raw_features is padded twice if 'node_feat' is not in data.keys(). The first time is in np.zeros((num_nodes + 1, 1)), while the second time is in np.vstack([np.zeros(node_raw_features.shape[1])[np.newaxis, :], node_raw_features]). This makes the length of the 0-th dimension of node_raw_features greater than the number of unique nodes by 2. I am not sure whether this is intended. Please let me know if I missed something. Thanks!
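The shape arithmetic can be checked with a small standalone sketch. It reuses only the two expressions quoted above; the value of num_nodes is an arbitrary illustrative choice, and the variable name padded is introduced here for clarity.

```python
import numpy as np

num_nodes = 5  # illustrative number of unique nodes

# First padding: one extra row is allocated up front.
node_raw_features = np.zeros((num_nodes + 1, 1))

# Second padding: a zero row is stacked on top of the existing array.
padded = np.vstack([np.zeros(node_raw_features.shape[1])[np.newaxis, :], node_raw_features])

print(padded.shape)  # (7, 1), i.e. num_nodes + 2 rows
```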