rampasek / GraphGPS

Recipe for a General, Powerful, Scalable Graph Transformer
MIT License

Concerns about time consumption of function graphormer_pre_processing() #43

Closed Data-reindeer closed 9 months ago

Data-reindeer commented 9 months ago

I tried to use Graphormer to process a dataset much smaller than ZINC, such as ClinTox in MoleculeNet. I adopted the configuration provided in configs/Graphormer/zinc-Graphormer.yaml. However, I found that running the function _graphormer_preprocessing() takes too long: about 5 minutes to process 256 molecules (~6000 nodes) on CPU.

Most of the time is spent in this step:

graph_index = torch.empty(2, N ** 2, dtype=torch.long)

for i in tqdm(range(N)):
  for j in range(N):
    graph_index[0, i * N + j] = i
    graph_index[1, i * N + j] = j

Is this normal, or is something wrong on my side?
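
For reference, the double loop quoted above fills column i * N + j with the pair (i, j), so the same tensor can be built without Python-level iteration. The following is only a minimal sketch using standard PyTorch calls; the helper name dense_pair_index is hypothetical and is not part of GraphGPS.

```python
import torch


def dense_pair_index(N: int) -> torch.Tensor:
    """Return a (2, N*N) long tensor whose column i*N + j holds the pair (i, j).

    Hypothetical helper for illustration; not taken from the GraphGPS code base.
    """
    nodes = torch.arange(N, dtype=torch.long)
    rows = nodes.repeat_interleave(N)  # 0,0,...,0,1,1,...,1,...
    cols = nodes.repeat(N)             # 0,1,...,N-1,0,1,...,N-1,...
    return torch.stack([rows, cols], dim=0)


if __name__ == "__main__":
    idx = dense_pair_index(3)
    print(idx.shape)
    print(idx)
```

Vectorizing removes the loop overhead but not the quadratic size: with N = 6000 the index alone has 36,000,000 columns, which is about 576 MB as int64, so the value of N passed in still matters.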

Data-reindeer commented 9 months ago

It was my oversight. As stated in the comments in _graphormer_preprocessing(), all of the pre-processing there operates on a single graph. I ran the preprocessing at the batch level, which treated the whole batch as one graph of ~6000 nodes and caused the long running time.
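
To illustrate why the batch-level call is so much slower, here is a rough count of the node pairs enumerated in each case. The graph sizes are an assumption made only for this sketch (144 molecules with 23 nodes and 112 with 24 nodes, giving 256 graphs and 6000 nodes in total); the real ClinTox sizes will differ.

```python
from typing import List


def pairs_per_graph(sizes: List[int]) -> int:
    """Total node pairs when each graph is preprocessed on its own."""
    return sum(n * n for n in sizes)


def pairs_batched(sizes: List[int]) -> int:
    """Node pairs when the whole batch is treated as a single graph."""
    total_nodes = sum(sizes)
    return total_nodes * total_nodes


# Assumed sizes for illustration only: 256 graphs, 6000 nodes in total.
sizes = [23] * 144 + [24] * 112

per_graph = pairs_per_graph(sizes)
batched = pairs_batched(sizes)

if __name__ == "__main__":
    print(f"per-graph pairs: {per_graph}")
    print(f"batched pairs:   {batched}")
    print(f"ratio:           {batched / per_graph:.1f}")
```

Under these assumed sizes the per-graph route enumerates 140,688 pairs while the batched route enumerates 36,000,000, roughly 256 times more, which is consistent with the slowdown reported above.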

I will close this issue.