Open Skeleton003 opened 1 month ago
To trigger regression tests:
@dgl-bot run [instance-type] [which tests] [compare-with-branch]
For example: @dgl-bot run g4dn.4xlarge all dmlc/master
or @dgl-bot run c5.9xlarge kernel,api dmlc/master
@dgl-bot
Do you have a benchmark comparing the new approach to the old one for different K values?
@mfbalin @frozenbugs See benchmark results in the description. The new implementation does not seem to be as efficient as we thought. Maybe we should keep it as is?
Let me take a look at the code to see if we missed anything. Thank you for the benchmark.
Description
First, let's take a look at the current code for indexing a HeteroItemSet (in `HeteroItemSet.__getitem__`).
Say the length of indices is `N` and the number of etypes/ntypes is `K`. Then the time complexity of the current implementation of indexing a dictionary is `O(N * K)`, which is mainly introduced by a single line in that method. If there are a lot of etypes, this line could easily become the bottleneck.
This draft PR proposes an alternative to the current logic whose time complexity is `O(N * logN)`, where the `log` factor is introduced by the sorting operation.
This will improve performance when there are many etypes, but might take more time when there are few etypes. The choice between the two approaches needs to strike a balance between these two cases.
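To make the trade-off concrete, here is a minimal pure-Python sketch of the two grouping strategies. It is not the DGL implementation: the function names, the `offsets` layout (cumulative start position of each type), and the use of plain lists instead of tensors are all assumptions made for illustration.

```python
from bisect import bisect_right
from itertools import groupby


def locate(offsets, indices):
    """Map each global index to (type_id, local_index) by binary search.

    `offsets` is assumed to hold the cumulative start of each type,
    e.g. [0, 3, 5, 9] for three types of sizes 3, 2 and 4.
    """
    located = []
    for i in indices:
        type_id = bisect_right(offsets, i) - 1
        located.append((type_id, i - offsets[type_id]))
    return located


def group_by_scan(keys, offsets, indices):
    """O(N * K): one full pass over the N located indices per type."""
    located = locate(offsets, indices)
    result = {}
    for type_id, key in enumerate(keys):
        local = [j for t, j in located if t == type_id]
        if local:
            result[key] = local
    return result


def group_by_sort(keys, offsets, indices):
    """O(N log N): sort once by type id, then split into contiguous runs."""
    # sorted() is stable, so the order within each type is preserved.
    located = sorted(locate(offsets, indices), key=lambda p: p[0])
    return {
        keys[t]: [j for _, j in run]
        for t, run in groupby(located, key=lambda p: p[0])
    }


keys = ["user", "item", "tag"]  # hypothetical type names
offsets = [0, 3, 5, 9]
indices = [7, 0, 4, 2, 8]
print(group_by_scan(keys, offsets, indices))
print(group_by_sort(keys, offsets, indices))
```

Both functions return the same grouping. The scan version does `K` cheap passes, while the sort version pays for one sort regardless of `K`, which is why the sort is only expected to pay off when `K` is large relative to `log N`.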
Update on June 18
Benchmark: https://docs.google.com/document/d/1Bbmp8gMekiGIYYxEMVbmXSANRZlZ_nTNbhpWul4RaKA/edit?usp=sharing
The results show that the original algorithm is faster than the new algorithm (theoretical time complexity `O(N * logN)`) for almost all tested values of batch_size and num_types.