Open mfbalin opened 5 days ago
Based on the rules, feature caching of any form is not allowed.
@drcanchi can you please review GraphBolt's caching and comment on whether this is any different and whether it can be used?
@ShriyaPalsamudram why is such caching not allowed? Both CPU and GPU memory hierarchies consist of multiple levels, and caching is used pervasively in hardware to make things run fast.
In our case, we treat GPU memory as a cache for the CPU memory which is a cache for the SSD storage.
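To make the hierarchy described above concrete, here is a minimal sketch of a two-level cache in front of slow storage. It is illustrative only and is not GraphBolt's implementation: `LRUCache` and `TieredFeatureStore` are hypothetical names, a plain dict stands in for SSD storage, and LRU eviction is an assumed policy chosen for simplicity.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity least-recently-used cache (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used


class TieredFeatureStore:
    """GPU cache -> CPU cache -> backing store (stand-in for SSD)."""

    def __init__(self, backing, gpu_capacity, cpu_capacity):
        self.backing = backing
        self.gpu = LRUCache(gpu_capacity)
        self.cpu = LRUCache(cpu_capacity)
        self.hits = {"gpu": 0, "cpu": 0, "ssd": 0}

    def fetch(self, node_id):
        # Try the fastest level first.
        value = self.gpu.get(node_id)
        if value is not None:
            self.hits["gpu"] += 1
            return value
        # Fall back to the CPU level, then to the backing store.
        value = self.cpu.get(node_id)
        if value is not None:
            self.hits["cpu"] += 1
        else:
            value = self.backing[node_id]
            self.hits["ssd"] += 1
            self.cpu.put(node_id, value)
        # Promote the feature into the GPU level for future reads.
        self.gpu.put(node_id, value)
        return value
```

Each level only absorbs repeated accesses; the first read of any feature still goes all the way to the backing store.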
The reason for disallowing feature caching is to make the benchmark representative of real-world GNN workloads, which typically operate on much larger datasets (and features). Because we couldn't find an open-source dataset of matching size, we had to settle for a smaller one while making the benchmark as representative as possible.
Even when the dataset is large, caching will be employed to extract maximum performance from the underlying hardware. I guess we will have to make a submission in the open division to showcase what our software is capable of. Will any future submission utilizing caching qualify for the open division?
https://github.com/mlcommons/training_policies/blob/master/training_rules.adoc#14-appendix-benchmark-specific-rules
Here, it is stated that feature caching is not allowed. What is the definition of feature caching?
We are preparing a submission using the GraphBolt GNN dataloader. Our framework supports feature and graph caching on GPUs with no redundancy across GPUs, and it also supports caching in system memory. Can I use any of these components in a valid closed-division MLPerf submission for GNN node classification?
GraphBolt's caching facilities: https://www.dgl.ai/dgl_docs/generated/dgl.graphbolt.CPUCachedFeature.html https://www.dgl.ai/dgl_docs/generated/dgl.graphbolt.GPUCachedFeature.html