This repository contains the implementations of DUCATI and the SOTA baseline, together with the overall training scripts. The underlying implementations of some APIs live in a customized version of DGL (https://github.com/initzhang/dc_dgl.git).
Please follow these steps to prepare the environment and datasets:

1. Install the packages listed in `requirements.txt`.
2. Prepare the datasets following the instructions in the `preprocess` directory.
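A minimal preparation sketch, assuming a pip-based setup (the concrete preprocessing entry points inside `preprocess` may differ from this sketch):

```bash
# install Python dependencies
pip install -r requirements.txt

# dataset preparation lives under preprocess/;
# follow the instructions there for each dataset
cd preprocess
```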
Then we can run the experiments under different settings as follows:

```bash
# verify dual cache allocation plan of DUCATI
CUDA_VISIBLE_DEVICES=0 python run_allocate.py --dataset [DS] --fanouts [FS] --fake-dim [FD] --total-budget [TB]

# verify iteration time of DUCATI using the allocation plan above
CUDA_VISIBLE_DEVICES=0 python run_ducati.py --dataset [DS] --fanouts [FS] --fake-dim [FD] --adj-budget [AB] --nfeat-budget [NB]

# verify iteration time of SOTA
CUDA_VISIBLE_DEVICES=0 python run_sota.py --dataset [DS] --fanouts [FS] --fake-dim [FD] --nfeat-budget [NB]
```
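For concreteness, a hypothetical end-to-end sequence is sketched below. The dataset name, fanouts, dimension, and budget values are illustrative placeholders rather than configurations from the paper, and the budgets are assumed to be in GB:

```bash
# 1. search for a dual cache allocation plan under an (assumed) 10 GB total budget
CUDA_VISIBLE_DEVICES=0 python run_allocate.py --dataset ogbn-papers100M \
    --fanouts 15,15,15 --fake-dim 128 --total-budget 10

# 2. feed the adj/nfeat split reported by the step above back into the DUCATI run
CUDA_VISIBLE_DEVICES=0 python run_ducati.py --dataset ogbn-papers100M \
    --fanouts 15,15,15 --fake-dim 128 --adj-budget 2 --nfeat-budget 8

# 3. run the SOTA baseline, devoting the whole budget to node features
CUDA_VISIBLE_DEVICES=0 python run_sota.py --dataset ogbn-papers100M \
    --fanouts 15,15,15 --fake-dim 128 --nfeat-budget 10
```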
The detailed cache configurations used in the paper:
While it is possible to determine the total budget inside the scripts on the fly (run a normal training script without any cache, print `torch.cuda.max_memory_allocated()` after training, then subtract the printed value from the total GPU memory), we find it more flexible to expose the total budget as a configurable knob during our experiments.
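As a sketch of that on-the-fly measurement, assuming PyTorch on a single CUDA device (`run_epoch` is a hypothetical stand-in for one epoch of normal, cache-free training):

```python
import torch

def estimate_total_budget(run_epoch, device="cuda:0"):
    """Estimate the GPU memory (in GB) left over for caching.

    `run_epoch` is a placeholder for one epoch of normal training
    without any cache; replace it with the real training loop.
    """
    torch.cuda.reset_peak_memory_stats(device)
    run_epoch()  # normal cache-free training
    peak = torch.cuda.max_memory_allocated(device)           # bytes used by training itself
    total = torch.cuda.get_device_properties(device).total_memory
    return (total - peak) / (1024 ** 3)                      # GB remaining for the caches
```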