ZaidQureshi / bam

BSD 2-Clause "Simplified" License
Multi-GPU control #35

Open zhanglizhi15 opened 1 month ago

zhanglizhi15 commented 1 month ago

Currently, each SSD can be controlled by only one GPU. Is it possible for multiple GPUs to control and read from a single SSD?

msharmavikram commented 1 month ago

Yes. This is possible as long as the caches are independent and the user manages data consistency.

We haven't implemented this in BaM, but the LSM-GNN paper discusses how to do it for GNN workloads. We are happy to take a PR if someone designs a native implementation that works for many applications.

zhanglizhi15 commented 1 month ago

Do you have a link to the lsm-gnn paper? I can't find the article online.

msharmavikram commented 1 month ago

It seems like the paper is not public. I am working with the author to understand when it will go public. I will update once I have more information.

zhanglizhi15 commented 1 month ago

Okay, I'm so looking forward to your work.

zhanglizhi15 commented 3 weeks ago

I have read the LSM-GNN paper, and my question is whether multiple GPUs can simultaneously control one SSD, or whether each GPU controls a different SSD separately. I see that the GIDS multi-GPU code has each GPU control a different SSD.

shizuocheng commented 2 weeks ago

@zhanglizhi15 Hello, is lsm-gnn open source? Can you provide a link to the code?

zhanglizhi15 commented 2 weeks ago

See the multi-gpu branch of this GitHub repository (https://github.com/jeongminpark417/GIDS.git). LSM-GNN is not yet open source, but the code in that branch is similar to it.

msharmavikram commented 2 weeks ago

@shizuocheng LSM-GNN is under review, so it is not open source yet.

@zhanglizhi15 By default, BaM does not support multiple GPUs accessing a shared SSD. This is nontrivial because you must maintain data consistency when performing writes to the shared drive. However, if your workload is read-only, it is a bit easier to make the change and support it.
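
To illustrate why the read-only case is easier, here is a minimal sketch in plain Python. It is a toy model and not BaM code: `SharedSSD`, `GPUCache`, and all of their methods are hypothetical names invented for this example. It assumes each GPU keeps its own independent cache in front of one shared drive.

```python
# Toy model only, not BaM code. All class and method names are hypothetical.

class SharedSSD:
    """Stands in for a single SSD whose blocks are visible to every GPU."""

    def __init__(self, num_blocks):
        self.blocks = [f"v0-block{i}" for i in range(num_blocks)]

    def read(self, lba):
        return self.blocks[lba]

    def write(self, lba, data):
        self.blocks[lba] = data


class GPUCache:
    """Independent per-GPU cache in front of the shared SSD."""

    def __init__(self, ssd):
        self.ssd = ssd
        self.lines = {}  # lba -> cached data

    def read(self, lba):
        # On a miss, fetch from the SSD; on a hit, serve the local copy.
        if lba not in self.lines:
            self.lines[lba] = self.ssd.read(lba)
        return self.lines[lba]

    def write(self, lba, data):
        # Write-through: updates the SSD and this cache only.
        # Other GPUs' caches are NOT informed.
        self.lines[lba] = data
        self.ssd.write(lba, data)

    def invalidate(self, lba):
        # Drop the local copy so the next read goes back to the SSD.
        self.lines.pop(lba, None)


ssd = SharedSSD(num_blocks=4)
gpu0, gpu1 = GPUCache(ssd), GPUCache(ssd)

# Read-only access: both caches always agree with the drive.
same_when_read_only = gpu0.read(2) == gpu1.read(2)

# A write from gpu0 leaves gpu1 holding a stale copy ...
gpu0.write(2, "v1-block2")
stale = gpu1.read(2)

# ... until the user invalidates it explicitly.
gpu1.invalidate(2)
fresh = gpu1.read(2)

print(same_when_read_only, stale, fresh)
```

In the read-only case no cache ever needs to be told about another GPU's activity. Once writes are allowed, something has to play the role of the explicit `invalidate` call, which is the consistency management left to the user.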

shizuocheng commented 2 weeks ago

@zhanglizhi15 Thank you so much! However, I tried the GIDS multi-gpu branch and found that it does not provide the GIDS_Loader interface. The multi-gpu implementation modifies the source code of dgl.dataloading.DataLoader directly, so the branch cannot be run as-is. Have you been able to run it on multiple GPUs?

msharmavikram commented 2 weeks ago

@shizuocheng Can we take these questions to the GIDS git repository? The issue you are highlighting needs to be discussed in that repository, not here.