NVIDIA / aistore

AIStore: scalable storage for AI applications
https://aistore.nvidia.com
MIT License

Hotplugging of volumes in an air-gapped deployment #150

Closed: compiaffe closed this issue 11 months ago

compiaffe commented 1 year ago

I have a use case where aistore acts as the storage backend for an air-gapped system. One requirement is to physically carry a hard drive out of the air-gapped location in order to transfer data somewhere else.

Is there a sane way to support simply disconnecting a hard drive, moving it to another location, reading data from it there without aistore, and then moving it back to the air-gapped ais instance?

The main blocker right now is this issue: https://github.com/NVIDIA/aistore/issues/140. If I simply unplug the hard drive, plug it back in, and restart ais, I hit that issue. I'm wondering if there is a way to recover from that. (But let's keep that discussion in #140.)

alex-aizman commented 1 year ago

"Simply disconnecting" is not so simple if we want the cluster to remain online and operational in the meantime... but anyway, in AIS a local drive is abstracted as a "mountpath", and attaching/detaching mountpaths is fully supported:

$ ais storage mountpath --help
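
As a rough illustration of the attach/detach flow mentioned above, here is a dry-run sketch that only prints the commands it would issue. The `NODE_ID=MOUNTPATH` argument form, the node ID `TARGET_ID`, and the path `/ais/disk1` are assumptions for illustration and do not come from this thread. Check the output of `ais storage mountpath --help` on your version before running anything for real.

```shell
#!/bin/sh
# Dry run only: prints a detach/re-attach sequence instead of executing it.
# Assumed, not confirmed by the thread: the NODE_ID=MOUNTPATH argument form.
# Verify with `ais storage mountpath --help` before running these commands.

plan_hotplug() {
    node_id="$1"    # hypothetical ID of the target that owns the drive
    mountpath="$2"  # hypothetical mountpath backed by the removable drive

    # Step 1: detach the mountpath before physically removing the drive.
    printf 'ais storage mountpath detach %s=%s\n' "$node_id" "$mountpath"
    # Step 2: once the drive is back and mounted again, re-attach it.
    printf 'ais storage mountpath attach %s=%s\n' "$node_id" "$mountpath"
}

plan_hotplug "TARGET_ID" "/ais/disk1"
```

Printing the commands first makes it possible to review the sequence against the CLI help text. The sketch does not cover what the cluster does with the data on a mountpath when it is detached.
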

compiaffe commented 1 year ago

Thanks for that clarification. Can one configure ais such that the external hard drive is simply used as a mirror instead of deciding to place partial buckets on it?

Should that be done with an appropriate mirror config or maybe with an additional target instance?

alex-aizman commented 1 year ago

In AIS, all drives are equal and equally utilized, so no, you cannot have a given drive "used as a mirror instead of, etc." This sounds like very special logistics you have... so maybe a separate AIS cluster can help, with a copy (ais cp) in between; not sure.