Closed Xiang-cd closed 9 months ago
@Xiang-cd Thanks for raising this.
A Device / GpuProcess instance is a live object: each method call returns the latest metrics. E.g.:
import time
from nvitop import Device

device = Device(0)
device.gpu_utilization() # -> 96
time.sleep(1)
device.gpu_utilization() # -> 80 # the latest value from a new NVML API call
Once it is converted to a snapshot, the metrics are frozen. The metric values are those obtained at the time you call as_snapshot().
device = Device(0)
snapshot = device.as_snapshot()
snapshot.gpu_utilization # -> 94
time.sleep(1)
snapshot.gpu_utilization # -> 94 (always the frozen value)
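To make the difference concrete without needing a GPU, here is a minimal sketch of the same pattern. FakeDevice is a hypothetical stand-in written for this illustration, not part of the library, and the readings are made-up numbers. It only mimics the behaviour described above: method calls fetch a fresh value, while a snapshot stores the value once as a plain attribute.

```python
import itertools
from types import SimpleNamespace


class FakeDevice:
    """Hypothetical stand-in for a live device object (illustration only)."""

    def __init__(self, readings):
        # Each method call consumes the next reading, mimicking a fresh query.
        self._readings = itertools.cycle(readings)

    def gpu_utilization(self):
        return next(self._readings)

    def as_snapshot(self):
        # Evaluate the metric once and store the result as a plain attribute.
        return SimpleNamespace(gpu_utilization=self.gpu_utilization())


device = FakeDevice([96, 80, 94])

# Live object: two calls, two fresh readings.
live_values = [device.gpu_utilization(), device.gpu_utilization()]

# Snapshot: the value is read once and never changes afterwards.
snapshot = device.as_snapshot()
frozen_values = [snapshot.gpu_utilization, snapshot.gpu_utilization]

print(live_values, frozen_values)  # [96, 80] [94, 94]
```

Note that the snapshot exposes the metric as an attribute rather than a method, which matches the usage shown above.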
You can access the device object via:
snapshot = device.as_snapshot()
snapshot.real # -> Device(...)
# Get a new snapshot
snapshot = snapshot.real.as_snapshot()
# or
snapshot = device.as_snapshot()
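If you only hold a snapshot and want newer values, the two lines above can be wrapped in a small helper. This is a sketch under assumptions: refresh is a hypothetical helper name, and FakeDevice is a made-up stand-in so the example runs without a GPU. The only behaviour it relies on is what is described above, namely that a snapshot keeps a reference to its live object in real and that as_snapshot() returns a new frozen copy.

```python
from types import SimpleNamespace


def refresh(snapshot):
    """Return a new snapshot taken from the live object behind `snapshot`."""
    return snapshot.real.as_snapshot()


class FakeDevice:
    """Hypothetical stand-in for a live device object (illustration only)."""

    def __init__(self):
        self._calls = 0

    def as_snapshot(self):
        # Count how many snapshots were taken so each one is distinguishable.
        self._calls += 1
        return SimpleNamespace(real=self, sample_id=self._calls)


old = FakeDevice().as_snapshot()
new = refresh(old)

# The old snapshot is unchanged; the new one comes from the same live object.
print(old.sample_id, new.sample_id, new.real is old.real)  # 1 2 True
```

The old snapshot is never updated in place, so there is nothing to "resume": you simply take another one.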
Thank you. So the snapshot is not a context containing all memory contents and register state that could be used to move the whole context of one device to another device?
Questions
Thank you for your great work! I've seen the as_snapshot function, but I was wondering how the snapshot can be used. Is the snapshot resumable? I didn't see a resume interface.