v6d-io / v6d

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
https://v6d.io
Apache License 2.0

Improve the implementation of migrate object #1905

Closed dashanji closed 2 weeks ago

dashanji commented 1 month ago

Describe your problem

The current implementation of the migrate-object API has two issues:

1. The name of the old object is not bound to the new migrated object.

   - When calling client.get(name="data", fetch=True), the migration is always triggered, even if the object has already been migrated.
   - The migration is executed repeatedly when there are multiple requests.

2. When migrating an object with many blobs (e.g., a torch module), the performance is poor.

Start with two vineyard instances locally:

Instance 1: ./bin/vineyardd --etcd_endpoint=http://127.0.0.1:33774 --socket=/tmp/vineyard-local.sock --rpc_socket_port=9611 --compression=false --reserve_memory=true --size=20Gi

Instance 2: ./bin/vineyardd --etcd_endpoint=http://127.0.0.1:33774 --socket=/tmp/vineyard-local1.sock --rpc_socket_port=9612 --compression=false --reserve_memory=true --size=20Gi

Test it with the following code.

import safetensors
import safetensors.torch
import torch
import vineyard
import time
from vineyard.contrib.ml.torch import torch_context

state_file = "stable-diffusion-v1-5-pruned-emaonly.safetensors"
with open(state_file, 'rb') as f:
    state_dict = safetensors.torch.load(f.read())

client = vineyard.connect("/tmp/vineyard-local.sock")
with torch_context(client):
    obj = client.put(state_dict, persist=True, name="state_dict")

client2 = vineyard.connect("/tmp/vineyard-local1.sock")
start = time.time()
with torch_context(client2):
    data1 = client2.get(name="state_dict", fetch=True)
end = time.time()
print(f"Migrate Time taken: {end - start} seconds.")

start = time.time()
rpc_client = vineyard.connect(host="127.0.0.1", port=9611)
with torch_context(rpc_client):
    data2 = rpc_client.get(name="state_dict")
end = time.time()

assert len(state_dict) == len(data1), "State dict and data1 have different lengths"
assert len(state_dict) == len(data2), "State dict and data2 have different lengths"

for k, v in state_dict.items():
    assert torch.equal(v, data1[k]), f"Tensors for data1 key {k} do not match."
    assert torch.equal(v, data2[k]), f"Tensors for data2 key {k} do not match."

print(f"RPC Time taken: {end - start} seconds.")

print("All operations are completed.")

The result is as follows.

Migrate Time taken: 12.88494062423706 seconds.
RPC Time taken: 1.898338794708252 seconds.
All operations are completed.

Based on profiling, we found several issues in the current recursive implementation of ReceiveRemoteBuffers:

  1. When migrating an object with many blobs, the callback chain becomes too deep, which causes frequent context switches.
  2. The lambda captures also accumulate as the recursion deepens, which causes frequent memory allocations and frees.

Ideas:

1. Bind the same name to the new migrated object (put the name again after migration).
2. Record a migration map keyed by object id in the vineyard server, so that for the same object id there is at most one migration operation.
3. Replace the recursive implementation with a queue, and record all lambda objects in a shared struct.
github-actions[bot] commented 3 weeks ago

/cc @sighingnow, this issue/PR has had no activity for a long time; please help review its status and assign people to work on it.