Closed dashanji closed 1 week ago
The synchronization is by-design to avoid "put but cannot get" problem.
The synchronization is by-design to avoid "put but cannot get" problem.
Thanks for the details. The current get
function will get the metadata first (https://github.com/v6d-io/v6d/blob/main/python/vineyard/core/client.py#L577), maybe we can add a parameter in the GetObject to avoid getting the metadata twice. What do you think?
In addition, currently it seems that the implementation of python resolver does take a lot of time. Do you have any suggestions? Thanks!
./bin/vineyardd --socket /tmp/vineyard_test.sock --meta local
import vineyard
import numpy as np
client = vineyard.connect('/tmp/vineyard_test.sock')
data = np.arange(100000)
oid = client.put(data)
import time
s_time = time.time()
client._ipc_client.get(oid)
e_time = time.time()
elapsed_time = e_time - s_time
print(f"Average time: {elapsed_time} seconds")
Use the line_profiler (https://github.com/pyutils/line_profiler) to profile the python code.
$ kernprof -lv ../test.py
Average time: 0.0005114078521728516 seconds
Wrote profile results to test.py.lprof
Timer unit: 1e-06 s
Total time: 0.000226283 s
File: /opt/caoye/v6d/python/vineyard/core/resolver.py
Function: run at line 58
Line # Hits Time Per Hit % Time Line Contents
==============================================================
58 @profile
59 def run(self, obj: Any, **kw):
60 1 5.6 5.6 2.5 typename = obj.meta.typename
61 1 16.7 16.7 7.4 prefix, resolver = find_most_precise_match(typename, self._factory)
62 1 0.9 0.9 0.4 vineyard_client = kw.pop('__vineyard_client', None)
63 1 0.3 0.3 0.1 if prefix:
64 1 53.3 53.3 23.6 resolver_func_sig = inspect.getfullargspec(resolver)
65 1 0.6 0.6 0.3 if resolver_func_sig.varkw is not None:
66 value = resolver(obj, resolver=self, **kw)
67 else:
68 1 0.1 0.1 0.1 try:
69 # don't pass the `**kw`.
70 1 0.6 0.6 0.3 if 'resolver' in resolver_func_sig.args:
71 value = resolver(obj, resolver=self)
72 else:
73 1 134.7 134.7 59.5 value = resolver(obj)
74 except Exception as e:
75 raise RuntimeError( # pylint: disable=raise-missing-from
76 'Unable to construct the object using resolver: '
77 'typename is %s, resolver is %s' % (obj.meta.typename, resolver)
78 ) from e
79 1 0.3 0.3 0.1 if value is None:
80 # if the obj has been resolved by pybind types, and there's no proper
81 # resolver, it shouldn't be an error
82 if type(obj) is not Object: # pylint: disable=unidiomatic-typecheck
83 return obj
84
85 # we might `client.put(None)`
86 return None
87
88 # associate a reference to the base C++ object
89 1 0.1 0.1 0.1 try:
90 1 1.1 1.1 0.5 setattr(value, '__vineyard_ref', obj)
91 1 0.4 0.4 0.2 setattr(value, '__vineyard_client', vineyard_client)
92
93 # register methods
94 1 3.3 3.3 1.4 from vineyard.core.driver import get_current_drivers
95
96 1 8.0 8.0 3.5 get_current_drivers().resolve(value, obj.typename)
97 except AttributeError:
98 pass
99
100 1 0.2 0.2 0.1 return value
101 # keep it as it is
102 return obj
Total time: 1.934e-06 s
File: /opt/caoye/v6d/python/vineyard/core/resolver.py
Function: get_current_resolvers at line 123
Line # Hits Time Per Hit % Time Line Contents
==============================================================
123 @profile
124 def get_current_resolvers() -> ResolverContext:
125 '''Obtain current resolver context.'''
126 1 1.2 1.2 64.4 default_resolver = getattr(_resolver_context_local, 'default_resolver', None)
127 1 0.5 0.5 26.4 if not default_resolver:
128 default_resolver = default_resolver_context.extend()
129 1 0.2 0.2 9.2 return default_resolver
Total time: 0.000477614 s
File: /opt/caoye/v6d/python/vineyard/core/resolver.py
Function: get at line 194
Line # Hits Time Per Hit % Time Line Contents
==============================================================
194 @profile
195 def get(
196 client,
197 object_id: Optional[ObjectID] = None,
198 name: Optional[str] = None,
199 resolver: Optional[ResolverContext] = None,
200 fetch: bool = False,
201 **kwargs
202 ):
203 """Get vineyard object as python value.
204
205 .. code:: python
206
207 >>> arr_id = vineyard.ObjectID('00002ec13bc81226')
208 >>> arr = client.get(arr_id)
209 >>> arr
210 array([0, 1, 2, 3, 4, 5, 6, 7])
211
212 Parameters:
213 client: IPCClient or RPCClient
214 The vineyard client to use.
215 object_id: ObjectID
216 The object id that will be obtained from vineyard.
217 name: ObjectID
218 The object name that will be obtained from vineyard, ignored if
219 ``object_id`` is not None.
220 resolver:
221 When retrieving vineyard object, an optional *resolver* can be specified.
222 If no resolver given, the default resolver context will be used.
223 fetch:
224 Whether to trigger a migration when the target object is located on
225 remote instances.
226 kw:
227 User-specific argument that will be passed to the builder.
228
229 Returns:
230 A python object that return by the resolver, by resolving an vineyard object.
231 """
232 # wrap object_id
233 1 0.7 0.7 0.1 if object_id is not None:
234 1 2.2 2.2 0.5 if isinstance(object_id, (int, str)):
235 object_id = ObjectID(object_id)
236 elif name is not None:
237 object_id = client.get_name(name)
238
239 1 218.8 218.8 45.8 obj = client.get_object(object_id)
240 1 0.3 0.3 0.1 if resolver is None:
241 1 11.2 11.2 2.4 resolver = get_current_resolvers()
242 1 244.4 244.4 51.2 return resolver(obj, __vineyard_client=client, **kwargs)
/cc @sighingnow, this issus/pr has had no activity for a long time, please help to review the status and assign people to work on it.
You could add a new option instead, rather than change the default behavior, as lack of the sync may cause new bugs.
You could add a new option instead, rather than change the default behavior, as lack of the sync may cause new bugs.
Any further comment? @dashanji
You could add a new option instead, rather than change the default behavior, as lack of the sync may cause new bugs.
Any further comment? @dashanji
I will add it, as we can construct the object from the meta. https://github.com/v6d-io/v6d/pull/1893
Please add default value to python docstrings.
Please add default value to python docstrings.
Done.
Please make sure you are using consistent true/false for default values, and make sure the default value yields the same behavior as the previous settings (before this pr).
What do these changes do?
Reproduce Steps
When we connect to an external vineyardd with etcd metadata, the latency is high as we'll sync the metadata of an object from the metadata service even if it's a local one.
Original Result
Result with this PR
Notice, this pr will not address the comment as it uses an internal vineyardd with local metadata.
Related issue number
Fix part of https://github.com/v6d-io/v6d/issues/1877