google / ml-metadata

For recording and retrieving metadata associated with ML developer and data scientist workflows.
https://www.tensorflow.org/tfx/guide/mlmd
Apache License 2.0

Cannot filter by the ID of a parent context #155

Closed: irvinktang closed this issue 1 year ago

irvinktang commented 2 years ago

I'm currently trying to use mlmd.ListOptions to write more complex queries. I attempted to run the following query just for test purposes:

store.get_contexts(mlmd.ListOptions(filter_query='parent_contexts_a.id = 2'))

However, I get the following error message:

WARNING:absl:mlmd client InternalError: mysql_query failed: errno: Unknown column 'table_1.id' in 'where clause', error: Unknown column 'table_1.id' in 'where clause'
---------------------------------------------------------------------------
_InactiveRpcError                         Traceback (most recent call last)
File /opt/env/lib/python3.9/site-packages/ml_metadata/metadata_store/metadata_store.py:213, in MetadataStore._call_method(self, method_name, request, response)
    212 try:
--> 213   response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
    214 except grpc.RpcError as e:
    215   # RpcError code uses a tuple to specify error code and short
    216   # description.
    217   # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode

File /opt/env/lib/python3.9/site-packages/grpc/_channel.py:923, in _UnaryUnaryMultiCallable.__call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    921 state, call, = self._blocking(request, timeout, metadata, credentials,
    922                               wait_for_ready, compression)
--> 923 return _end_unary_response_blocking(state, call, False, None)

File /opt/env/lib/python3.9/site-packages/grpc/_channel.py:826, in _end_unary_response_blocking(state, call, with_call, deadline)
    825 else:
--> 826     raise _InactiveRpcError(state)

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INTERNAL
    details = "mysql_query failed: errno: Unknown column 'table_1.id' in 'where clause', error: Unknown column 'table_1.id' in 'where clause'"
    debug_error_string = "{"created":"@1650487173.166142966","description":"Error received from peer ipv4:172.20.183.237:8080","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"mysql_query failed: errno: Unknown column 'table_1.id' in 'where clause', error: Unknown column 'table_1.id' in 'where clause'","grpc_status":13}"
>

During handling of the above exception, another exception occurred:

InternalError                             Traceback (most recent call last)
Input In [45], in <cell line: 1>()
----> 1 store.get_contexts(mlmd.ListOptions(filter_query=('parent_contexts_a.id=23')))

File /opt/env/lib/python3.9/site-packages/ml_metadata/metadata_store/metadata_store.py:1034, in MetadataStore.get_contexts(self, list_options)
   1020 """Gets contexts.
   1021 
   1022 Args:
   (...)
   1031   errors.InvalidArgument: if list_options is invalid.
   1032 """
   1033 request = metadata_store_service_pb2.GetContextsRequest()
-> 1034 return self._call_method_with_list_options('GetContexts', 'contexts',
   1035                                            request, list_options)

File /opt/env/lib/python3.9/site-packages/ml_metadata/metadata_store/metadata_store.py:980, in MetadataStore._call_method_with_list_options(self, method_name, entity_field_name, request_without_list_options, list_options)
    977 if return_size and return_size < MAX_NUM_RESULT:
    978   request.options.max_result_size = return_size
--> 980 self._call(method_name, request, response)
    981 entities = getattr(response, entity_field_name)
    982 for x in entities:

File /opt/env/lib/python3.9/site-packages/ml_metadata/metadata_store/metadata_store.py:188, in MetadataStore._call(self, method_name, request, response)
    186 while True:
    187   try:
--> 188     return self._call_method(method_name, request, response)
    189   except errors.AbortedError:
    190     num_retries -= 1

File /opt/env/lib/python3.9/site-packages/ml_metadata/metadata_store/metadata_store.py:218, in MetadataStore._call_method(self, method_name, request, response)
    213   response.CopyFrom(grpc_method(request, timeout=self._grpc_timeout_sec))
    214 except grpc.RpcError as e:
    215   # RpcError code uses a tuple to specify error code and short
    216   # description.
    217   # https://grpc.github.io/grpc/python/_modules/grpc.html#StatusCode
--> 218   raise _make_exception(e.details(), e.code().value[0])

InternalError: mysql_query failed: errno: Unknown column 'table_1.id' in 'where clause', error: Unknown column 'table_1.id' in 'where clause'

Filtering by parent_contexts_a.name and parent_contexts_a.type works; only parent_contexts_a.id fails. Am I doing something wrong, or is this a bug?

Thanks
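
For reference, the two filters reported as working above can be composed as in the sketch below. `parent_context_filter` is a hypothetical helper, not part of ml_metadata, and it assumes string values are written as single-quoted literals in filter_query. It rejects values containing a quote, since escaping rules are not covered in this thread.

```python
def parent_context_filter(name=None, type_name=None):
    """Build a filter_query string on the parent context's name and/or type.

    Hypothetical helper for illustration; not part of the ml_metadata API.
    """
    clauses = []
    for field, value in (("name", name), ("type", type_name)):
        if value is None:
            continue
        if "'" in value:
            # Quote escaping is not covered in this thread, so refuse the value.
            raise ValueError(f"unsupported quote in {field}: {value!r}")
        clauses.append(f"parent_contexts_a.{field} = '{value}'")
    if not clauses:
        raise ValueError("give at least one of name or type_name")
    return " AND ".join(clauses)


# Usage against a real store (assumes `store` is a connected mlmd.MetadataStore
# and that the example names exist in your database):
#   import ml_metadata as mlmd
#   query = parent_context_filter(name="my_pipeline", type_name="pipeline")
#   contexts = store.get_contexts(mlmd.ListOptions(filter_query=query))
```

The helper only builds the string, so it can be checked without a database connection.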

BrianSong commented 2 years ago

Hi @irvinktang, sorry for getting back to this issue late and thanks for bringing this up!

I have successfully reproduced the error on my end, and it appears to be a bug on the MLMD side. I have created an internal bug for this and will update this issue when there is progress.
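
In the meantime, a possible workaround is to skip filter_query and ask the store for the children of a given parent context by ID. The sketch below assumes the installed ml_metadata version exposes `MetadataStore.get_children_contexts_by_context`; `contexts_with_parent_ids` is a hypothetical helper name, and the store is passed in so that any object with that method works.

```python
def contexts_with_parent_ids(store, parent_ids):
    """Return contexts whose parent context ID is in parent_ids.

    Hypothetical helper; assumes `store` provides
    get_children_contexts_by_context(context_id), as mlmd.MetadataStore is
    assumed to in the installed version.
    """
    found = {}
    for parent_id in parent_ids:
        for ctx in store.get_children_contexts_by_context(parent_id):
            # A context can have several parents; keep one copy per context ID.
            found.setdefault(ctx.id, ctx)
    return list(found.values())


# Usage (assumes `store` is a connected mlmd.MetadataStore):
#   children = contexts_with_parent_ids(store, [2])
```

This issues one call per parent ID rather than a single filtered query, which should be acceptable for a handful of parents but may be slow for many.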

XinranTang commented 1 year ago

Hi @irvinktang, filtering by parent context ID is now supported as of commit e60c57fbb4d07eb692edf41b638678b932314104.