marieai / marie-ai

Integrate AI-powered Document Analysis Pipelines
MIT License

Ensure that data for transmission is always safely encoded when including NumPy types #45

Closed gregbugaj closed 1 year ago

gregbugaj commented 1 year ago

Data needs to be safely encoded for transmission whenever NumPy objects are present.

Example return data that raises an error:

    import numpy as np

    np_arr = np.array([1, 2, 3])

    # The second item carries a NumPy array, which the protobuf Struct rejects.
    out = [
        {"sample": 112, "complex": ["a", "b"]},
        {"sample": 112, "complex": ["a", "b"], "np_arr": np_arr},
    ]

Exception:

marie.excepts.BadServer: request_id: "04899407ec50441bb90444987b14f303"
status {
  code: ERROR
  description: "ValueError(\'Unexpected type\')"
  exception {
    name: "ValueError"
    args: "Unexpected type"
    stacks: "Traceback (most recent call last):\n"
    stacks: "  File \"/dev/marieai/marie-ai/marie/serve/runtimes/worker/__init__.py\", line 265, in process_data\n    result = await self._request_handler.handle(\n"
    stacks: "  File \"/dev/marieai/marie-ai/marie/serve/runtimes/worker/request_handling.py\", line 438, in handle\n    _ = self._set_result(requests, return_data, docs)\n"
    stacks: "  File \"/dev/marieai/marie-ai/marie/serve/runtimes/worker/request_handling.py\", line 363, in _set_result\n    requests[0].parameters = params\n"
    stacks: "  File \"/dev/marieai/marie-ai/marie/types/request/data.py\", line 276, in parameters\n    self.proto_wo_data.parameters.update(value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 820, in update\n    _SetStructValue(self.fields[key], value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 746, in _SetStructValue\n    struct_value.struct_value.update(value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 820, in update\n    _SetStructValue(self.fields[key], value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 749, in _SetStructValue\n    struct_value.list_value.extend(value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 838, in extend\n    self.append(value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 834, in append\n    _SetStructValue(self.values.add(), value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 746, in _SetStructValue\n    struct_value.struct_value.update(value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 820, in update\n    _SetStructValue(self.fields[key], value)\n"
    stacks: "  File \"/dev/marieai/marie-as-service/venv/lib/python3.8/site-packages/google/protobuf/internal/well_known_types.py\", line 751, in _SetStructValue\n    raise ValueError(\'Unexpected type\')\n"
    stacks: "ValueError: Unexpected type\n"
    executor: "ExtractExecutor"
  }
}

The proposal is a @safely_encoded decorator that gets our data ready for transmission by converting the returned object to JSON and back to an object.

    @safely_encoded
    @requests(on="/status")
    def status(self, **kwargs):
        np_arr = np.array([1, 2, 3])
        out = [
            {"sample": 112, "complex": ["a", "b"]},
            {"sample": 112, "complex": ["a", "b"], "np_arr": np_arr},
        ]
        return out
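
A minimal sketch of how such a decorator could work, assuming a plain JSON round trip is enough. This is not the actual Marie implementation: the helper name `_to_builtin` is hypothetical, and the sketch only covers synchronous functions. It relies on the fact that NumPy arrays and NumPy scalars expose `tolist()`, so no hard NumPy import is needed.

```python
import functools
import json


def _to_builtin(obj):
    """Fallback for json.dumps (hypothetical helper): convert array-like
    values into plain Python types."""
    # np.ndarray and NumPy scalar types both provide tolist(); duck typing
    # keeps this sketch independent of NumPy itself.
    if hasattr(obj, "tolist"):
        return obj.tolist()
    raise TypeError(f"Cannot encode object of type {type(obj).__name__}")


def safely_encoded(func):
    """Sketch only: round-trip the return value through JSON so that it
    contains nothing but dicts, lists, strings, numbers, booleans and None."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # dumps() calls _to_builtin for anything it cannot serialize;
        # loads() turns the JSON text back into plain Python objects.
        return json.loads(json.dumps(result, default=_to_builtin))

    return wrapper
```

With this sketch, the `np_arr` entry in the example above would come back as the plain list `[1, 2, 3]`. One side effect of the JSON round trip worth keeping in mind: tuples come back as lists and non-string dict keys come back as strings.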

gregbugaj commented 1 year ago

The initial implementation allows us to use the @safely_encoded decorator; however, it has to be the first decorator in the chain. It should be a generic decorator and not tied to the @requests decorator.

    @requests
    @safely_encoded
    def encoded(self, docs, **kwargs):
        return {}
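
As background on what decorator order means in Python: stacked decorators are applied bottom-up, so the one closest to the `def` wraps the function first. A small illustration with a hypothetical `tag` decorator (not part of Marie):

```python
applied = []


def tag(name):
    """Hypothetical decorator factory that records the order of application."""

    def decorator(func):
        applied.append(name)
        return func

    return decorator


@tag("outer")
@tag("inner")
def handler():
    return {}


print(applied)  # ['inner', 'outer']
```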

Additionally, we are able to return array-like objects from methods decorated with @requests:

    @requests
    def encoded(self, docs, **kwargs):
        return [{"a":1}, {"b":2}]