iterative / mlem

🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞
https://mlem.ai
Apache License 2.0

Implementation request_serialize and response_serialize #567

Open alanmelone opened 1 year ago

alanmelone commented 1 year ago

Hi, this is the Akvelon team. We want to support the request_serialize and response_serialize flags, but we need information about the input and output data for values other than the default. Could you please explain how these parameters (request_serialize and response_serialize) work? We need this to implement our R2 Release clients correctly.

mike0sv commented 1 year ago

So it works like this. Model methods have argument types and return types defined as DataType. Those can be arbitrary supported Python objects like numpy arrays, pandas data frames or native Python objects. Now, when you serve the model, you need a way to turn those objects into bytes to send over the network (in the request or in the response). There are two ways to do this: 1) turn the object into a Python dict and create JSON from it (e.g. pd.df.to_dict(..)), or 2) turn the object into a bytestream directly (e.g. if it is a tensor created from an image, we can turn it back into an image). This is where serializers come into play (they are actually both serializers and deserializers). There are two types of them, one for each approach: DataSerializer and BinarySerializer, respectively. Each DataType also has a default serializer, which is used if nothing else is configured. The request_serializer and response_serializer options let you use some other serializer instead of the default one.
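To make the two approaches concrete, here is a minimal, standard-library-only sketch. It is not MLEM's actual code: the classes ListJsonSerializer and ListBinarySerializer and the helpers pick_serializer and to_wire are hypothetical names invented for illustration, and a plain list of floats stands in for a real DataType such as a numpy array.

```python
import json
import struct
from typing import Any, Dict, List, Optional


class ListJsonSerializer:
    """Hypothetical dict-style serializer: object -> dict (-> JSON)."""

    def serialize(self, values: List[float]) -> Dict[str, Any]:
        return {"values": list(values)}

    def deserialize(self, payload: Dict[str, Any]) -> List[float]:
        return [float(v) for v in payload["values"]]


class ListBinarySerializer:
    """Hypothetical binary serializer: object -> raw bytes."""

    def serialize(self, values: List[float]) -> bytes:
        # 4-byte little-endian count, then each value as an 8-byte double.
        return struct.pack(f"<I{len(values)}d", len(values), *values)

    def deserialize(self, blob: bytes) -> List[float]:
        (count,) = struct.unpack_from("<I", blob, 0)
        return list(struct.unpack_from(f"<{count}d", blob, 4))


# Assumed default for this toy data type; used when nothing is configured.
DEFAULT_SERIALIZER = ListJsonSerializer()


def pick_serializer(override: Optional[Any] = None) -> Any:
    """Return the configured serializer, falling back to the default."""
    return override if override is not None else DEFAULT_SERIALIZER


def to_wire(values: List[float], override: Optional[Any] = None) -> bytes:
    """Turn the object into the bytes that would go over the network."""
    payload = pick_serializer(override).serialize(values)
    if isinstance(payload, bytes):
        return payload  # binary serializers already produce bytes
    return json.dumps(payload).encode("utf-8")  # dict -> JSON -> bytes


json_body = to_wire([1.0, 2.5])
binary_body = to_wire([1.0, 2.5], override=ListBinarySerializer())
```

In this sketch, passing an override plays the role that the request_serializer and response_serializer options play on a real server: the data type stays the same, only the wire format changes, so a client has to know which serializer the server was configured with.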

Lastly, DataSerializer also provides a schema for the payload, which in turn is used by FastAPI to create the OpenAPI spec for the method. In Python you can get it like this: model = serializer.get_model(); schema = model.schema(). I can help you with the code to get the correct serializer from the server configuration and the model.

Btw, in mlem the client is constructed dynamically from the /interface.json endpoint, which already has the correct data types and serializers for this server instance.