vruusmann closed 3 years ago
Right now the signature of the main Java evaluation method is this:
public java.util.Map<FieldName, ?> evaluate(java.util.Map<FieldName, ?> arguments);
The API should be extended to make alternative "custom dict/map type"-oriented method signatures possible.
For example, when passing Python dicts using the Pickle data format:
public net.razorvine.pickle.objects.ClassDict evaluate(net.razorvine.pickle.objects.ClassDict arguments);
The interconversion between ClassDict and Map<FieldName, ?> should happen atomically within the Java library code.
Current data flow:
1. Python dict to Java map conversion: JavaGateway.dict2map(dict) abstract method: https://github.com/jpmml/jpmml-evaluator-python/blob/0.4.2/jpmml_evaluator/__init__.py#L15-L16
2. Evaluation of the Java map using the evaluate method shown above.
3. Java map to Python dict conversion: JavaGateway.map2dict(map) abstract method: https://github.com/jpmml/jpmml-evaluator-python/blob/0.4.2/jpmml_evaluator/__init__.py#L18-L19
It appears that steps 1 and 3 are rate-limiting when dealing with larger data batches. A possible solution would be to avoid dict/map conversions in the Python layer altogether.
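The current flow, with a conversion on each side of every evaluation, can be sketched roughly as below. This is only an illustration: `Gateway`, `StubGateway`, `evaluate_record` and `fake_evaluate` are hypothetical stand-ins rather than the library's actual classes, and the Java side is simulated with plain Python objects so that the number of conversions per record is easy to see.

```python
from abc import ABC, abstractmethod


class Gateway(ABC):
    """Hypothetical stand-in; only the two conversion hooks are modelled."""

    @abstractmethod
    def dict2map(self, py_dict):
        ...

    @abstractmethod
    def map2dict(self, java_map):
        ...


class StubGateway(Gateway):
    """Simulates the Java side in pure Python and counts conversions."""

    def __init__(self):
        self.conversions = 0

    def dict2map(self, py_dict):
        self.conversions += 1
        # A real backend would build a java.util.Map here.
        return dict(py_dict)

    def map2dict(self, java_map):
        self.conversions += 1
        return dict(java_map)


def evaluate_record(gateway, evaluate, record):
    java_arguments = gateway.dict2map(record)   # Python dict -> Java map
    java_results = evaluate(java_arguments)     # evaluation
    return gateway.map2dict(java_results)       # Java map -> Python dict


def fake_evaluate(arguments):
    # Placeholder "model" (an assumption for the demo): sums the inputs.
    return {"y": sum(arguments.values())}


gateway = StubGateway()
batch = [{"x1": 1.0, "x2": 2.0}, {"x1": 3.0, "x2": 4.0}]
results = [evaluate_record(gateway, fake_evaluate, record) for record in batch]
```

In this sketch every record pays for two conversions, so the conversion count grows linearly with the batch size.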
Refactored data flow:
This approach could be used for passing single data records (a single dict) or batches of data records (a list of dicts, a Pandas DataFrame).
The Pickle data format can be read and written using the awesome Pickle library.
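On the Python side, one possible reading of this proposal is that a record or a whole batch is pickled once, handed over as bytes, and the pickled results are unpickled once on return. The sketch below assumes exactly that. `encode_arguments`, `decode_results` and `fake_java_evaluate` are hypothetical names, and the Java side is replaced by a Python stub, because the real entry point does not exist yet. Protocol 2 is picked as a conservative assumption; the protocols actually supported by the Java Pickle library should be checked.

```python
import pickle


def encode_arguments(records):
    """Pickle one record (dict) or a batch (list of dicts) in a single call."""
    return pickle.dumps(records, protocol=2)


def decode_results(payload):
    """Unpickle the results returned by the evaluator."""
    return pickle.loads(payload)


def fake_java_evaluate(payload):
    """Stub for the Java side (an assumption for the demo).

    A real implementation would unpickle into its own dict type, convert to
    Map<FieldName, ?> internally, evaluate, and pickle the results.
    """
    records = pickle.loads(payload)
    single = isinstance(records, dict)
    rows = [records] if single else records
    # Placeholder "model": sums the inputs of each row.
    results = [{"y": sum(row.values())} for row in rows]
    return pickle.dumps(results[0] if single else results, protocol=2)


# Single data record
single_result = decode_results(
    fake_java_evaluate(encode_arguments({"x1": 1.0, "x2": 2.0}))
)

# Batch of data records
batch_result = decode_results(
    fake_java_evaluate(encode_arguments([{"x1": 1.0}, {"x1": 2.0, "x2": 3.0}]))
)
```

A Pandas DataFrame could presumably be fed through the same path by first turning it into a list of dicts, at the cost of one extra conversion in Python.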