deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution
Apache License 2.0
182 stars, 59 forks

[Neo] Fixing various Neo compilation/quantization script bugs #2115

Open a-ys opened 1 week ago

a-ys commented 1 week ago

Description

This PR includes various fixes for the Neo Neuron compilation script.

9253e422729b0f9453773d38835a1212a597c1c5 Hard-codes engine=Python into the Neo Neuron compilation script so that errors in the customer's serving.properties do not cause compilation to fail.

405ea55b8013579ea84155551a36c29c59fbb182 Removes a dangling reference to TARGET_INSTANCE_TYPE in the Neo Quantization script.

36810ccd7fb555e4c86b464d42c7150a60542a9a Adds logic to pass the customer's engine and option.entryPoint through to the output serving.properties. Compilation itself runs with the hardcoded values engine=Python and option.entryPoint=djl_python.transformers_neuronx; the customer's own values for these two keys are written to the output so that custom entrypoints remain supported.
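The override-then-pass-through behavior described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the `PASSTHROUGH_KEYS` constant, and the sample customer values are all hypothetical, and the fallback to the hardcoded value when the customer omits a key is an assumption.

```python
# Hypothetical sketch; none of these names are taken from the actual Neo scripts.
HARDCODED_COMPILE_VALUES = {
    "engine": "Python",
    "option.entryPoint": "djl_python.transformers_neuronx",
}
PASSTHROUGH_KEYS = ("engine", "option.entryPoint")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def compile_properties(customer):
    """Properties used for compilation: the hardcoded values always win."""
    return {**customer, **HARDCODED_COMPILE_VALUES}


def output_properties(customer, compiled):
    """Properties for the output file: customer values are restored for the
    pass-through keys. Assumption: if the customer did not set a key, the
    hardcoded value is kept."""
    out = dict(compiled)
    for key in PASSTHROUGH_KEYS:
        if key in customer:
            out[key] = customer[key]
    return out


# Sample customer serving.properties content (made up for illustration).
customer = parse_properties(
    "engine=MPI\n"
    "option.entryPoint=my_custom_handler.py\n"
    "option.tensor_parallel_degree=2\n"
)
compiled = compile_properties(customer)
final = output_properties(customer, compiled)
```

With this split, a bad `engine` value in the customer's file cannot break compilation, while a custom `option.entryPoint` still reaches the output file.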

c1556ec0028e7bfafd1ed0c692cf7a4b2d37a117 Changes the output file format to the following: