This PR includes various fixes for the Neo Neuron compilation script.
9253e422729b0f9453773d38835a1212a597c1c5 Hard-codes engine=Python into the Neo Neuron compilation script so that errors in the customer's serving.properties do not cause compilation to fail.
405ea55b8013579ea84155551a36c29c59fbb182 Removes a dangling reference to TARGET_INSTANCE_TYPE in the Neo Quantization script.
36810ccd7fb555e4c86b464d42c7150a60542a9a Adds logic to pass engine and option.entryPoint through to the output serving.properties. Compilation still runs with the hard-coded values engine=Python and option.entryPoint=djl_python.transformers_neuronx, but the customer's values for these two keys are written to the output so that custom entrypoints remain supported.
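The pass-through behavior can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function names, the dict-based representation of serving.properties, and the constant names are all assumptions made for the example.

```python
# Hypothetical sketch of the pass-through logic; names are illustrative
# and do not come from the actual compilation script.

# Values forced during compilation (taken from the PR description).
COMPILE_OVERRIDES = {
    "engine": "Python",
    "option.entryPoint": "djl_python.transformers_neuronx",
}

# Keys whose customer-supplied values should survive into the output.
PASS_THROUGH_KEYS = ("engine", "option.entryPoint")


def compile_time_properties(customer_props):
    """Return the properties used for compilation, with overrides applied."""
    props = dict(customer_props)
    props.update(COMPILE_OVERRIDES)
    return props


def output_properties(customer_props, compile_props):
    """Return the properties to write out, restoring customer values."""
    out = dict(compile_props)
    for key in PASS_THROUGH_KEYS:
        if key in customer_props:
            out[key] = customer_props[key]
    return out
```

With a customer file that sets option.entryPoint to a custom handler, compile_time_properties would still yield the hard-coded entrypoint, while output_properties would restore the customer's handler. If the customer sets neither key, the hard-coded values are what end up in the output.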
c1556ec0028e7bfafd1ed0c692cf7a4b2d37a117 Changes the output file format to the following:
Files in the input directory are directly copied to the output.
The outputs of compilation are saved in a subdirectory of the output: optimized_model
The output serving.properties sets model_id=./optimized_model so that the compiled model is used during deployment.
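As a rough sketch of the resulting layout, assuming the properties are held in a dict and written as plain key=value lines: the helper name lay_out_output and its parameters are hypothetical, and the model_id key is written exactly as it appears in this description.

```python
# Hypothetical sketch of the output layout; not the script's actual code.
import shutil
from pathlib import Path


def lay_out_output(input_dir, output_dir, properties):
    """Copy the input files, create optimized_model, write serving.properties."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)

    # Files in the input directory are copied directly to the output.
    shutil.copytree(input_dir, output_dir, dirs_exist_ok=True)

    # Compilation outputs would be saved in this subdirectory.
    compiled_dir = output_dir / "optimized_model"
    compiled_dir.mkdir(parents=True, exist_ok=True)

    # Point deployment at the compiled model.
    props = dict(properties)
    props["model_id"] = "./optimized_model"
    lines = [f"{key}={value}" for key, value in props.items()]
    (output_dir / "serving.properties").write_text("\n".join(lines) + "\n")
    return compiled_dir
```

The returned path is where a caller would save the compiled artifacts. Note that in this sketch a serving.properties copied from the input directory is overwritten by the generated one.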