deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution

Apache License 2.0

[Serving] Implement SageMaker Secure Mode & support for multiple data sources #2042

Closed ethnzhng closed 1 month ago

ethnzhng commented 1 month ago

This draft PR adds the initial implementation of SageMaker Secure Mode, as well as support for multiple data sources.

I have tested the security control scenarios locally in Docker, and am currently working on adding unit tests and integration tests.

Summary of functionality

Basic support for additional model data sources

Install any requirements.txt files found in trusted additional data sources.
Note: Only the serving.properties found in the main model directory, /opt/ml/model, is applied (same as the existing behavior). Within the scope of this PR, serving.properties files found in other data sources are ignored.
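A minimal sketch of the requirements handling described above, not the PR's actual code. The function name `requirements_install_commands` is hypothetical, and the sketch assumes requirements.txt sits at the top level of each trusted source. It only builds the pip commands and leaves running them to the caller.

```python
import sys
from pathlib import Path


def requirements_install_commands(trusted_sources):
    """Build one pip command per trusted source that ships a requirements.txt.

    Hypothetical helper: assumes the file sits at the top level of the source.
    """
    commands = []
    for source in trusted_sources:
        req = Path(source) / "requirements.txt"
        if req.is_file():
            # Use the running interpreter so packages land in its environment.
            commands.append(
                [sys.executable, "-m", "pip", "install", "-r", str(req)]
            )
    return commands
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)`. Untrusted sources are never passed in, so their requirements.txt files are never installed.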

Specify trusted and untrusted data source paths

Each additional data source, as well as the main model directory /opt/ml/model, can be designated as trusted or untrusted by the SageMaker platform.
Trusted paths are not subject to security scans, while untrusted paths are.

Configure individual DLC-level security controls which scan untrusted paths

If Secure Mode and its required associated variables are set, each enabled security control runs against every untrusted path. On any security violation, the model server fails fast and exits.
- Disallow requirements.txt – check for requirements.txt file
- Disallow pickle files – check for files with pickle file extensions
- Disallow trust_remote_code – check if this option is set via env vars or untrusted serving.properties
- Disallow custom entryPoint – env vars or untrusted serving.properties can only set entryPoint to built-in modules, e.g. djl_python. Trusted serving.properties can set it to anything.
- Disallow Jinja chat_template – check if this field is set in tokenizer_config.json file
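The controls listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the helper names are hypothetical, the pickle extension list is a guess, "built-in" is approximated as any module under `djl_python.`, and the keys `option.trust_remote_code` and `option.entryPoint` are assumed to be the relevant serving.properties entries. The env var checks and the per-control enable switches are left out for brevity.

```python
import json
from pathlib import Path

# Assumption: the exact extension list is not given in this description.
PICKLE_EXTENSIONS = {".pkl", ".pickle"}
# Assumption: "built-in" is approximated as any module under djl_python.
BUILTIN_ENTRYPOINT_PREFIX = "djl_python."


def read_properties(path):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    if path.is_file():
        for line in path.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    return props


def scan_untrusted_path(root):
    """Return a list of violation messages for one untrusted path."""
    root = Path(root)
    violations = []

    # Disallow requirements.txt
    for req in sorted(root.rglob("requirements.txt")):
        violations.append(f"requirements.txt found: {req}")

    # Disallow pickle files (by file extension only)
    for f in sorted(root.rglob("*")):
        if f.is_file() and f.suffix.lower() in PICKLE_EXTENSIONS:
            violations.append(f"pickle file found: {f}")

    # Disallow trust_remote_code and custom entryPoint in serving.properties
    props = read_properties(root / "serving.properties")
    if props.get("option.trust_remote_code", "").lower() == "true":
        violations.append("trust_remote_code set in untrusted serving.properties")
    entry = props.get("option.entryPoint")
    if entry and not entry.startswith(BUILTIN_ENTRYPOINT_PREFIX):
        violations.append(f"custom entryPoint: {entry}")

    # Disallow Jinja chat_template in tokenizer_config.json
    tok = root / "tokenizer_config.json"
    if tok.is_file():
        config = json.loads(tok.read_text())
        if isinstance(config, dict) and config.get("chat_template") is not None:
            violations.append("chat_template set in tokenizer_config.json")

    return violations


def enforce(untrusted_paths):
    """Fast-fail: exit on the first untrusted path with any violation."""
    for path in untrusted_paths:
        violations = scan_untrusted_path(path)
        if violations:
            raise SystemExit("Secure Mode violation(s): " + "; ".join(violations))
```

Returning a list from `scan_untrusted_path` keeps the checks easy to unit test, while `enforce` supplies the fast-fail behavior by raising `SystemExit` before the model server would start.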

Example scenarios:

If Secure Mode is not enabled, there is no change from the normal LMI flow.
If Secure Mode is enabled:
- Trusted base model + trusted draft model --> No additional restrictions
- Untrusted base model + trusted draft model --> Fast-fail if any security violations
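The scenarios above reduce to a small decision: which paths, if any, get scanned. A minimal sketch, assuming a hypothetical `paths_to_scan` helper and a mapping from each data source path to its trust designation (the draft model path used in the usage example is illustrative only):

```python
def paths_to_scan(secure_mode_enabled, sources):
    """Return the paths that must pass the security controls.

    `sources` maps a path to True if trusted, False if untrusted.
    Hypothetical helper; how the platform communicates these
    designations is not covered here.
    """
    if not secure_mode_enabled:
        # Normal LMI flow: nothing is scanned.
        return []
    return [path for path, trusted in sources.items() if not trusted]
```

For example, with an untrusted base model and a trusted draft model:

```python
sources = {"/opt/ml/model": False, "/opt/ml/draft": True}  # draft path is illustrative
paths_to_scan(True, sources)   # only the base model directory is scanned
paths_to_scan(False, sources)  # Secure Mode off: nothing is scanned
```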