deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution
Apache License 2.0

[Neuron] Fix Neuron compilation logging #2095

Closed a-ys closed 1 week ago

a-ys commented 1 week ago

Description

Fixes Neuron compilation logging. Example logs that were missing:

2024-02-21 01:04:37.000772: 246 INFO ||NEURON_CC_WRAPPER||: Using a cached neff at /opt/ml/compilation/cache/neuronxcc-2.12.54.0+f631c2365/MODULE_b135f12bed698159448c+ede6753c/model.neff. Exiting with a successfully compiled graph.
2024-02-21 01:04:37.000823: 247 INFO ||NEURON_CACHE||: Compile cache path: /opt/ml/compilation/cache
2024-02-21 01:04:37.000875: 247 INFO ||NEURON_CC_WRAPPER||: Using a cached neff at /opt/ml/compilation/cache/neuronxcc-2.12.54.0+f631c2365/MODULE_b702d7ddf38e1da4abf8+ede6753c/model.neff. Exiting with a successfully compiled graph.
2024-02-21 01:04:38.000380: 248 INFO ||NEURON_CC_WRAPPER||: Using a cached neff at /opt/ml/compilation/cache/neuronxcc-2.12.54.0+f631c2365/MODULE_02b140725c88b5ccd391+ede6753c/model.neff. Exiting with a successfully compiled graph.
...

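Caused by the additional vLLM dependency introduced in the 0.28.0 container. The logging configuration in vLLM (defined here) does not specify disable_existing_loggers=False, so loggers that already exist when it is applied, including the ones that emit the Neuron compilation logs, end up disabled. This should be fixed upstream in vLLM; in the meantime, this PR manually re-enables the relevant loggers.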
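
For illustration only, here is a minimal sketch of the mechanism and of one way to re-enable loggers by hand. It is not the code from this PR. The logger name `Neuron.demo`, the `Neuron` prefix, and the helper `reenable_loggers` are hypothetical, and the bare `dictConfig` call merely stands in for any logging configuration that leaves `disable_existing_loggers` at its Python default of `True`.

```python
import logging
import logging.config


def reenable_loggers(prefixes):
    """Re-enable already-created loggers whose names start with any prefix.

    Hypothetical helper for illustration; returns the names it re-enabled.
    """
    reenabled = []
    for name, obj in logging.Logger.manager.loggerDict.items():
        # loggerDict may also contain PlaceHolder objects; only touch real loggers.
        if isinstance(obj, logging.Logger) and name.startswith(tuple(prefixes)):
            obj.disabled = False
            reenabled.append(name)
    return sorted(reenabled)


# A logger created before the configuration is applied (name is an assumption).
neuron_log = logging.getLogger("Neuron.demo")

# Stand-in for a configuration that omits disable_existing_loggers:
# the default (True) disables every existing logger not named in the config.
logging.config.dictConfig({"version": 1})
was_disabled = neuron_log.disabled

# Workaround: switch the relevant loggers back on.
reenabled = reenable_loggers(["Neuron"])
```

In this sketch `was_disabled` is `True` right after the configuration is applied, and `neuron_log.disabled` is `False` once the helper has run. Setting `disable_existing_loggers` to `False` in the configuration itself would avoid the need for the workaround, which is the upstream fix the description refers to.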