ockaro opened this issue 6 months ago (status: Open)
Hi @ockaro,
Can you please verify there is a clearml triton pod running? Also, how did you register the model? What was the clearml-serving command you used?
Side note - can you please move the issue to the clearml-serving repo? This seems unrelated to the helm chart; it looks like an issue with the serving itself.
Hi @jkhenning,
Thanks for moving the issue! Btw, I never had an issue registering a model using ClearML Pro together with the Docker container setup; I wasn't even required to name the model in the 'model.<backend_name>' form. With the self-hosted setup I get:

Invalid model name: Could not determine backend for model 'advanced_basic_classifier' with no backend in model configuration. Expected model name of the form 'model.<backend_name>'.

The Triton pod is up and running; the error above occurs inside the Triton pod.
I tried these two commands to register the model:
clearml-serving --id ad16b8ae3e2840c1b1b6eb94bbcf78f4 model add --engine triton --endpoint "advanced_basic_classifier.pytorch" --preprocess "src/preprocessing/preprocess.py" --model-id 837276fc8d8a443fb91f48d722300b0a --input-size 1 64 --input-name "INPUT__0" --input-type float32 --output-size 1 11 --output-name "OUTPUT__0" --output-type float32 --aux-config platform=pytorch_libtorch
clearml-serving --id ad16b8ae3e2840c1b1b6eb94bbcf78f4 model add --engine triton --endpoint "advanced_basic_classifier.pytorch" --preprocess "src/preprocessing/preprocess.py" --model-id 837276fc8d8a443fb91f48d722300b0a --aux-config .\config.pbtxt
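The preprocess script passed via --preprocess is not shown in the thread. For context, a clearml-serving preprocessing module typically defines a Preprocess class along the lines below. This is a minimal sketch modeled on the clearml-serving examples; the method signatures and the request body layout ("features" key, shapes) are assumptions here and should be checked against the version in use:

```python
from typing import Any
import numpy as np

class Preprocess(object):
    """Sketch of a clearml-serving preprocessing module (names per the
    clearml-serving example pattern; verify against your installed version)."""

    def __init__(self):
        # Called once when the endpoint is loaded.
        pass

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # Turn the request JSON into the (1, 64) float32 tensor the model expects
        # (matching --input-size 1 64 / --input-type float32 above).
        return np.array(body["features"], dtype=np.float32).reshape(1, 64)

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # Return the (1, 11) model output as a JSON-serializable dict.
        return {"scores": np.atleast_2d(data).tolist()}
```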
where the config.pbtxt looks like this:
backend: "pytorch"
platform: "pytorch_libtorch"
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 1, 64 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 1, 11 ]
  }
]
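For reference, the Triton error in this thread complains about an inconsistent 'platform'/'backend' pair. Per the Triton model configuration documentation, for a TorchScript model either field alone is sufficient, so a config.pbtxt specifying only the platform would look like the sketch below. Whether the clearml-serving Triton sidecar preserves this field when it regenerates the config is exactly what this issue is about, so this is illustrative only, not a verified fix:

```
platform: "pytorch_libtorch"
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 1, 64 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 1, 11 ]
  }
]
```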
Hi @jkhenning, is there any news on this issue? :)
Describe the bug
Hi there, it seems that when adding a PyTorch model to the self-hosted clearml-serving, the platform also needs to be specified. But neither specifying the platform with the --aux-config flag nor passing a config.pbtxt file via --aux-config works. In both cases I get the following error:
E0405 14:55:27.962135 35 model_repository_manager.cc:996] Poll failed for model directory 'advanced_basic_classifier.pytorch': unexpected 'platform' and 'backend' pair, got:, pytorch
What's your helm version?
3.14.3
What's your kubectl version?
1.25.2
What's the chart version?
7.8.1
Enter the changed values of values.yaml?
# -- ClearML generic configurations
clearml:
  apiAccessKey:
  apiSecretKey:
  apiHost: https://api.***.com
  filesHost: https://files.***.com
  webHost: https://app.***.com
  servingTaskId: "fec7d23cc2b848b48d15041ce965ed81"
# -- ClearML serving inference configurations
clearml_serving_inference:
# -- Ingress exposing configurations