triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Unrecognized configuration class to build an AutoTokenizer for microsoft/Florence-2-base-ft #7726

Closed: shihao28 closed this issue 3 weeks ago

shihao28 commented 1 month ago

**Description**
I was trying to host https://huggingface.co/microsoft/Florence-2-base-ft using Triton's Python-based vLLM backend and encountered the following error: `Unrecognized configuration class <class 'transformers_modules.microsoft.Florence-2-base.ee1f1f163f352801f3b7af6b2b96e4baaa6ff2ff.configuration_florence2.Florence2Config'> to build an AutoTokenizer.`
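For context, this message is raised by `transformers` when `AutoTokenizer` is handed a configuration class that has no entry in its tokenizer mapping. A minimal sketch of the kind of call that produces it (illustrative only; the exact call site inside vLLM may differ):

```python
from transformers import AutoConfig, AutoTokenizer

# Florence-2 ships custom modeling code on the Hub, so loading its config
# requires trust_remote_code=True.
config = AutoConfig.from_pretrained(
    "microsoft/Florence-2-base-ft", trust_remote_code=True
)

# If no tokenizer class is registered for Florence2Config, this raises
# "Unrecognized configuration class ... to build an AutoTokenizer."
tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/Florence-2-base-ft", config=config, trust_remote_code=True
)
```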

**Triton Information**
Image: nvcr.io/nvidia/tritonserver:24.09-vllm-python-py3. I pulled the image from the NVIDIA container registry and used it as-is.
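For completeness, the image can be pulled ahead of time with:

```
docker pull nvcr.io/nvidia/tritonserver:24.09-vllm-python-py3
```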

**To Reproduce**

  1. Set up a model repository directory: ~/work/model_repository/florence-2-base-ft/1
  2. Download model.json and config.pbtxt as suggested in the tutorial (an illustrative model.json is sketched after this list). config.pbtxt contains:

```
instance_group [
  {
    count: 1
    kind: KIND_MODEL
  }
]
```

  3. Run the Triton Inference Server:

```
cd ~/work
docker run --gpus all -it --net=host --rm -p 8001:8001 --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 -v ./:/models -w /work nvcr.io/nvidia/tritonserver:24.09-vllm-python-py3 tritonserver --model-store /models
```
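
For reference (not part of the original report), the model.json consumed by the Python-based vLLM backend carries the vLLM engine arguments. A minimal sketch, with field values that are assumptions rather than the reporter's actual settings:

```
{
    "model": "microsoft/Florence-2-base-ft",
    "trust_remote_code": true,
    "gpu_memory_utilization": 0.9
}
```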

**Error**
![image](https://github.com/user-attachments/assets/e753281d-2f44-4ffa-be13-3b7420e2283e)
rmccorm4 commented 3 weeks ago

Hi @shihao28, this looks like a lack of model support in vLLM itself. Please see this issue: https://github.com/vllm-project/vllm/issues/5934.
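
As a sanity check (not from the original thread), the limitation can be confirmed in vLLM directly, independent of Triton. A minimal sketch, assuming the vllm package shipped in the same container:

```python
from vllm import LLM, ModelRegistry

# Architectures registered in this vLLM build; Florence-2's architecture name
# (per its config.json, "Florence2ForConditionalGeneration") will be absent
# if the model is unsupported.
print(ModelRegistry.get_supported_archs())

# Attempting to load an unsupported model raises an error naming the
# unrecognized architecture.
llm = LLM(model="microsoft/Florence-2-base-ft", trust_remote_code=True)
```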

Feel free to re-open if this is incorrect.