The profiler script was using a hard-coded TRT-LLM backend and sending the wrong payload (sampling parameters) to the vLLM backend, causing issues with calculations and the expected number of tokens.
vLLM has a default "max_tokens" of 16, which wasn't being overridden correctly because the override logic assumed backend == "trtllm".
Added logic to detect the model's backend and pass it to the profiler script for context (see the sketch below).
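As a rough illustration of the fix, the request payload has to be shaped per backend. A minimal sketch, assuming the field names used by the vLLM and TRT-LLM Triton backends ("sampling_parameters" as a JSON string vs. a top-level "max_tokens") and a hypothetical `build_payload` helper, not the actual profiler code:

```python
import json

def build_payload(prompt: str, backend: str, max_tokens: int) -> dict:
    """Hypothetical helper: shape the generate request per backend."""
    if backend == "vllm":
        # The vLLM backend reads sampling options from a JSON string and
        # defaults max_tokens to 16, so it must be overridden explicitly.
        return {
            "text_input": prompt,
            "stream": False,
            "sampling_parameters": json.dumps({"max_tokens": max_tokens}),
        }
    # TRT-LLM takes max_tokens as a top-level request field.
    return {"text_input": prompt, "stream": False, "max_tokens": max_tokens}
```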
Add fallback logic to "triton server start"
Rather than deciding between a default of "local" or "docker" (the better default depends on the environment), the command now defaults to "fallback" logic: it tries "local" first, and if it fails to find a "tritonserver" binary, it tries "docker" mode.
If you explicitly specify a mode, it will only try that one.
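A minimal sketch of this fallback behavior; the helper structure, model-repository path, and image tag are illustrative assumptions, not the actual Triton CLI implementation:

```python
import shutil
import subprocess

def start_server(mode: str | None = None) -> subprocess.Popen:
    """Sketch: resolve the server mode, then launch only that mode."""
    if mode is None:
        # Fallback logic: prefer "local" when a tritonserver binary is on
        # PATH, otherwise fall back to "docker".
        mode = "local" if shutil.which("tritonserver") else "docker"
    if mode == "local":
        return subprocess.Popen(["tritonserver", "--model-repository=/models"])
    if mode == "docker":
        return subprocess.Popen([
            "docker", "run", "--rm", "--gpus=all",
            "-v", "/models:/models",
            "nvcr.io/nvidia/tritonserver:24.01-py3",  # image tag is illustrative
            "tritonserver", "--model-repository=/models",
        ])
    raise ValueError(f"Unknown server mode: {mode}")
```

Probing PATH with shutil.which keeps the check cheap: no process is launched just to watch it fail before falling back.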
Unify logic and helper functions between "bench" and other commands
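Conceptually, the unification means "bench" composes the same helpers the individual subcommands call, so the two paths can't drift apart. A minimal sketch with hypothetical handler names, not the actual Triton CLI functions:

```python
def handle_repo_clear(args) -> None: ...
def handle_repo_add(args) -> None: ...
def handle_server_start(args) -> None: ...
def handle_model_profile(args) -> None: ...

def handle_bench(args) -> None:
    # "bench" is just the individual subcommands run in sequence,
    # so both workflows exercise identical code paths.
    handle_repo_clear(args)
    handle_repo_add(args)
    handle_server_start(args)
    handle_model_profile(args)
```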
Add more defaults to argparse help texts
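argparse has two standard mechanisms for surfacing defaults in help text, shown together here only for illustration (in practice you'd pick one per parser); the "--mode" flag is just an example argument:

```python
import argparse

parser = argparse.ArgumentParser(
    prog="triton",
    # Appends "(default: ...)" to the help of every argument that has one.
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument(
    "--mode",
    choices=["local", "docker"],
    default=None,
    # Alternatively, interpolate the default into a single help string.
    help="server mode; unset means try local, then docker (default: %(default)s)",
)
```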
Locally fixed and verified that the "all-in-one" bench workflow and the individual subcommand workflow behave the same:
```
triton bench -m gpt2
```

and

```
triton repo clear
triton repo add -m gpt2
triton server start
triton model profile -m gpt2
```