triton-inference-server / triton_cli

Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.

chore: Update TRT-LLM checkpoint scripts to v0.10 and Fix Github Actions Pipeline #78

Closed KrishnanPrash closed 3 months ago

KrishnanPrash commented 3 months ago

Two Sets of Changes in this PR:

1. Upgrading TensorRT-LLM Checkpoint Scripts

Upgrading the convert_checkpoint.py scripts for gpt2, llama, and opt to support tensorrt_llm v0.10.0. Source for the code changes: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.10.0/examples

2. GitHub Actions Workflow Fix

(Credit to @rmccorm4 for figuring this out.) The Triton CLI's GitHub Actions pipeline is currently failing on the test case test_non_llm[http]. During testing, starting a mock server (ScopedTritonServer) calls Popen("triton start"), creating a new process. Then, when an individual triton command is tested, another Popen("triton ...") call creates a sub-process. When that sub-process fails or errors out, it returns an error code to the parent, but the parent process does not terminate: it ends up hanging in a zombie state, no longer doing any work yet still present as a valid process. This is what causes test_non_llm[http] to hang indefinitely. One potential reason for the sub-process hanging is a port conflict: a previous tritonserver instance that did not terminate correctly may still be occupying ports 8000-8002.
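In POSIX terms, a zombie is a process that has exited but whose exit status has not yet been reaped by its parent. A minimal sketch of that mechanism with Popen (illustrative only, not the actual test code from this repo):

```python
import subprocess
import sys
import time

# Spawn a short-lived child that exits immediately with an error code,
# standing in for a failing "triton ..." sub-process.
child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(1)"])

# Give the child time to exit. Until the parent reaps it, the kernel
# keeps a zombie ("defunct") entry for it in the process table.
time.sleep(0.5)

# poll() reaps the child if it has exited and returns its exit code;
# a parent that never calls poll()/wait() leaves the zombie behind.
code = child.poll()
if code is None:  # very slow machine: fall back to a blocking wait
    code = child.wait()
print(code)
```

If the parent instead blocks forever waiting on a child that never makes progress (for example, because the child cannot bind its ports), the test run hangs rather than failing cleanly.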

For now, these hanging tests will be skipped in GitHub Actions and will still run in GitLab. This will be further investigated/fixed in a follow-up PR.
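The suspected port conflict can be checked up front with a simple bind test. A minimal sketch (the function name is hypothetical; the port list matches Triton's default HTTP/gRPC/metrics ports mentioned above):

```python
import socket

def free_ports(ports):
    """Return the subset of `ports` that can currently be bound on localhost."""
    free = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                s.bind(("127.0.0.1", port))
                free.append(port)  # bind succeeded, so nothing holds this port
            except OSError:
                pass  # something (e.g. a stale tritonserver) still owns it
    return free

# Triton's defaults: 8000 (HTTP), 8001 (gRPC), 8002 (metrics).
print(free_ports([8000, 8001, 8002]))
```

A check like this could run before starting the mock server, so a leftover tritonserver instance produces an immediate, explicit failure instead of an indefinite hang.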

rmccorm4 commented 3 months ago

FYI, if the GitHub trigger check is blocking this, the runner is currently down; feel free to manually start the GitLab pipeline, and we can merge if that looks good.