Bug description
When running
python3 triton-inference.py --input "Paris is the [MASK] of France."
the following is returned:
Processing input...
Input processed.
Executing model...
Model executed.
Traceback (most recent call last):
  File "/home/anthony/max/examples/inference/bert-python-torchscript/triton-inference.py", line 121, in <module>
    main()
  File "/home/anthony/max/examples/inference/bert-python-torchscript/triton-inference.py", line 104, in main
    logits = torch.from_numpy(outputs[0, masked_index, :])
IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed
Note that the run command was modified in order to get past the issue reported in https://github.com/modularml/max/issues/181; that should not affect this ticket.
Note also that the Python command was run directly instead of via the deploy script, for the same reason; again, this should not make a difference.
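For what it's worth, the error message suggests the server is returning logits already squeezed to 2-D, shape (sequence_length, vocab_size), while line 104 of the script indexes as if the output were the batched 3-D shape (batch, sequence_length, vocab_size). A minimal shape-tolerant workaround, assuming that interpretation is right (the array shapes below are illustrative, not taken from the actual model; the real script then wraps the slice with torch.from_numpy):

```python
import numpy as np

# Simulated server response: some deployments return logits without the
# batch dimension, i.e. (sequence_length, vocab_size) instead of
# (1, sequence_length, vocab_size).
outputs = np.random.rand(10, 30522).astype(np.float32)
masked_index = 4

# Index defensively based on the actual number of dimensions.
if outputs.ndim == 3:
    logits = outputs[0, masked_index, :]   # batched output
else:
    logits = outputs[masked_index, :]      # already squeezed to 2-D

print(logits.shape)  # (30522,)
```

With outputs as 2-D above, the 3-index expression from the traceback (`outputs[0, masked_index, :]`) raises exactly the reported IndexError, while the branch keyed on `ndim` returns the per-token vocabulary logits in either case.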
System information
- WSL 2
- Latest image `public.ecr.aws/modular/max-serving-de:latest`
- Instructions from https://docs.modular.com/max/serve/get-started
- Last commit SHA f2f49795d6f27039d24c14d92168c7892c3b2b49
- https://github.com/modularml/max/blob/main/examples/inference/bert-python-torchscript/deploy.sh
Steps to reproduce
Follow the instructions from https://docs.modular.com/max/serve/get-started with the system information listed above, then run:
python3 triton-inference.py --input "Paris is the [MASK] of France."