It seems that inference is run inside the Slicer process: https://github.com/SlicerMorph/SlicerMEMOS/blob/4b30659ee57b2be49b58c99822e16a221b12c4d0/MEMOS/MEMOS.py#L245
Many things can go wrong during a complex, heavy computation, so it is worth isolating it from the application by running it in a separate process. That way you can keep using the application while the computation is running, you can stop the computation at any time by killing its process, and the application does not crash if the computation process is terminated (e.g., due to an error in PyTorch or because the operating system kills the process for using too many resources).
You can use the MONAIAuto3DSeg extension as an example. It launches the auto3dseg_segresnet_inference.py script in a separate Python process.
Note that you would still use Slicer's Python environment; the only change is that the inference runs in a separate process.
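
To make the idea concrete, here is a minimal sketch of launching a computation in a child process using only the Python standard library. It is not the MONAIAuto3DSeg code. The helper names run_isolated and find_python are made up for illustration, and the assumption that Slicer's Python launcher can be found on the PATH as PythonSlicer should be checked against your installation (the sketch falls back to the current interpreter otherwise).

```python
import shutil
import subprocess
import sys


def find_python():
    # Assumption: Slicer's bundled Python launcher is on PATH as "PythonSlicer".
    # Fall back to the interpreter running this code if it is not found.
    return shutil.which("PythonSlicer") or sys.executable


def run_isolated(args, python=None, timeout=None):
    """Run `python <args>` in a child process; return (returncode, output).

    A crash or kill of the child only shows up here as a non-zero
    return code, so the calling application keeps running.
    """
    proc = subprocess.Popen(
        [python or find_python(), *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error messages into the output
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Stop the computation by killing its process, then collect output.
        proc.kill()
        output, _ = proc.communicate()
    return proc.returncode, output
```

You would call it with something like run_isolated(["my_inference.py", "--input", "volume.nrrd"]), where the script name and arguments are hypothetical, and then report a non-zero return code to the user instead of letting the failure take down the application. Note that communicate() blocks until the child exits, so inside a GUI module you would instead poll the process periodically (for example from a timer) to keep the application responsive.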