Set the torch.multiprocessing start method to 'spawn' (a minimal sketch is shown after the traceback below). Otherwise, the following error will be raised:
```
Megatron-LM/megatron/core/extensions/transformer_engine.py", line 957, in get_cpu_offload_context
    context, sync_func = _get_cpu_offload_context(
  File "/opt/conda/lib/python3.8/site-packages/transformer_engine/pytorch/cpu_offload.py", line 502, in get_cpu_offload_context
    cpu_offload_handler = AsyncDoubleBufferGroupOffloadHandler(
  File "/opt/conda/lib/python3.8/site-packages/transformer_engine/pytorch/cpu_offload.py", line 312, in __init__
    self.d2h_stream = torch.cuda.Stream()
  File "/opt/conda/lib/python3.8/site-packages/torch/cuda/streams.py", line 35, in __new__
    return super().__new__(cls, priority=priority, **kwargs)
RuntimeError: CUDA error: initialization error
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
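This error typically appears when worker processes are created with the default 'fork' start method and inherit the parent's CUDA context, which cannot be re-initialized in the child; 'spawn' starts each worker with a fresh interpreter so CUDA can initialize cleanly. A minimal sketch of setting the start method, assuming your launcher script has a `main()` entry point (a placeholder here, not part of Megatron-LM):

```python
import torch.multiprocessing as mp


def main():
    # Placeholder: launch Megatron-LM / training code from here.
    pass


if __name__ == "__main__":
    # Must be called once in the entry script, before any CUDA work or
    # process creation, so children start fresh instead of inheriting a
    # forked CUDA context.
    mp.set_start_method("spawn", force=True)
    main()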