This bug came up when I tried to train ColBERT on a custom dataset. Training was taking forever, so I tried to diagnose the problem. It seems that ColBERT uses torch.multiprocessing to divide tasks, but whenever a queue is created to collect the results, the code gets stuck on the get() method.
Sample code to reproduce the problem:
import torch
import torch.multiprocessing as mp

try:
    mp.set_start_method('spawn', force=True)
except RuntimeError:
    print('Hello')

return_value_queue = mp.Queue()
# return_values = sorted([return_value_queue.get() for _ in all_procs])  # the ColBERT code gets stuck here
print(return_value_queue.get())  # hangs here; this call reproduces the problem
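One thing worth ruling out: in the snippet as written, nothing is ever put on the queue, and `get()` with no timeout blocks until an item arrives, so this snippet would hang regardless of any bug. Whether that is also what happens inside ColBERT is not established here. The sketch below is only a diagnostic aid: it uses the standard-library `multiprocessing` module (which `torch.multiprocessing` wraps, exposing the same `Queue` API) and a hypothetical helper, `get_or_default`, that gives up after a timeout instead of blocking forever.

```python
import multiprocessing as mp  # torch.multiprocessing wraps this module and exposes the same Queue API
import queue  # provides the queue.Empty exception raised when get() times out


def get_or_default(q, timeout_s=1.0, default=None):
    """Hypothetical helper: return the next item, or `default` if none arrives in time."""
    try:
        return q.get(timeout=timeout_s)
    except queue.Empty:
        return default


if __name__ == "__main__":
    q = mp.Queue()
    # Nothing has been put on the queue yet, so a bare q.get() would block here.
    print(get_or_default(q, timeout_s=0.5, default="timed out"))
    # Once an item has been put, get() returns it.
    q.put(42)
    print(get_or_default(q, timeout_s=5.0, default="timed out"))
```

If the timeout fires in the real training run, that would suggest the worker processes never reach their `put()` call (for example because they crashed or are still running), which narrows down where to look next.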
### Versions
torch version = 1.13.1+cu117
The hang occurs during training.