Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License

Suggestions for improving training speed (especially when input data is large) #48

Closed · Robotwithsoul closed this issue 5 years ago

Robotwithsoul commented 5 years ago

Hi, in memory.py I suggest changing your code a little, which should help improve the training speed (especially when the input image data is 3D).

```python
import timeit

import numpy as np
import torch

device = torch.device('cuda')

# Build two identical lists of int tensors, one for each conversion order
T, T1 = [], []

for i in range(0, 4):
    A = np.zeros((100, 100, 100), dtype=np.int64)
    B = torch.tensor(A, dtype=torch.int)
    T.append(B)

for i in range(0, 4):
    A = np.zeros((100, 100, 100), dtype=np.int64)
    B = torch.tensor(A, dtype=torch.int)
    T1.append(B)
```

This line is used for initialization (a warm-up run before timing):

```python
M = torch.stack(T).to(dtype=torch.float32, device=device).div_(255)
```

Comparison

```python
time_a = timeit.default_timer()
# Original: dtype and device changed in a single .to() call
M = torch.stack(T).to(dtype=torch.float32, device=device).div_(255)
time_b = timeit.default_timer()
# Suggested: move to the device first, then change the dtype
N = torch.stack(T1).to(device=device).to(dtype=torch.float32).div_(255)
time_c = timeit.default_timer()

print("time1 is: {}\ntime2 is: {}".format(time_b - time_a, time_c - time_b))
```

Kaixhin commented 5 years ago

How odd - do you have any idea why this is the case? Thanks a lot for the script for quick testing - can confirm that I am indeed seeing a speedup (not so significant for smaller tensors, but sure). Feel free to submit a PR for this change!

Robotwithsoul commented 5 years ago

I'm not sure, but I guess it's because the following code performs the data type conversion on the CPU:

```python
.to(dtype=torch.float32, device=device)
```

while the suggested code performs the data type conversion on the GPU:

```python
.to(device=device).to(dtype=torch.float32)
```
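In memory.py this would apply wherever the sampled observations are stacked and converted to float for the network. A rough sketch of the pattern, assuming the replay memory holds integer (e.g. uint8) frame tensors; the function and variable names below are illustrative, not the actual memory.py code:

```python
def states_to_float(frames, device):
    # Copy the raw integer frames to the GPU first, then cast and normalise there.
    # The cast runs on the GPU, and if the frames are stored as uint8 the
    # host-to-device copy also moves 4x fewer bytes than a float32 copy would.
    return torch.stack(frames).to(device=device).to(dtype=torch.float32).div_(255)
```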