Closed TrinhQuocNguyen closed 4 years ago
I have modified the source code and can now load any huge dataset. Thank you.
Hello, may I ask which part of the source code you modified? How did you construct your large dataset? I hope to hear back from you. @TrinhQuocNguyen
Hi Trinh,
I am observing the same issue when training and testing on the original data. Testing on the original test set completes with 300 GB of swap memory, but training gets killed after several epochs under this setting.
Is it possible to share your way of solving it?
Thanks!
Hi,
The issue occurs when loading the dataset: the entire dataset is loaded into memory at once, and that causes the crash.
self.A_paths = make_dataset(self.dir_A)
self.B_paths = make_dataset(self.dir_B)
# self.A_imgs, self.A_paths = store_dataset(self.dir_A)
# self.B_imgs, self.B_paths = store_dataset(self.dir_B)
Try using the make_dataset function instead of store_dataset, which loads the entire dataset into memory at once. make_dataset only loads the paths to the images, so you also have to modify __getitem__ to load each image from its path:
A_path = self.A_paths[index % self.A_size]
B_path = self.B_paths[index % self.B_size]
A_img = Image.open(A_path).convert('RGB')
B_img = Image.open(B_path).convert('RGB')
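To make the lazy-loading pattern above concrete, here is a minimal, self-contained sketch. The names make_dataset, A_paths, and the dict returned by __getitem__ are assumptions mirroring the snippets in this thread; the image loader is simplified to reading raw bytes so the example runs without Pillow, whereas the real code would use Image.open(A_path).convert('RGB').

```python
# Sketch of the fix described above: keep only image *paths* in RAM,
# and read pixel data lazily inside __getitem__.
import os
import tempfile

IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')

def make_dataset(root):
    """Collect image paths only -- memory grows with the number of
    file names, not with the total pixel data."""
    paths = []
    for dirpath, _, fnames in sorted(os.walk(root)):
        for fname in sorted(fnames):
            if fname.lower().endswith(IMG_EXTENSIONS):
                paths.append(os.path.join(dirpath, fname))
    return paths

class LazyDataset:
    def __init__(self, dir_A):
        # Only the path list lives in memory for the whole run.
        self.A_paths = make_dataset(dir_A)
        self.A_size = len(self.A_paths)

    def __len__(self):
        return self.A_size

    def __getitem__(self, index):
        A_path = self.A_paths[index % self.A_size]
        # Real code: A_img = Image.open(A_path).convert('RGB')
        with open(A_path, 'rb') as f:
            A_img = f.read()
        return {'A': A_img, 'A_path': A_path}

# Tiny demo on a temporary directory with two fake "images".
with tempfile.TemporaryDirectory() as root:
    for name in ('a.png', 'b.jpg'):
        with open(os.path.join(root, name), 'wb') as f:
            f.write(b'\x89fake')
    ds = LazyDataset(root)
    n = len(ds)        # 2
    item = ds[3]       # index 3 % 2 == 1 -> b.jpg

print(n)
print(item['A_path'].endswith('b.jpg'))
```

Because only paths are stored, peak RAM no longer depends on dataset size; each worker reads one image per __getitem__ call and frees it after the batch is consumed.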
Thanks
Hi TAMU-VITA, thank you for the awesome work. I tried to train the model on a larger dataset, but RAM usage kept increasing while the data was loading, and as a consequence the process got killed.
Is there any place in the dataloader class that we can modify to avoid this problem?
Thank you.