Closed: nadavmisgav closed this issue 2 years ago
Have you solved this problem? I'm running into the same issue.
I am now able to train on my custom dataset. I changed multiple things and kind of lost track, but I think the main bits were:
- Removing the usage of `cfg.num_workers` and disabling all parallel computing.
- Reducing batch sizes and frequencies.

I am including my diff.txt in `src/*` if you want to check it out.
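In case it helps, here is a minimal sketch of the data-loading side of those changes, assuming a standard PyTorch `DataLoader`; the dataset below is a hypothetical stand-in, not the repo's real dataset class:

```python
# Minimal sketch, assuming a standard PyTorch DataLoader; the dataset
# below is a dummy stand-in for the real segmentation dataset.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical 60-image dataset of 3x224x224 images with binary masks.
dataset = TensorDataset(
    torch.randn(60, 3, 224, 224),
    torch.randint(0, 2, (60, 224, 224)),
)

loader = DataLoader(
    dataset,
    batch_size=4,    # reduced to suit a 60-image dataset
    shuffle=True,
    num_workers=0,   # 0 disables worker subprocesses, so all loading
                     # runs in the main process (no parallelism)
)
```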
Hi, what cluster mIoU did you get on your custom dataset?
Hello there,
I was trying to train on my custom dataset, which currently holds about 60 images for training and 60 for validation, using the following `train_config.yml`.
I have reduced `val_freq` and some other frequencies to match the small dataset, but while running `train_segmentation.py` I encountered the following error, and no checkpoint of the model was saved (the execution completed).
I am using Google Colab to run this training.
Any suggestions on what the problem may be? (I am aware that the dataset is too small for a strong model.)
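For reference, my current understanding of the checkpointing path, in case the issue is simply that training ends before the first checkpoint ever triggers on such a small dataset. This is a minimal sketch, assuming the script uses a PyTorch Lightning `Trainer`; the callback wiring below is my guess, not the repo's exact setup:

```python
# Minimal sketch, assuming the training script builds a
# pytorch_lightning.Trainer; the real script may wire this differently.
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint_cb = ModelCheckpoint(
    dirpath="checkpoints/",   # hypothetical output directory
    save_top_k=-1,            # keep every checkpoint that is written
    every_n_train_steps=10,   # small enough to fire at least once
                              # on a ~60-image dataset
)

trainer = Trainer(
    max_steps=100,            # short run to match the tiny dataset
    callbacks=[checkpoint_cb],
)
# trainer.fit(model, loader)  # model/loader come from the real script
```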