Unable to train the network

xmba15 / rail_marking

proof-of-concept program that detects rail-track with semantic segmentation for autonomous train system

MIT License


Unable to train the network #2

Open AmirAliEbrahimi opened 3 years ago

AmirAliEbrahimi commented 3 years ago

Hi, thank you for your awesome project. I downloaded the dataset and tried to train the network myself using the train script, but I encountered this error:

File "/export/tmp/ebrahimi/rail_marking/scripts/segmentation/./../../rail_marking/segmentation/models/ohem_ce_loss.py", line 29, in forward
    loss_hard = loss[loss > self.thresh.to(device)]
RuntimeError: CUDA error: an illegal memory access was encountered

Also, for the dataset, I merged all of the JPEGs, PNGs, and JSONs into one folder and set it as the --data_path argument of the script. Is that OK?

xmba15 commented 3 years ago

@AmirAliEbrahimi Can you please clarify how many label classes you are training? For this repo, I already modified the original dataset into a new one with only 3 classes.

If you train with a different number of classes, you need to create a new cfg file in the cfg directory and change the num_classes value.
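A class-count mismatch is one plausible cause of the error reported above: if a ground-truth mask contains a label ID that is not below num_classes, indexing inside a CUDA loss kernel can go out of bounds and surface as an illegal memory access. The following is a minimal sketch of a sanity check to run on mask pixels before training. The function name and the ignore_index value of 255 are assumptions for illustration, not part of the rail_marking code.

```python
def find_invalid_labels(label_values, num_classes, ignore_index=255):
    """Return the sorted label IDs that fall outside [0, num_classes).

    `ignore_index` (assumed to be 255 here) is skipped, since many
    segmentation losses are configured to ignore one "unlabeled" ID.
    """
    invalid = set()
    for value in label_values:
        if value == ignore_index:
            continue
        if not 0 <= value < num_classes:
            invalid.add(value)
    return sorted(invalid)


# Toy mask flattened to a list of pixel labels (made-up data).
mask_pixels = [0, 1, 2, 2, 7, 18, 255]
print(find_invalid_labels(mask_pixels, num_classes=3))  # prints [7, 18]
```

An empty result means every labeled pixel fits the configured class count. Any returned ID indicates that either num_classes or the masks need to change.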

xmba15 commented 3 years ago

Here is the logic for the data loader. In my dataset, the images are in jpg format and the ground truths are in png format only, so I differentiate them by format. https://github.com/xmba15/rail_marking/blob/master/rail_marking/segmentation/data_loader/railsem_mask_dataset.py#L51-L55

If your dataset is organized differently, you need to modify the data loader accordingly.
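The extension-based split described above can be sketched roughly as follows. This is not the repository's actual loader: the function name is hypothetical, and pairing an image with its mask by shared file stem is an assumption about how the files are named.

```python
from pathlib import Path


def pair_images_and_masks(data_dir):
    """Pair each .jpg image with the .png mask that shares its file stem.

    Files with other extensions (for example .json) are ignored, and
    images without a matching mask are dropped.
    """
    data_dir = Path(data_dir)
    images = {}
    masks = {}
    for path in data_dir.iterdir():
        suffix = path.suffix.lower()
        if suffix == ".jpg":
            images[path.stem] = path
        elif suffix == ".png":
            masks[path.stem] = path
    # Keep only stems present on both sides, in a stable order.
    common = sorted(images.keys() & masks.keys())
    return [(images[stem], masks[stem]) for stem in common]
```

With this kind of logic, a merged folder of jpg, png, and json files would still work, as long as every png in the folder is a ground-truth mask and not another image.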

AmirAliEbrahimi commented 3 years ago

Thanks for the reply. Currently I am using the original RailSem19 and trying to train with all the classes, so I will try a new cfg file. For the ground truths, should I use the 8UC1 label map images provided by the dataset, or the images annotated by the JSON files?

xmba15 commented 3 years ago

@AmirAliEbrahimi Sorry for the late reply. The ground truth should be the 8UC1 label map. Please try it; if you still have problems with the training, maybe I will add scripts to train on the original (unmodified) dataset.

AmirAliEbrahimi commented 3 years ago

@xmba15 Thank you for your response. I would appreciate it if you could add the scripts to train on the original dataset.

lmcggg commented 1 year ago

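Thanks for the reply. Currently I am using the original RailSem19 and trying to train with all the classes, so I will try a new cfg file. For the ground truths, should I use the 8UC1 label map images provided by the dataset, or the images annotated by the JSON files?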

My dear friend, I am so sorry to disturb you, but I am curious whether you have finished the training. I would be honored if I could learn from your work.

Zyjhubei commented 1 year ago

@AmirAliEbrahimi Can you please clarify how many label classes you are training? For this repo, I already modified the original dataset into a new one with only 3 classes.

If you train with a different number of classes, you need to create a new cfg file in the cfg directory and change the num_classes value.

Hi, I am curious how to modify the original dataset into the new one with 3 classes. I would be very honored if you replied.
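The thread does not say how the 3-class dataset was produced. One common approach to collapsing a label map is a lookup from original label IDs to new ones, with everything else sent to a background class. The sketch below illustrates that idea only: the function name and the source IDs 5 and 9 are placeholders, not the mapping used in this repo and not real RailSem19 class IDs.

```python
# Hypothetical mapping: original label ID -> new label ID.
# The IDs 5 and 9 are placeholders, NOT real RailSem19 class IDs.
EXAMPLE_MAPPING = {5: 1, 9: 2}


def remap_labels(pixels, mapping, background=0):
    """Map each original label ID to its new ID.

    Any ID missing from `mapping` (including an "unlabeled" value such
    as 255) is assigned to `background`.
    """
    return [mapping.get(value, background) for value in pixels]


# Toy row of mask pixels (made-up data).
row = [0, 5, 5, 9, 12, 255]
print(remap_labels(row, EXAMPLE_MAPPING))  # prints [0, 1, 1, 2, 0, 0]
```

After a remap like this, the masks contain only the IDs 0, 1, and 2, which matches a cfg with num_classes set to 3.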

AmirAliEbrahimi commented 1 year ago

@lmcggg @Zyjhubei

Hi, unfortunately I didn't modify the dataset or the code at that time, and I no longer work on this project. Sorry about that.