sacmehta / ESPNetv2

A light-weight, power efficient, and general purpose convolutional neural network
MIT License
447 stars 70 forks

Classes Number maybe not reasonable #11

Closed TianDe11 closed 5 years ago

TianDe11 commented 5 years ago

Hello, authors. First, thanks for sharing. I see that the Cityscapes dataset includes 34 classes, but the segmentation code is written for only 20 classes. Do you transform the label values to 20 classes before training?

sacmehta commented 5 years ago

Please use the Cityscapes code to convert the labels.
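For readers who want to see what that conversion amounts to, here is a minimal sketch using a NumPy lookup table. The id-to-trainId pairs follow the label table in the official cityscapesScripts. The helper names `build_lookup` and `convert_label_map` are hypothetical. The default `ignore_index=19` is an assumption that the 20th class acts as a catch-all; the official scripts use 255 for ignored pixels.

```python
import numpy as np

# Cityscapes label id -> train id for the 19 evaluated classes.
# Every other label id is treated as "ignore".
ID_TO_TRAIN_ID = {
    7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8, 22: 9,
    23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}


def build_lookup(ignore_index=19, num_ids=34):
    """Return an array where lut[label_id] is the training class index.

    ignore_index=19 is an assumption (a 20th catch-all class);
    pass 255 to mimic the official convention instead.
    """
    lut = np.full(num_ids, ignore_index, dtype=np.uint8)
    for label_id, train_id in ID_TO_TRAIN_ID.items():
        lut[label_id] = train_id
    return lut


def convert_label_map(label_ids, lut):
    """Map an integer array of label ids (0..33) to training class indices."""
    return lut[label_ids]


if __name__ == "__main__":
    lut = build_lookup()
    demo = np.array([[7, 0], [33, 24]], dtype=np.uint8)  # road, unlabeled, bicycle, person
    print(convert_label_map(demo, lut))
```

The lookup table makes the conversion a single indexing operation per image, so it can run either once offline or inside the data loader.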

TianDe11 commented 5 years ago

ok...

TianDe11 commented 5 years ago

I now use s=0.5 and 9 classes, with 44000 images in total. The sky and building results are very good, but the person class result is so bad that people sometimes cannot be recognized. Could you give me some advice?

sacmehta commented 5 years ago

How long did you train? Make sure your labels are correct

TianDe11 commented 5 years ago

Each epoch takes a little less than 1 hour, and I trained for 50 epochs. I have checked the labels repeatedly. I find that when s=1.5, person is recognized within the first few epochs, but when s=0.5, it only appears after about 20-30 epochs, and by then the other classes perform worse than they did before person was found. It is a little embarrassing: when the other classes perform well, people cannot be segmented, and when people can be, the other classes do not look very satisfying. The mIoU is around the 40% level.

sacmehta commented 5 years ago

Is your dataset balanced?

TianDe11 commented 5 years ago

There are fewer person instances, and I think the area covered by sky and building is much larger than that of person, but I'm not sure how to solve it. If I add extra augmentation to images containing people, the other classes increase as well.

sacmehta commented 5 years ago

You can compute class-wise weights and pass them to the loss function.

Let us assume that you have 2 classes whose areas in the dataset are 1000 and 50. Class 1 has more area, so the network will be biased towards it. You can take the inverse of these areas to compute class-wise weights, which gives a weight vector with values 0.001 and 0.02. When you pass this vector to the loss function, errors on the class with the smaller pixel area are penalized more.
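The arithmetic above can be sketched in a few lines of plain Python. The function name `inverse_area_weights` and the optional `normalize` flag are hypothetical additions for illustration and are not part of the repository.

```python
def inverse_area_weights(pixel_counts, normalize=False):
    """Return one weight per class, proportional to 1 / pixel area.

    pixel_counts: total number of pixels of each class in the training set.
    normalize: if True, rescale the weights so they sum to 1 (an optional
               extra, not mentioned in the thread).
    """
    if any(count <= 0 for count in pixel_counts):
        raise ValueError("every class needs a positive pixel count")
    weights = [1.0 / count for count in pixel_counts]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


if __name__ == "__main__":
    # The two-class example from the comment: areas 1000 and 50.
    print(inverse_area_weights([1000, 50]))
```

In PyTorch, a vector like this could be converted to a float tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`. Pure inverse weighting can become very aggressive when one class is extremely rare, so the values may need capping or smoothing in practice.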

TianDe11 commented 5 years ago

OK, thanks for your idea.

TianDe11 commented 5 years ago

Hi, I found that there are pretrained models for segmentation, but I get state_dict key errors when loading them.

Unexpected key(s) in state_dict: "module.net.level1.conv.weight", "module.net.level1.bn.weight", "module.net.level1.bn.bias" ......"module.level5.4.bn.weight", "module.level5.4.act.weight"
Missing key(s) in state_dict: "module.level1.conv.weight", "module.level1.bn.running_var", "module.level1.bn.running_mean", "module.level1.bn.bias"......"module.project_l2.act.weight", "module.project_l1.1.conv.weight"

Is there something wrong?

sacmehta commented 5 years ago

You need to first wrap the model inside the DataParallel wrapper and then load the weights, something like this:

import torch
import torch.nn as nn
model = net.EESPNet_Seg(20, s=1.0)
model = nn.DataParallel(model)  # wrapping prefixes every parameter key with "module."
model.load_state_dict(torch.load('../pretrained/espnet.pth'))
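
When a key error like the one reported above appears, it can help to compare the checkpoint keys with the model keys before loading. The sketch below works on plain dictionaries so it runs without PyTorch. The helper names `diff_keys` and `strip_prefix` are hypothetical, and the only library behaviour assumed is that `nn.DataParallel` prefixes parameter names with `module.`.

```python
def diff_keys(checkpoint_keys, model_keys):
    """Return (unexpected, missing) key lists, mirroring the loader's report.

    unexpected: keys present in the checkpoint but not in the model.
    missing:    keys the model expects but the checkpoint lacks.
    """
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return sorted(ckpt - model), sorted(model - ckpt)


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from keys.

    Useful for loading a checkpoint saved from a wrapped model into an
    unwrapped one; keys without the prefix are left unchanged.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


if __name__ == "__main__":
    # Toy key sets shaped like the ones in the error message.
    checkpoint = {"module.level1.conv.weight": 0, "module.level1.bn.bias": 0}
    unwrapped_model_keys = ["level1.conv.weight", "level1.bn.bias"]
    print(diff_keys(checkpoint, unwrapped_model_keys))
    print(diff_keys(strip_prefix(checkpoint), unwrapped_model_keys))
```

With a real model, the two key collections would come from the loaded checkpoint dictionary and from `model.state_dict()`. If the lists still differ after accounting for the `module.` prefix, the checkpoint likely belongs to a different model definition.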

TianDe11 commented 5 years ago

That's right, thank you.