charlesq34 / pointnet2

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

scannet_dataset.py #34

Open stephencuinujia opened 6 years ago

stephencuinujia commented 6 years ago

In the data preparation stage, what does the bolded part of the code below do, and where do those parameter values (1.5, 0.2, etc.) come from?

Thank you very much for the help!

```python
class ScannetDatasetWholeScene():
    def __init__(self, root, npoints=8192, split='train'):
        self.npoints = npoints
        self.root = root
        self.split = split
        self.data_filename = os.path.join(self.root, 'scannet_%s.pickle'%(split))
        with open(self.data_filename,'rb') as fp:
            self.scene_points_list = pickle.load(fp)
            self.semantic_labels_list = pickle.load(fp)
        if split=='train':
            labelweights = np.zeros(21)
            for seg in self.semantic_labels_list:
                tmp,_ = np.histogram(seg,range(22))
                labelweights += tmp
            labelweights = labelweights.astype(np.float32)
            labelweights = labelweights/np.sum(labelweights)
            self.labelweights = 1/np.log(1.2+labelweights)
        elif split=='test':
            self.labelweights = np.ones(21)
    def __getitem__(self, index):
        point_set_ini = self.scene_points_list[index]
        semantic_seg_ini = self.semantic_labels_list[index].astype(np.int32)
        coordmax = np.max(point_set_ini,axis=0)
        coordmin = np.min(point_set_ini,axis=0)
        # ---- bolded region starts here ----
        nsubvolume_x = np.ceil((coordmax[0]-coordmin[0])/1.5).astype(np.int32)
        nsubvolume_y = np.ceil((coordmax[1]-coordmin[1])/1.5).astype(np.int32)
        point_sets = list()
        semantic_segs = list()
        sample_weights = list()
        isvalid = False
        for i in range(nsubvolume_x):
            for j in range(nsubvolume_y):
                curmin = coordmin+[i*1.5,j*1.5,0]
                curmax = coordmin+[(i+1)*1.5,(j+1)*1.5,coordmax[2]-coordmin[2]]
                curchoice = np.sum((point_set_ini>=(curmin-0.2))*(point_set_ini<=(curmax+0.2)),axis=1)==3
                cur_point_set = point_set_ini[curchoice,:]
                cur_semantic_seg = semantic_seg_ini[curchoice]
                if len(cur_semantic_seg)==0:
                    continue
                mask = np.sum((cur_point_set>=(curmin-0.001))*(cur_point_set<=(curmax+0.001)),axis=1)==3
                choice = np.random.choice(len(cur_semantic_seg), self.npoints, replace=True)
                point_set = cur_point_set[choice,:] # Nx3
                semantic_seg = cur_semantic_seg[choice] # N
                mask = mask[choice]
                if sum(mask)/float(len(mask))<0.01:
                    continue
                sample_weight = self.labelweights[semantic_seg]
                sample_weight *= mask # N
                point_sets.append(np.expand_dims(point_set,0)) # 1xNx3
                semantic_segs.append(np.expand_dims(semantic_seg,0)) # 1xN
                sample_weights.append(np.expand_dims(sample_weight,0)) # 1xN
        point_sets = np.concatenate(tuple(point_sets),axis=0)
        # ---- bolded region ends here ----
```

oafolabi commented 6 years ago

__init__ is the constructor for the class. According to https://micropyramid.com/blog/understand-self-and-__init__-method-in-python-class/ :

"init" is a reseved method in python classes. It is known as a constructor in object oriented concepts. This method called when an object is created from the class and it allow the class to initialize the attributes of a class. It gets called when you do create an object in that class, so in our case when you do something like: https://github.com/charlesq34/pointnet2/blob/7961e26e31d0ba5a72020635cee03aac5d0e754a/scannet/train.py#L72

It pretty much just reads the data file to get the scene points and labels. It also builds a histogram over all the classes, telling us the number of points in each class. This histogram is used to create weights that may be used to ameliorate class imbalance issues during training by weighting the loss per class.
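If it helps, here is a minimal sketch of that weighting step. The variable names mirror the snippet above; the two toy "scenes" and their label arrays are made up by me:

```python
import numpy as np

# Toy stand-in for self.semantic_labels_list: two "scenes" with labels in [0, 20].
semantic_labels_list = [np.array([0, 0, 1, 2, 2, 2]), np.array([2, 2, 2, 5])]

labelweights = np.zeros(21)
for seg in semantic_labels_list:
    tmp, _ = np.histogram(seg, range(22))  # per-class point counts for this scene
    labelweights += tmp

labelweights = labelweights.astype(np.float32)
labelweights = labelweights / np.sum(labelweights)  # fraction of all points per class
labelweights = 1 / np.log(1.2 + labelweights)       # rarer class -> larger weight

print(labelweights[2])  # class 2 covers 6/10 of the points -> ~1.7
print(labelweights[5])  # class 5 covers 1/10 of the points -> ~3.8
```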

__getitem__ is another special Python method, used for indexing. I'm not a Python expert, but I believe it is called when we try to access an element of the object, just like an array. So, for example, when we do something like https://github.com/charlesq34/pointnet2/blob/7961e26e31d0ba5a72020635cee03aac5d0e754a/scannet/train.py#L358 we intend to get the data at index "batch_index".
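If the __getitem__ part is unfamiliar, here is a tiny made-up class (not from the repo) showing that square-bracket indexing calls __getitem__:

```python
class TinyDataset:
    def __init__(self, items):
        self.items = items            # runs once, when the object is constructed

    def __getitem__(self, index):
        return self.items[index]      # runs every time the object is indexed with []

ds = TinyDataset(['scene_a', 'scene_b', 'scene_c'])
print(ds[1])                          # 'scene_b' -- equivalent to ds.__getitem__(1)
```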

In this case, the code breaks the scene at the desired index into blocks with a 1.5 x 1.5 footprint in the x-y plane. Points are then sampled from each block, and the collection of all blocks is returned (see the small worked example below). Please refer to the ScanNet experiment details in the supplementary material of the paper for more information; this is where the 1.5 comes from.
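As a quick worked example of that gridding (the bounding-box numbers are made up; 1.5 is the block size from the code):

```python
import numpy as np

# Made-up scene bounding box: 4.0 m in x, 2.2 m in y, 2.8 m in z.
coordmin = np.array([0.0, 0.0, 0.0])
coordmax = np.array([4.0, 2.2, 2.8])

nsubvolume_x = np.ceil((coordmax[0] - coordmin[0]) / 1.5).astype(np.int32)  # ceil(4.0/1.5) = 3
nsubvolume_y = np.ceil((coordmax[1] - coordmin[1]) / 1.5).astype(np.int32)  # ceil(2.2/1.5) = 2

# 3 x 2 = 6 columns, each 1.5 x 1.5 in x-y and spanning the full scene height;
# npoints (8192 by default) points are sampled from each non-empty column.
print(nsubvolume_x, nsubvolume_y)
```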

As to where 0.2 comes from, that may have just been based on trial and error / intuition about what works best.
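For what it's worth, my reading of the snippet is that the 0.2 acts as context padding: points within 0.2 of the block boundary are included as input, but only points essentially inside the block (the 0.001 mask) keep a non-zero weight. A small sketch of that selection, reduced to 1-D with made-up coordinates:

```python
import numpy as np

x = np.array([0.10, 1.40, 1.55, 1.90])    # point x-coordinates (made up)
curmin, curmax = 0.0, 1.5                 # one 1.5 m block along x

curchoice = (x >= curmin - 0.2) & (x <= curmax + 0.2)      # input points, with 0.2 context padding
mask      = (x >= curmin - 0.001) & (x <= curmax + 0.001)  # points that really belong to the block

print(curchoice)  # [ True  True  True False] -> 1.55 is fed in as extra context
print(mask)       # [ True  True False False] -> but only points inside the block keep weight
```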

Hopefully that helps. Please remember to close the issue if that solves your problem.

Coastline2018 commented 6 years ago

@oafolabi Do you know anything about "smpw", which is one of the labels in the ScanNet dataset? It seems to be a kind of weight, but I don't know what it really means or how to produce it if I need to make a dataset myself. Can you give me some tips?

oaa3wf commented 6 years ago

@Coastline2018, as far as I can tell, "smpw" (sample weight) is very similar to labelweights, generated here. The lines below the link also tell you how to generate it. The label weights seem to be inversely proportional to the percentage of the sample's class in the whole dataset.

The function seems to be labelweights = 1 / ln(x + 1.2), where x is the fraction (number of samples of that class / total number of samples) of the label class of the point of interest. My guess is that the 1.2 offset keeps ln(x + 1.2) strictly positive even when x = 0, which both avoids a divide-by-zero and guarantees positive label weights, but this is just a guess.
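To make the shape of that function concrete (these are just evaluations of the formula at a few made-up fractions):

```python
import numpy as np

for x in [0.0, 0.01, 0.1, 0.5, 1.0]:       # x = fraction of points belonging to the class
    print(x, 1 / np.log(1.2 + x))
# x = 0.0 -> ~5.48 (the rarest classes get the largest weight)
# x = 1.0 -> ~1.27 (a class covering every point would get the smallest weight)
```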

The only difference between "smpw" and labelweights that I've found so far is that smpw gets set to zero when you don't want a sample to count in the loss. Say, for example, you have repeated points in the input to the network and don't want to count the repeated samples in the loss (as was done here).
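A minimal illustration of that zeroing, with made-up arrays but the same variable names as scannet_dataset.py:

```python
import numpy as np

labelweights = np.ones(21, dtype=np.float32)   # pretend every class has weight 1
semantic_seg = np.array([3, 3, 7, 7])          # labels of four sampled points
mask = np.array([True, True, False, False])    # False = point should not count in the loss

sample_weight = labelweights[semantic_seg]     # per-point class weight
sample_weight *= mask                          # zero out the unwanted points
print(sample_weight)                           # [1. 1. 0. 0.]
```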

It also gets set to 1 during test time, which makes sense.

This function shows how it gets used in the loss.
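Conceptually, the per-point loss just gets multiplied by smpw before it is reduced. A NumPy sketch of that idea (this is my simplification, not the repo's TensorFlow code, and the averaging convention here is an assumption):

```python
import numpy as np

def weighted_point_loss(per_point_loss, smpw):
    # per_point_loss: cross-entropy already computed per point, shape (N,)
    # smpw: per-point weights, shape (N,); 0 removes a point from the loss entirely
    return np.sum(per_point_loss * smpw) / max(np.sum(smpw), 1e-6)

per_point_loss = np.array([2.0, 1.0, 3.0, 0.5])
smpw = np.array([1.0, 1.0, 0.0, 0.0])             # last two points are ignored
print(weighted_point_loss(per_point_loss, smpw))  # (2.0 + 1.0) / 2 = 1.5
```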

Hope that helps. Disclaimer: Please note that I could be wrong. I am not an author of the paper and I do not know the authors personally. What I have described is my understanding from looking at the code.

Coastline2018 commented 6 years ago

@oaa3wf Thank you very much! It seems that I don't need to produce the "smpw" when I make the dataset.

XueDoudou commented 4 years ago

@oaa3wf, @Coastline2018 Hello, can you help me with this? When I run the ScanNet training, I get this problem:

```
**** EPOCH 000 ****
2020-06-10 16:42:52.517771
 -- 010 / 037 --
mean loss: 12.142558
accuracy: 0.076200
 -- 020 / 037 --
mean loss: 10.209071
accuracy: 0.245123
 -- 030 / 037 --
mean loss: 9.714067
accuracy: 0.372885
2020-06-10 16:43:38.244392
---- EPOCH 000 EVALUATION ----
Traceback (most recent call last):
  File "train.py", line 436, in <module>
    train()
  File "train.py", line 173, in train
    acc = eval_one_epoch(sess, ops, test_writer)
  File "train.py", line 305, in eval_one_epoch
    _, uvlabel, _ = pc_util.point_cloud_label_to_surface_voxel_label_fast(aug_data[b,batch_smpw[b,:]>0,:], np.concatenate((np.expand_dims(batch_label[b,batch_smpw[b,:]>0],1),np.expand_dims(pred_val[b,batch_smpw[b,:]>0],1)),axis=1), res=0.02)
  File "/home/xdd/文档/pointNet/pointnet2-master/scannet/pc_util.py", line 40, in point_cloud_label_to_surface_voxel_label_fast
    coordmax = np.max(point_cloud,axis=0)
  File "/home/xdd/.conda/envs/tfdd_copy/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2505, in amax
    initial=initial)
  File "/home/xdd/.conda/envs/tfdd_copy/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation maximum which has no identity
Segmentation fault (core dumped)
```

The crash happens as soon as the code starts the EPOCH 000 EVALUATION. Do you know why?