Closed jediofgever closed 3 years ago
Can you show me what batch_x looks like before the crash?
My Colab notebook is here:
https://colab.research.google.com/drive/1Vc5_yXZguGllK-ZQWgJj-4Us19bAyy_Z?usp=sharing
I am not sure what you mean by batch_x.
It seems that DataLoader can't load the data properly. I can correctly see train_dataset.data.pos and train_dataset.data.y.
train_dataset = UnevenGroundDataset(
    root="/opt/uneven_ground_dataset/", transform=None, pre_transform=None
)
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=True, num_workers=1)
DataLoader doesn't return batches properly, so the error complains about an empty tensor when iterating over train_loader, but I cannot see why.
There seems to be an example with no points in it. Is that true?
min_nodes = float('inf')
for data in train_dataset:
    min_nodes = min(min_nodes, data.num_nodes)
print(min_nodes)
No, that script produces:
Data(pos=[4065298, 3], x=[4065298, 3], y=[4065298])
Intializing UnevenGroundDataset dataset
print(min_nodes): 461234
where pos is the point locations, x is the point normals, and y is the label of each point.
Can you test whether it works to first move the required tensors to the CPU before calling radius here?
row, col = radius(pos.cpu(), pos[idx].cpu(), self.r, batch.cpu(), batch[idx].cpu(),
                  max_num_neighbors=64)
row, col = row.cuda(), col.cuda()
The dataset initialization was the problem. I wasn't correctly converting the points to a torch tensor when assigning them to data.pos. Now I can run the training process, but the network isn't learning anything in the first 10 epochs. I have about 3-4 million labeled points.
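For anyone hitting the same symptom, here is a minimal sketch of the kind of conversion described above. It assumes the points arrive as an array-like of shape (N, 3), for example after parsing a .pcd file. The helper name to_pos_tensor is hypothetical and is not taken from the fork.

```python
import numpy as np
import torch


def to_pos_tensor(points):
    """Convert an (N, 3) array-like of XYZ points to a float32 torch tensor.

    Hypothetical helper for illustration; not part of the original dataset code.
    """
    arr = np.asarray(points, dtype=np.float32)
    if arr.ndim != 2 or arr.shape[1] != 3:
        raise ValueError(f"expected shape (N, 3), got {arr.shape}")
    if arr.shape[0] == 0:
        # An empty cloud would later surface as an empty-tensor error
        raise ValueError("point cloud is empty")
    # ascontiguousarray guards against strided views (e.g. column slices)
    return torch.from_numpy(np.ascontiguousarray(arr))


# Random stand-in data instead of a real .pcd file
pos = to_pos_tensor(np.random.rand(1000, 3))
print(pos.shape, pos.dtype)
```

The returned tensor could then be assigned to data.pos. Failing early on an empty or wrongly shaped array makes this kind of initialization bug easier to spot than an error raised deep inside the model.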
I downsampled the cloud and I get 0.92 accuracy at the last epoch in the training phase. However, when testing (with identical data), the network cannot predict anything. The model is overfitting, but is it normal that I get no correct predictions at all on identical data?
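The thread does not say which downsampling method was used. As one possible approach, the sketch below takes a uniform random subset and applies the same indices to pos, x and y so that labels stay aligned with their points. The function name random_downsample and its parameters are assumptions made for illustration.

```python
import torch


def random_downsample(pos, x, y, num_samples, seed=0):
    """Pick the same random subset of points from pos, x and y.

    Illustrative only; the original post does not state how the cloud was reduced.
    """
    n = pos.size(0)
    if num_samples >= n:
        return pos, x, y
    gen = torch.Generator().manual_seed(seed)
    # One shared index tensor keeps positions, normals and labels aligned
    idx = torch.randperm(n, generator=gen)[:num_samples]
    return pos[idx], x[idx], y[idx]


# Stand-in tensors with the same layout as in the thread: pos, normals, labels
pos = torch.rand(10000, 3)
normals = torch.rand(10000, 3)
labels = torch.randint(0, 2, (10000,))
pos_s, normals_s, labels_s = random_downsample(pos, normals, labels, 1000)
print(pos_s.shape, normals_s.shape, labels_s.shape)
```

Using the same seed for the training and test runs gives both the same subset, which matters when comparing accuracy on supposedly identical data.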
Using the training data during inference should also yield 0.92 accuracy. If it does not do so, there might be some differences in the code regarding training and inference computation, e.g., induced by BatchNorm or Dropout.
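A minimal sketch of the train/inference difference mentioned above, using a tiny stand-in network rather than the actual PointNet++ model (the layer sizes and inputs are arbitrary assumptions). It shows that the same inputs give varying outputs in training mode but repeatable outputs once model.eval() is called.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in model; the real segmentation network is not reproduced here
model = nn.Sequential(
    nn.Linear(3, 16),
    nn.BatchNorm1d(16),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(16, 2),
)
inputs = torch.rand(8, 3)

# Training mode: dropout resamples its mask on every call and
# BatchNorm normalizes with the statistics of the current batch
model.train()
train_out_a = model(inputs)
train_out_b = model(inputs)

# Inference mode: dropout is disabled and BatchNorm uses its running statistics
model.eval()
with torch.no_grad():
    eval_out_a = model(inputs)
    eval_out_b = model(inputs)

print("train outputs equal:", torch.allclose(train_out_a, train_out_b))
print("eval outputs equal:", torch.allclose(eval_out_a, eval_out_b))
```

If the test script never calls model.eval(), or the preprocessing differs between the two scripts, accuracy on the same data can diverge sharply from the training figure.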
❓ Questions & Help
Hello,
I am quite new to point cloud learning. I have done some tutorials in pytorch_geometric, but now I have run into something that I can't quite understand, so I would appreciate your help. I have large point cloud maps that I use for robot navigation. The maps are generated and labeled from simulations. I want to train networks to segment drivable and non-drivable regions. I created a Dataset for this purpose on my fork, named uneven_ground_dataset.py.
I also modified pointnet2_segmentation.py.
When I start training, I encounter the following problem:
I don't have a dedicated computer for DL at the moment, so I use a minimal batch size. I searched for possible causes but could not figure out why this happens.
I have a few .pcd files and could provide them if you want to reproduce the issue.
Thank you very much for your time.