Holmes-GU closed this issue 1 year ago
Hi. I am a little confused why you are seeing 4 channels for xyz. Mine has 3.
Hi, I followed the quick start with 'python3 src/example.py'. In example.py, xyz is processed by ME.utils.batched_coordinates(), so it gains a batch-index column and becomes 4 channels. Besides, 'demo/pc.ply' does not exist; it should be 'demo/owl.ply' instead. Also, this function does not seem to be used in the training file.
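For reference, here is a hedged plain-NumPy sketch of what `ME.utils.batched_coordinates()` does conceptually (the real function also quantizes coordinates to integers; the helper name here is made up for illustration):

```python
import numpy as np

def batched_coordinates_sketch(clouds):
    """Illustrative only: list of (N_i, 3) xyz arrays ->
    (sum N_i, 4) array with a batch-index column prepended."""
    out = []
    for b, xyz in enumerate(clouds):
        idx = np.full((xyz.shape[0], 1), b)   # batch index column
        out.append(np.hstack([idx, xyz]))     # (N_i, 3) -> (N_i, 4)
    return np.vstack(out)

cloud = np.random.rand(5, 3)
coords = batched_coordinates_sketch([cloud])
print(coords.shape)  # (5, 4): batch index + xyz
```

This is why a single (N, 3) cloud comes out with 4 channels after batching.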
Hi, which checkpoint are you using? The example.py is for spconv. If you are using a pointbert checkpoint, some modifications are needed. Sorry for the confusion.
I have changed the backbone to pointbert with model.scaling=4, model.name=PointBert and model.use_dense=True. Would you like to provide me with exact modifications? Thanks.
Hi, please refer to the code:
Basically, to use PointBert, you don't need to process the point cloud with MinkowskiEngine.
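In other words, a dense backbone like PointBert can take the raw point cloud directly. A minimal sketch, assuming per-point xyz plus rgb features (shapes and names here are illustrative, not the repo's exact API):

```python
import numpy as np

# Dense path: no sparse quantization, just stack points into a (B, N, C) array.
xyz = np.random.rand(10000, 3)   # raw point coordinates
rgb = np.random.rand(10000, 3)   # colors in [0, 1]

feats = np.concatenate([xyz, rgb], axis=1)  # (N, 6) per-point features
dense_batch = feats[None, ...]              # add batch dim -> (1, N, 6)
print(dense_batch.shape)  # (1, 10000, 6)
```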
Ok. Thanks for your instructions. I will try it tomorrow~
Hi. Following your instructions, I successfully ran the code.
Sorry to bother you again. I noticed some differences in how the data is processed between training and testing:
Do you mean this line? That is an augmentation, which randomly changes the colors of some shapes to a constant (0.4).
This is due to some inconsistency when preparing the data files. RGB in ObjaverseLVIS and ScanObjectNN is in [0, 1]. ModelNet40 doesn't have colors, so we put 100 in all data files; 'rgb = rgb / 255.0' just normalizes that to roughly 0.4.
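The arithmetic behind the 0.4 convention can be checked directly (a tiny sketch of the normalization described above):

```python
import numpy as np

# ModelNet40 files carry a placeholder value of 100 in the color channels;
# dividing by 255 maps it to ~0.392, i.e. roughly the constant 0.4.
modelnet_rgb = np.full((4, 3), 100.0)  # placeholder colors in the data files
normalized = modelnet_rgb / 255.0      # -> ~0.392
print(normalized[0, 0])
```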
OK. What about this part? Does ScanObjectNNTest have no colors, so they are directly set to a constant (0.4)?
Yes.
OK, thank you very much.
Hello, one more question: why does this place take only x[:, 0] instead of the whole x? Is x[:, 0] the class token, and is the remaining 384 the number of points after aggregation?
Because we pick the first token, which is the CLS token, as the pooler output, as in any transformer encoder architecture (including but not limited to BERT, CLIP ViT, PointBERT).
Hi, thanks for your code. I tried to run 'python src/example.py', but it returns 'RuntimeError: Given groups=1, weight of size [64, 9, 1, 1], expected input[1, 10, 64, 384] to have 9 channels, but got 10 channels instead', coming from self.mlp in 'PointNetSetAbstraction' in models/pointnet_util.py. The cause may lie in the channel count of xyz (4 in example.py). Is there any solution?
Thank you very much.
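Consistent with the discussion above, the off-by-one channel count comes from the batch-index column that batched coordinates carry. A hedged sketch of stripping it (assuming the batch index sits in column 0, which is an assumption about the layout):

```python
import numpy as np

# Batched coordinates: (batch_idx, x, y, z) -> 4 channels instead of 3,
# so a layer built for 3-channel xyz sees one channel too many.
coords = np.random.rand(5000, 4)
xyz = coords[:, 1:]  # drop the batch-index column -> (5000, 3)
print(xyz.shape)  # (5000, 3)
```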