Pointcept / PointTransformerV3

[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
MIT License

Preprocess custom data and run algorithm #34

Open fakurten94 opened 2 months ago

fakurten94 commented 2 months ago

Hello,

I have an aerial LiDAR dataset and wanted to know your recommendation on how to preprocess it. I have seen the indoor and outdoor preprocessing scripts, but all of them assume very different data structures from my own. My dataset is formatted like so:

data_folder
 |   - file1.txt
 |   - file2.txt
 |   - file3.txt
 |   - ...

Each .txt file has 4-12 million points, and each point has x, y, z, label, and intensity values. I normally receive these files without r, g, b values, but they can be added without any issues. Since I don't usually have the color values, would it be necessary to use them to run the algorithm? I could add them without a problem if they are required to run the model.

I was also confused by all the label variables. In my dataset, the labels that appear are among these values [1, 2, 7, 18, 20], but many files may contain only a subset of them. I wasn't quite sure which of them would correspond to class_label, sem_seg_label, and ins_seg_label for my dataset.

Having said all that, what would be the best way to store this type of data? I imagine I could save the processed data into .pth files as in the DefaultDataLoader, but I'm not quite sure since each file has a large number of points. And the structure inside those files would be the same as this, right?

"coord": [x, y, z]
"color": [r, g, b]
"class": class_label
"segment": sem_seg_label
"instance": ins_seg_label

Sorry for all the questions; I just want to understand the details well enough to run your algorithm on my custom dataset!

Thank you so much in advance!

Federico Kurten

Gofinge commented 2 months ago

Hi, here is the response:

would it be necessary to use them to run the algorithm?

No need to include color if it does not exist in your raw data. You can refer to our configs for the outdoor datasets (check the Collect augmentation in dataset.transform).
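For illustration, here is a hedged sketch of what such a Collect step might look like without color. The key names are modeled on the outdoor configs (e.g. nuScenes), where features are built from coordinates plus intensity stored as "strength"; verify the exact keys against the actual config file before use.

```python
# Hypothetical excerpt of a dataset.transform list, modeled on the
# outdoor-dataset configs: no "color" key, and the input features
# come from coordinates and intensity ("strength").
transform = [
    # ... augmentations (rotation, scaling, grid sampling, etc.) ...
    dict(
        type="Collect",
        keys=("coord", "grid_coord", "segment"),
        feat_keys=("coord", "strength"),
    ),
]
```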

I was also confused between all the label variables. In my dataset, all labels that can be found are between this values [1,2,7,18,20] but many of the files may have subsets of those labels.

Sorry, I can't help with that, as I can't infer the meaning of the labels in your data from this information alone.

And the structure inside of those files would be the same as this right?

Almost correct. No need to include "color", and you can store "intensity" under the key name "strength" to fit our codebase.
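A minimal sketch of building that per-file dict, assuming a comma-separated x, y, z, intensity, label text format as described above (the function name and parsing details are illustrative, not part of the codebase; the result would then be written out with `torch.save` as a .pth file):

```python
import numpy as np

def load_points(txt_path):
    """Parse one x,y,z,intensity,label text file into the dict layout
    the codebase expects. Key names follow the maintainer's advice:
    "coord" for xyz, "strength" for intensity, "segment" for the
    semantic label. Save the returned dict with torch.save afterwards."""
    data = np.loadtxt(txt_path, delimiter=",")
    return {
        "coord": data[:, 0:3].astype(np.float32),     # x, y, z
        "strength": data[:, 3:4].astype(np.float32),  # raw intensity
        "segment": data[:, 4].astype(np.int64),       # semantic label
    }
```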

fakurten94 commented 2 months ago

@Gofinge thanks so much for your response! That is great news; I'll adapt one of the outdoor dataset configs so it doesn't use color and stores intensity as strength. I can also imagine that I would need to normalize or scale intensity to a specific range, right? Intensity can have large integer values that can be > 10000.

My bad for not explaining myself correctly. I was just confused about how to store class_label, sem_seg_label, and ins_seg_label for my dataset, as I don't know the difference between those variables. A typical .txt file could have the following structure:

x,y,z,intensity,label
1.44489182e+06, 7.56788840e+05, 4.95969000e+03, 1.80740000e+04, 1.00000000e+00
1.44489280e+06, 7.56788930e+05, 4.96159000e+03, 1.64000000e+04, 1.00000000e+00
1.44489525e+06, 7.56789190e+05, 4.95890000e+03, 2.47670000e+04, 2.00000000e+00
1.44489858e+06, 7.56789520e+05, 4.95844000e+03, 3.68160000e+04, 2.00000000e+00
.
.
.

Thank you!

Gofinge commented 2 months ago

I can also imagine that I would need to normalize or scale intensity to be between a specific range of values right?

Yes. Scale it to [0, 1].
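A minimal sketch of that scaling step. Using the sensor's known full-scale value (65535 here is an illustrative assumption for 16-bit returns, not something prescribed in this thread) keeps the scale consistent across files; falling back to the per-file maximum is a cruder alternative.

```python
import numpy as np

def scale_intensity(intensity, max_value=None):
    """Scale raw intensity values to [0, 1].

    If the sensor's full-scale value is known (e.g. 65535 for 16-bit
    returns), pass it as max_value so all files share one scale;
    otherwise fall back to this file's own maximum."""
    intensity = intensity.astype(np.float32)
    if max_value is None:
        max_value = intensity.max()
    return np.clip(intensity / max_value, 0.0, 1.0)
```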

I was just confused about how to store class_label, sem_seg_label, and ins_seg_label for my dataset, as I don't know the difference between those variables.

I think you can refer to our preprocessing code (e.g., for ScanNet). The key is to decouple each component (coordinates, features, labels) into its own array instead of integrating them into one single 2D array.
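One hedged way to handle the label component for the sparse ids [1, 2, 7, 18, 20] mentioned earlier: remap them to contiguous ids in [0, num_classes) and mark anything else with an ignore index. This remapping convention is common practice for segmentation losses, not something the maintainers prescribed here.

```python
import numpy as np

# Raw label ids from the user's data; the contiguous remapping and
# the ignore_index convention below are assumptions, mirroring what
# dataset-specific preprocessing scripts typically do.
RAW_IDS = [1, 2, 7, 18, 20]
ID_MAP = {raw: i for i, raw in enumerate(RAW_IDS)}

def remap_labels(labels, ignore_index=-1):
    """Map raw label ids to contiguous [0, num_classes).

    Ids not listed in RAW_IDS are mapped to ignore_index so the loss
    can skip them."""
    out = np.full(labels.shape, ignore_index, dtype=np.int64)
    for raw, new in ID_MAP.items():
        out[labels == raw] = new
    return out
```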