Closed JeppeVHolm closed 10 months ago
Hi, I am super happy someone is trying to integrate SensatUrban in the framework, I was hoping to do it at some point. I would gladly welcome a PR in the end once you have things working !
Normally, if you followed the existing code's style, you should only have to create the following:
- `SensatUrbanDataModule`, inheriting from `BaseDataModule`, in `src/datamodules/sensaturban.py`
- `SensatUrbanDataset`, inheriting from `BaseDataset`, in `src/datasets/sensaturban.py`, along with its reader function to parse raw data files
- `src/datasets/sensaturban_config.py`
- `config/datamodules/sensaturban.yaml`
Is that what you have ? Have you made any other modifications to the project ?
In particular, can you please share the code for:
Since you checked #15, I am assuming you checked that your semantic labels are dense in [0, N[ and that you only use the N label for unclassified/unlabeled/ignored points ?
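A quick way to verify this on raw labels, before any preprocessing, is sketched below. This is only an illustration: `check_labels` is a hypothetical helper that is not part of the project, and it uses NumPy on a raw per-point label array, assuming valid classes are 0 to N-1 and N is reserved for ignored points.

```python
import numpy as np


def check_labels(labels, num_classes):
    """Return a list of problems found in a raw per-point label array.

    Hypothetical helper (not part of the project). Assumes valid classes
    are 0..num_classes-1 and that num_classes itself is the only value
    used for unlabeled/ignored points.
    """
    labels = np.asarray(labels)
    problems = []

    # Labels are expected as a flat array with one value per point
    if labels.ndim != 1:
        problems.append(f"expected 1D labels, got shape {labels.shape}")
        labels = labels.reshape(-1)

    # Float labels usually hint at a parsing issue in the reader
    if not np.issubdtype(labels.dtype, np.integer):
        problems.append(f"expected integer labels, got dtype {labels.dtype}")
        return problems

    # Anything below 0 or above num_classes is unexpected
    out_of_range = (labels < 0) | (labels > num_classes)
    if out_of_range.any():
        bad = np.unique(labels[out_of_range]).tolist()
        problems.append(f"labels outside [0, {num_classes}]: {bad}")

    return problems


print(check_labels([0, 3, 12, 13], num_classes=13))
print(check_labels([0, 14, -1], num_classes=13))
```

Running this on the labels of every tile right after reading it should tell you quickly whether the problem comes from the raw data or from a later step.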
Hi, another student from the same project group here.
Thank you so much for your response!
We didn't take our starting point in your base files as we are still newbies with regard to coding. Instead, we copied each of the files you mentioned from Dales and adjusted these to Sensat.
Besides the files you mentioned, we also added a line to src/datasets/__init__.py, as you can see:
We have pushed our work at its current state to the following repository, where you can check our files: https://github.com/cell1604/superpoint_transformer_P9/tree/spt_sensaturban
With regard to #15: We went through all data files from SensatUrban to make sure they are annotated correctly. In this process, we found errors in 3 files that prevented us from opening them, so we tried to remove these from the training data (didn't work). Furthermore, we discovered that the data in "test" are not annotated and hence have no semantic labels (we assume that the test data should also have labels?), so we also tried replacing the test data with annotated data. This didn't resolve the issue either.
As you may know, Sensat has 13 classes, with labels in [0, 12], but does not have a class specifically for "Unknown" points. We have experimented back and forth with adding and removing such a class, trying to figure out whether it is a requirement in SPT. However, we have not had any luck regardless of what we've tried. Maybe you can clarify how to configure this correctly?
In #15, you suggested adding the following line to data/data.py, I figure in order to check for empty tensors or something like that:
assert a.sum(dim=1).gt(0).all(), f"Some points in the label histogram 'self.y' do not have any labels."
However, we are not quite sure how to use this, as it seems to require "a" to be a defined tensor of some sort?
Looking forward to hearing back from you! Regards, Cellina
> we are still newbies with regard to coding.
I must warn you that this project is not the easiest to get started with. Making modifications to the project as a whole will require that you are proficient in machine learning in general and 3D deep learning in particular, and at ease with the following: python, torch, torch-lightning, torch-geometric, hydra. I will try to give you pointers to help you set up SensatUrban, but I won't be able to provide detailed support for things that I did not code and release myself.
> we copied each of the files you mentioned from Dales and adjusted these to Sensat.
That is fine, it is a good starting point for SensatUrban.
> we found errors in 3 files that prevented us from opening them, so we tried to remove these from the training data (didn't work).
This is quite strange for an officially-released dataset like SensatUrban. Have you checked whether the files are corrupt, contain weird characters, etc ? Maybe the official github repo, or the repos of other people using that dataset, can tell you whether someone else encountered this issue.
> Furthermore, we discovered that the data in "test" are not annotated and hence have no semantic labels (we assume that the test data should also have labels?), so we also tried replacing the test data with annotated data. This didn't resolve the issue either.
This is normal: SensatUrban uses the test files to evaluate methods through an official benchmarking server. The test labels are held out, so you must define a validation set for your own experiments and only use the test set for submitting your predictions to the SensatUrban server. See the KITTI-360 dataset to get an idea of how to deal with train/val/test splits with held-out test labels.
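As a rough illustration of carving a validation set out of the labeled tiles, here is a minimal sketch. The tile names and the `split_train_val` helper are hypothetical, and a random split is only one option: if the dataset or its official repository already defines a validation split, that one is probably preferable.

```python
import random


def split_train_val(tile_names, val_fraction=0.2, seed=0):
    """Deterministically split labeled tiles into train and val lists.

    Hypothetical helper: the unlabeled test tiles are left untouched and
    are only used to produce predictions for the benchmark server.
    """
    tiles = sorted(tile_names)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(tiles)
    num_val = max(1, round(len(tiles) * val_fraction))
    return sorted(tiles[num_val:]), sorted(tiles[:num_val])


# Hypothetical names standing in for the labeled SensatUrban tiles
labeled_tiles = [f"tile_{i:02d}" for i in range(10)]
train_tiles, val_tiles = split_train_val(labeled_tiles)
print("train:", train_tiles)
print("val:", val_tiles)
```

The resulting lists can then be hard-coded in the dataset config so that every run uses the same split.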
> As you may know, Sensat has 13 classes, with labels in [0, 12], but does not have a class specifically for "Unknown" points. We have experimented back and forth with adding and removing such a class, trying to figure out whether it is a requirement in SPT. However, we have not had any luck regardless of what we've tried. Maybe you can clarify how to configure this correctly?
Your settings seem fine in the screenshot. The "unknown" class is not necessary unless some points in SensatUrban have labels outside of [0, 12] (I haven't checked myself). Some datasets use this type of extra class to indicate that the loss and metrics should not be computed on these "unlabeled"/"ignored"/"unknown" points. Just for safety, you can keep SENSAT_NUM_CLASSES=13 and append 'ignored' to CLASS_NAMES and [0, 0, 0] to CLASS_COLORS.
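In a config file, this could look roughly like the sketch below. The class names and colors here are placeholders, not the real SensatUrban values; only `SENSAT_NUM_CLASSES`, `CLASS_NAMES`, `CLASS_COLORS`, the `'ignored'` name and the `[0, 0, 0]` color come from the suggestion above.

```python
# Number of real classes, i.e. valid labels are 0..12
SENSAT_NUM_CLASSES = 13

# Placeholder names: replace them with the real SensatUrban class names,
# in label order. The extra last entry is for label 13, the ignored points
CLASS_NAMES = [f'class_{i}' for i in range(SENSAT_NUM_CLASSES)] + ['ignored']

# Placeholder RGB colors, one per class, plus black for ignored points
CLASS_COLORS = [
    [(40 + 19 * i) % 256, (80 + 53 * i) % 256, (120 + 97 * i) % 256]
    for i in range(SENSAT_NUM_CLASSES)] + [[0, 0, 0]]

# Both lists must have one more entry than there are real classes
assert len(CLASS_NAMES) == SENSAT_NUM_CLASSES + 1
assert len(CLASS_COLORS) == SENSAT_NUM_CLASSES + 1
```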
> In https://github.com/drprojects/superpoint_transformer/issues/15, you suggested adding this line in data/data.py in order to check for empty tensors or something like that I figure?: assert a.sum(dim=1).gt(0).all(), f"Some points in the label histogram 'self.y' do not have any labels." However, we are not quite sure how to use this, as it seems to require "a" to be a defined tensor of some sort?
Yes, you just need to adapt the line to the context:
elif k == 'y' and val.dim() > 1 and y_to_csr:
    assert val.sum(dim=1).gt(0).all(), f"Some points in the label histogram 'self.y' do not have any labels."
    sg = f.create_group(osp.join(f.name, '_csr_', k))
    save_dense_to_csr(val, sg, fp_dtype=fp_dtype)
At this point in the code, we are saving preprocessed Data objects to disk. The 'y' attribute of the Data objects is used to handle the semantic labels. These labels can be stored either as a simple 1D Tensor or as a 2D Tensor, in which case they represent the histogram of labels for a voxel or a superpoint (we keep track of all the labels of the raw points inside said voxel or superpoint). In the snippet above, I suggest you temporarily add a line to check whether there are any issues with the computed label histograms. Specifically, it will throw an error if one of the voxels/superpoints has an empty histogram. This should normally never happen. If it does, it means there is probably an upstream error related to the labels. For instance, some points in your raw data may have labels which are not in [0, 12].

Given what you mentioned above, depending on how you read the test files, which have no semantic labels, you might have caused some downstream errors too. In particular, when reading tiles for the test set, make sure the output of your raw data reader function read_dales_tile() (or read_sensaturban_tile(), however you called it) is a Data object which has no 'y' attribute. Again, you can get inspiration from the KITTI-360 dataset for how to handle test data.
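To make the idea of an empty histogram concrete, here is a small standalone sketch using NumPy. It is not the project's actual histogram code: `label_histograms` is a hypothetical function that simply drops out-of-range labels, to mimic how an unexpected label value could leave a voxel without any counted label.

```python
import numpy as np


def label_histograms(point_labels, point_to_voxel, num_voxels, num_classes):
    """Count, for each voxel, how many of its points carry each label.

    Hypothetical illustration: labels outside [0, num_classes] are
    silently dropped, which can leave a voxel with an all-zero row.
    """
    hist = np.zeros((num_voxels, num_classes + 1), dtype=np.int64)
    valid = (point_labels >= 0) & (point_labels <= num_classes)
    # Unbuffered add, so repeated (voxel, label) pairs are all counted
    np.add.at(hist, (point_to_voxel[valid], point_labels[valid]), 1)
    return hist


# Four points in three voxels; the last point has the invalid label 99
point_labels = np.array([0, 0, 12, 99])
point_to_voxel = np.array([0, 0, 1, 2])
hist = label_histograms(
    point_labels, point_to_voxel, num_voxels=3, num_classes=13)

# Same check as the assert above: every row must contain at least one label
empty = np.flatnonzero(hist.sum(axis=1) == 0)
print("voxels with an empty histogram:", empty.tolist())
```

Here the voxel holding only the point labeled 99 ends up with an empty row, which is the kind of situation the temporary assert is meant to catch.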
Good luck and happy coding
PS: if you are using and like the project, don't forget to give us a ⭐, it matters to us !
Without further reply from you, I consider this issue solved and am closing it.
Not sure if this helps anyone, but I noticed the mentioned error also occurs when the conditions mentioned by @drprojects are met (e.g. labels in range [0, C]; all labels present at least once) but, for some reason, 1D labels are passed to the Data object as a 2D array with shape [N, 1]. This is what you get if you, e.g., read a ply file as a pandas dataframe and pass a 'labels' column. I was able to fix this with .reshape(-1) on the labels.
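A minimal NumPy sketch of that fix, with a made-up label array standing in for the column read from the file:

```python
import numpy as np

# Labels accidentally stored as a column, i.e. shape [N, 1] instead of [N]
labels_2d = np.array([[0], [3], [12]])
print(labels_2d.shape)

# Flatten to one label per point before building the Data object
labels_1d = labels_2d.reshape(-1)
print(labels_1d.shape)
```

With a 2D array, the labels are likely to be mistaken for a label histogram, which would explain why the same error shows up.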
Hi Damian,
Thanks for SPT! We are three university students trying to use SPT to classify point clouds.
We have been looking into incorporating the SensatUrban dataset into SPT. It seems as if the data is read correctly, as shown in the picture above.
We tried looking into issue #15 to resolve it, but it did not work.
Would you happen to have any suggestions on how to resolve this issue? We have been through the guide for setting up our own dataset several times and have found a lot of inspiration in how DALES is set up.
Looking forward to hearing from you. Best regards Jeppe