Fix various bugs and added support for using podman

DonMaEn commented 3 weeks ago

Thanks for publishing this code and pre-trained model!

I managed to get this partially working on my redhat 8 machine using podman as the containerization software. This pull request contains some changes I had to made to get this to run. It seems like maybe the version of this code that was published to github is missing a few files, namely the large_PC_predict.sh file and nibio_inference/final_utm_integration_in_folders.py. I'm not sure if these files are entirely necessary though, because I was able to get output instance segmentation files without them.

Right now the version of the run_inference.sh script I pushed will fail because of the missing files, but if those steps are not necessary maybe they can just be removed.

DonMaEn commented 3 weeks ago

When I initially made this request I didn't realize some of the things I pointed out were already fixed by a recent commit. I think my pull request still contains some necessary changes for the inference pipeline to work though, at least on my machine. Mainly the change to the treeins.py dataset.

DonMaEn commented 3 weeks ago

Testing this on my local machine with a single point cloud yields this error:

Merging point cloud, semantic segmentation and instance segmentation.
  0%|                                                                                                                                                                                        | 0/1 [16:22<?, ?it/s]
Traceback (most recent call last):
  File "/home/nibio/mutable-outside-world/nibio_inference/merge_pt_ss_is_in_folders.py", line 98, in <module>                                                                                                     
    merge_pt_ss_is_in_folders = MergePtSsIsInFolders(
  File "/home/nibio/mutable-outside-world/nibio_inference/merge_pt_ss_is_in_folders.py", line 73, in __call__                                                                                                     
    return self.merge()
  File "/home/nibio/mutable-outside-world/nibio_inference/merge_pt_ss_is_in_folders.py", line 58, in merge                                                                                                        
    MergePtSsIs(
  File "/home/nibio/mutable-outside-world/nibio_inference/merge_pt_ss_is.py", line 196, in __call__
    return self.run()
  File "/home/nibio/mutable-outside-world/nibio_inference/merge_pt_ss_is.py", line 186, in run
    self.save(merged_df)
  File "/home/nibio/mutable-outside-world/nibio_inference/merge_pt_ss_is.py", line 162, in save
    pandas_to_las(
  File "/home/nibio/mutable-outside-world/nibio_inference/pandas_to_las.py", line 122, in pandas_to_las
    las_file[column] = df[column].astype(extended_columns_with_data_types[column])
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py", line 5548, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors,)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/internals/managers.py", line 604, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/internals/managers.py", line 409, in apply
    applied = getattr(b, f)(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/internals/blocks.py", line 595, in astype
    values = astype_nansafe(vals1d, dtype, copy=True)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py", line 968, in astype_nansafe
    raise ValueError("Cannot convert non-finite values (NA or inf) to integer")
ValueError: Cannot convert non-finite values (NA or inf) to integer

dluks commented 4 days ago

The issue is that there are NaNs present in the final merged DataFrame (see merge_pt_ss_is.py). MergePtSsIs.save then calls pandas_to_las which tries to cast most columns to integers, which fails when missing values are present since they are floats.

In my test case, the only column that contains any missing values after segmentation is PredInstance. I'm not familiar enough with the segmentation process, though---does it make sense that the instance segmentation would provide null predictions?

If so, then a specific class should probably be reserved for NaNs and then they could just be filled with that class prior to casting to integers.

If not, then it seems there's still a bug in either the code that does the merging or the segmentation process somewhere...

dluks commented 4 days ago

OK, the issue is at least in part due to the outer join performed in merge_pt_ss_is.py. With my data there were no missing values present in point_cloud_df, semantic_segmentation_df, or instance_segmentation_df, so the segmentation process isn't producing them.

However, instance_segmentation_df was smaller than the other two for some reason. It would be interesting to hear from someone more familiar with the segmentation process why the instance segmentation might not be returning all of the input points present for the semantic segmentation, as it doesn't seem to be the case for everyone (see this comment).

I'm guessing this is just because it wasn't able to assign an instance ID to every point? As a fix, I created a separate PR (to hopefully be merged into this one) that simply assigns 0 to all points that don't have an instance segmentation ID.

SmartForest-no / SegmentAnyTree

Fix various bugs and added support for using podman #9