itberrios / 3D

3D projects
MIT License

issue accessing the partitioned pointclouds #2

Closed: nimafo closed this issue 11 months ago

nimafo commented 11 months ago

I have another issue in the next cells of the same notebook (pointnet_seg.ipynb). It comes from this code:

import numpy as np

total_train_targets = []
for (_, targets) in train_dataloader:
    total_train_targets += targets.reshape(-1).numpy().tolist()

total_train_targets = np.array(total_train_targets)

I get the following error:


I did not open a new issue for it to avoid spamming, but tell me if that would help and I will move this into a new one.


KeyError                                  Traceback (most recent call last)
/SemanticSegmentation/3D/point_net/pointnet_seg.ipynb Cell 13 line 2
      1 total_train_targets = []
----> 2 for (_, targets) in train_dataloader:
      3     print(1)
      4     #total_train_targets += targets.reshape(-1).numpy().tolist()
      5
      6 #total_train_targets = np.array(total_train_targets)

File ~/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py:633, in _BaseDataLoaderIter.__next__(self)
    630 if self._sampler_iter is None:
    631     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    632     self._reset()  # type: ignore[call-arg]
--> 633 data = self._next_data()
    634 self._num_yielded += 1
    635 if self._dataset_kind == _DatasetKind.Iterable and \
    636         self._IterableDataset_len_called is not None and \
    637         self._num_yielded > self._IterableDataset_len_called:

File ~/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py:677, in _SingleProcessDataLoaderIter._next_data(self)
    675 def _next_data(self):
    676     index = self._next_index()  # may raise StopIteration
--> 677     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    678     if self._pin_memory:
    679         data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)

File ~/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py:51, in _MapDatasetFetcher.fetch(self, possibly_batched_index)
     49     data = self.dataset.__getitems__(possibly_batched_index)
     50 else:
---> 51     data = [self.dataset[idx] for idx in possibly_batched_index]
     52 else:
     53     data = self.dataset[possibly_batched_index]

File ~/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py:51, in <listcomp>(.0)
     49     data = self.dataset.__getitems__(possibly_batched_index)
     50 else:
---> 51     data = [self.dataset[idx] for idx in possibly_batched_index]
     52 else:
     53     data = self.dataset[possibly_batched_index]

File /SemanticSegmentation/3D/point_net/s3dis_dataset.py:53, in S3DIS.__getitem__(self, idx)
     51 def __getitem__(self, idx):
     52     # read data from hdf5
---> 53     space_data = pd.read_hdf(self.data_paths[idx], key='space_slice').to_numpy()
     54     points = space_data[:, :3]   # xyz points
     55     targets = space_data[:, 3]   # integer categories

File ~/.local/lib/python3.10/site-packages/pandas/io/pytables.py:446, in read_hdf(path_or_buf, key, mode, errors, where, start, stop, columns, iterator, chunksize, **kwargs)
    441     raise ValueError(
    442         "key must be provided when HDF5 "
    443         "file contains multiple datasets."
    444     )
    445     key = candidate_only_group._v_pathname
--> 446 return store.select(
    447     key,
    448     where=where,
    449     start=start,
    450     stop=stop,
    451     columns=columns,
    452     iterator=iterator,
    453     chunksize=chunksize,
    454     auto_close=auto_close,
    455 )
    456 except (ValueError, TypeError, KeyError):
    457     if not isinstance(path_or_buf, HDFStore):
    458         # if there is an error, close the store if we opened it.

File ~/.local/lib/python3.10/site-packages/pandas/io/pytables.py:841, in HDFStore.select(self, key, where, start, stop, columns, iterator, chunksize, auto_close)
    839 group = self.get_node(key)
    840 if group is None:
--> 841     raise KeyError(f"No object named {key} in the file")
    843 # create the storer and axes
    844 where = _ensure_term(where, scope_level=1)

KeyError: 'No object named space_slice in the file'

I get <KeysViewHDF5 ['space_data']> from the following code:

import os
import h5py

h5_path = os.path.join(R"/S3DIS/Stanford3dDataset_v1.2_Reduced_Aligned_Version/Area_1/conferenceRoom_1.hdf5")
with h5py.File(h5_path, 'r') as f:
    print(f.keys())
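
As a side note, the same check can be done through pandas, which is what s3dis_dataset.py uses to read these files. A minimal sketch, reusing h5_path from above:

import pandas as pd

# pandas reports HDF5 keys with a leading slash,
# e.g. ['/space_data'] for an unpartitioned room file
with pd.HDFStore(h5_path, mode='r') as store:
    print(store.keys())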

Originally posted by @nimafo in https://github.com/itberrios/3D/issues/1#issuecomment-1729640930

itberrios commented 11 months ago

Hello,

It looks like you're not using the partitioned point clouds for this dataloader. I had to partition the point clouds into smaller units to avoid memory issues. Did you create the point clouds with this folder structure: "S3DIS\Stanford3dDataset_v1.2_Reduced_Partitioned_Aligned_Version"? The partitioned files should be saved with names like: conferenceRoom_1_partition1_.hdf5, conferenceRoom_1_partition2_.hdf5, conferenceRoom_1_partition3_.hdf5, ... (see the sketch just below for the expected layout).
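
A minimal sketch of that layout on disk. The root path and the Area_1 subfolder are assumptions (mirroring the unpartitioned folder structure from the error report above); only the folder name and the *_partitionN_.hdf5 naming pattern come from the comment itself:

import glob
import os

# illustrative root; substitute wherever your partitioned dataset lives
root = r"S3DIS\Stanford3dDataset_v1.2_Reduced_Partitioned_Aligned_Version"

# each room is split into several *_partitionN_.hdf5 files
partition_paths = sorted(glob.glob(os.path.join(root, "Area_1", "*_partition*_.hdf5")))
print(partition_paths)
# expected names: conferenceRoom_1_partition1_.hdf5, conferenceRoom_1_partition2_.hdf5, ...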

The key for the original areas is "space_data", while the key for the partitioned areas is "space_slice".
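
In other words, reading each type of file looks like this (a sketch; the file paths are illustrative, while the keys and the pd.read_hdf call come from the traceback above):

import pandas as pd

# full, unpartitioned room -> stored under 'space_data'
full_room = pd.read_hdf("Area_1/conferenceRoom_1.hdf5", key='space_data')

# partitioned slice -> stored under 'space_slice',
# which is the key s3dis_dataset.py asks for
slice_df = pd.read_hdf("Area_1/conferenceRoom_1_partition1_.hdf5", key='space_slice')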

If not, please see this notebook and let me know if you have any more questions.

itberrios commented 11 months ago

Going to close this; please reopen if there is still an issue.