PKU-EPIC / GAPartNet

[CVPR 2023 Highlight] GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts.
https://pku-epic.github.io/GAPartNet/

Fail to compile the inference code in gapartnet and some issues about demo.ipynb #19

Open MingfeiShiMS opened 1 month ago

MingfeiShiMS commented 1 month ago

Dear Haoran:

I am reproducing GAPartNet. After a long run of the data rendering and processing pipeline, I obtained the full GAPartNet dataset.

Then I tried running the inference code with "sh gapartnet/train.sh", and got the following error:

Traceback (most recent call last):
  File "/home/junhuan/anaconda3/envs/gapartnet/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/junhuan/anaconda3/envs/gapartnet/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/junhuan/anaconda3/envs/gapartnet/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/shimingfei/GAPartNet/gapartnet/dataset/gapartnet.py", line 86, in __getitem__
    file = apply_voxelization(file, voxel_size=self.voxel_size)
  File "/home/shimingfei/GAPartNet/gapartnet/dataset/gapartnet.py", line 193, in apply_voxelization
    voxel_features, voxel_coords, _, pc_voxel_id = voxelize(
  File "/home/shimingfei/GAPartNet/epic_ops/epic_ops/voxelize.py", line 75, in voxelize
    batch_indices, _ = expand_csr(voxel_batch_splits, voxel_coords.shape[0])
  File "/home/junhuan/anaconda3/envs/gapartnet/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/shimingfei/GAPartNet/epic_ops/epic_ops/expand.py", line 13, in expand_csr
    return torch.ops.epic_ops.expand_csr(
  File "/home/junhuan/anaconda3/envs/gapartnet/lib/python3.8/site-packages/torch/_ops.py", line 569, in __getattr__
    raise AttributeError(
AttributeError: '_OpNamespace' 'epic_ops' object has no attribute 'expand_csr'

It looks like a PyTorch version mismatch in epic_ops. I installed "torch==2.0.0" with CUDA 11.8, and I also tried "torch==1.11.0", but neither works.
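For reference, a quick way to check whether the compiled epic_ops extension actually registered its custom ops (a minimal sketch; it assumes that importing epic_ops is what loads the compiled library, and it only uses the op names that appear in this thread):

import torch
import epic_ops  # assumption: importing the package loads the compiled extension

# _OpNamespace raises AttributeError for ops that were never registered,
# so hasattr() returns False when the C++/CUDA library did not load.
for op_name in ("expand_csr", "segmented_reduce", "segmented_maxpool"):
    print(op_name, "registered:", hasattr(torch.ops.epic_ops, op_name))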

The second problem is about the data structure. After data processing, the directory structure is

akb48

but the structure expected by the code is

data_test (just a name)

I wonder whether there is a script, or documented logic, for converting between the two. Thanks.

The last question is about demo.ipynb. I can run it, but I am not sure whether it is related to the training/inference code "gapartnet/train.py" or only to data processing.

I am looking forward to your reply.

Best Regards

chengyzhao commented 1 month ago

Hi Mingfei,

Thanks for your interest and your questions.

Regarding the first issue with 'expand_csr', could you provide more details about your compilation, e.g. your environment, package versions, and any modifications you made?

As for the second issue, it seems you just need to split the processed data into separate train and test splits.
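For the split, something along these lines might work (a minimal sketch only; the folder names "akb48" and "data_test", the one-folder-per-object layout, and the 80/20 ratio are assumptions to adapt to whatever layout your config expects):

import random
import shutil
from pathlib import Path

src = Path("akb48")            # processed data (assumed layout: one folder per object)
dst_root = Path("data_test")   # target root expected by the training config (assumed)

objects = sorted(p for p in src.iterdir() if p.is_dir())
random.seed(0)
random.shuffle(objects)

n_train = int(0.8 * len(objects))  # assumed 80/20 split
splits = {"train": objects[:n_train], "test": objects[n_train:]}

for split, items in splits.items():
    split_dir = dst_root / split
    split_dir.mkdir(parents=True, exist_ok=True)
    for obj in items:
        shutil.copytree(obj, split_dir / obj.name, dirs_exist_ok=True)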

MingfeiShiMS commented 4 weeks ago

Thank you for your kind reply! My conda environment is below:

Package Version Editable project location


absl-py 2.1.0 addict 2.4.0 asttokens 2.4.1 attrs 24.2.0 backcall 0.2.0 blinker 1.8.2 Brotli 1.0.9 cachetools 5.5.0 ccimport 0.4.4 certifi 2024.8.30 charset-normalizer 3.3.2 click 8.1.7 colorama 0.4.6 comm 0.2.2 ConfigArgParse 1.7 contourpy 1.1.1 cumm 0.4.11 cycler 0.12.1 dash 2.18.1 dash-core-components 2.0.0 dash-html-components 2.0.0 dash-table 5.0.0 decorator 5.1.1 docker-pycreds 0.4.0 docstring_parser 0.16 einops 0.8.0 epic-ops 0.1.0+8af5e3c /home/shimingfei/GAPartNet/epic_ops executing 2.1.0 fastjsonschema 2.20.0 filelock 3.13.1 fire 0.7.0 Flask 3.0.3 fonttools 4.54.1 fsspec 2024.9.0 gitdb 4.0.11 GitPython 3.1.43 gmpy2 2.1.2 google-auth 2.35.0 google-auth-oauthlib 1.0.0 grpcio 1.67.0 idna 3.7 importlib_metadata 8.5.0 importlib_resources 6.4.5 ipython 8.12.3 ipywidgets 8.1.5 itsdangerous 2.2.0 jedi 0.19.1 Jinja2 3.1.4 joblib 1.4.2 jsonargparse 4.33.2 jsonschema 4.23.0 jsonschema-specifications 2023.12.1 jupyter_core 5.7.2 jupyterlab_widgets 3.0.13 kiwisolver 1.4.7 kornia 0.7.3 kornia_rs 0.1.5 lark 1.2.2 lightning 2.3.3 lightning-utilities 0.11.8 llvmlite 0.41.1 Markdown 3.7 markdown-it-py 3.0.0 MarkupSafe 2.1.3 matplotlib 3.7.5 matplotlib-inline 0.1.7 mdurl 0.1.2 mkl-fft 1.3.8 mkl-random 1.2.4 mkl-service 2.4.0 mpmath 1.3.0 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.1 ninja 1.11.1.1 numba 0.58.1 numpy 1.24.3 oauthlib 3.2.2 open3d 0.18.0 opencv-contrib-python 4.10.0.84 opencv-python 4.10.0.84 packaging 24.1 pandas 2.0.3 parso 0.8.4 pccm 0.4.16 pexpect 4.9.0 pickleshare 0.7.5 pillow 10.4.0 pip 24.2 pkgutil_resolve_name 1.3.10 platformdirs 4.3.6 plotly 5.24.1 pointnet2 3.0.0 pointnet2_ops 3.0.0 /home/shimingfei/GAPartNet/pointnet2_ops_lib portalocker 2.10.1 prompt_toolkit 3.0.48 protobuf 5.28.2 psutil 6.1.0 ptyprocess 0.7.0 pure_eval 0.2.3 pyasn1 0.6.1 pyasn1_modules 0.4.1 pybind11 2.13.6 Pygments 2.18.0 pyparsing 3.1.4 pyquaternion 0.9.9 PySocks 1.7.1 python-dateutil 2.9.0.post0 pytorch-lightning 2.3.3 pytz 2024.2 PyYAML 6.0.2 referencing 0.35.1 requests 2.32.3 requests-oauthlib 2.0.0 retrying 1.3.4 rich 13.9.2 rpds-py 0.20.0 rsa 4.9 sapien 2.2.2 scikit-learn 1.3.2 scipy 1.10.1 sentry-sdk 2.17.0 setproctitle 1.3.3 setuptools 75.1.0 six 1.16.0 smmap 5.0.1 sparse 0.15.4 spconv 2.3.6 stack-data 0.6.3 sympy 1.13.2 tenacity 9.0.0 tensorboard 2.14.0 tensorboard-data-server 0.7.2 termcolor 2.4.0 threadpoolctl 3.5.0 torch 2.0.0 torchaudio 2.0.0 torchdata 0.6.0 torchmetrics 1.4.2 torchvision 0.15.0 tqdm 4.66.5 traitlets 5.14.3 transforms3d 0.4.2 triton 2.0.0 typeshed_client 2.7.0 typing_extensions 4.11.0 tzdata 2024.2 urllib3 2.2.3 wandb 0.18.5 wcwidth 0.2.13 Werkzeug 3.0.4 wheel 0.44.0 widgetsnbextension 4.0.13 zipp 3.20.2


I think the problem lies in epic_ops, which I installed from "https://github.com/geng-haoran/epic_ops" as instructed in the GAPartNet README.md.

In epic_ops/epic_ops/expand.py,

@torch.no_grad()
def expand_csr(
    offsets: torch.Tensor,
    output_size: int,
) -> Tuple[torch.Tensor, torch.Tensor]:
    offsets = offsets.contiguous()

    return torch.ops.epic_ops.expand_csr(
        offsets, output_size
    )

AttributeError: '_OpNamespace' 'epic_ops' object has no attribute 'expand_csr'.

A similar problem also occurs in epic_ops/epic_ops/reduce.py:

@torch.no_grad()
def segmented_reduce(
    values: torch.Tensor,
    segment_offsets_begin: torch.Tensor,
    segment_offsets_end: torch.Tensor,
    mode: str = "sum",
) -> torch.Tensor:
    values = values.contiguous()
    segment_offsets_begin = segment_offsets_begin.contiguous()
    segment_offsets_end = segment_offsets_end.contiguous()
    if mode == "sum":
        mode_id = 0
    elif mode == "min":
        mode_id = 1
    elif mode == "max":
        mode_id = 2
    else:
        raise ValueError(f"Unknown mode: {mode}")
    return torch.ops.epic_ops.segmented_reduce(
        values, segment_offsets_begin, segment_offsets_end, mode_id
    )

def segmented_maxpool(
    values: torch.Tensor,
    segment_offsets_begin: torch.Tensor,
    segment_offsets_end: torch.Tensor,
) -> Tuple[torch.Tensor, torch.Tensor]:
    values = values.contiguous()
    segment_offsets_begin = segment_offsets_begin.contiguous()
    segment_offsets_end = segment_offsets_end.contiguous()
    return torch.ops.epic_ops.segmented_maxpool(
        values, segment_offsets_begin, segment_offsets_end
    )

AttributeError: '_OpNamespace' 'epic_ops' object has no attribute 'segmented_maxpool', 'segmented_reduce'.

I guess the functions above are custom ops provided by the authors' compiled extension, so the failure seems unrelated to the PyTorch version.
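One way to test that guess is to load the compiled library directly and see whether the real build error surfaces (a diagnostic sketch; the glob pattern assumes the built .so ends up somewhere under the epic_ops package directory, which may differ for an editable install):

import glob
import os

import torch
import epic_ops

# Find the compiled extension inside the installed epic_ops package.
pkg_dir = os.path.dirname(epic_ops.__file__)
so_files = glob.glob(os.path.join(pkg_dir, "**", "*.so"), recursive=True)
print("compiled libraries found:", so_files)  # an empty list means the extension was never built

# Loading the library explicitly usually raises an OSError with the real
# cause (e.g. an ABI/CUDA mismatch) instead of the later AttributeError.
for so_path in so_files:
    torch.ops.load_library(so_path)

print("expand_csr registered:", hasattr(torch.ops.epic_ops, "expand_csr"))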

I am looking forward to your reply.

Best Regards