nv-tlabs / XCube

[CVPR 2024 Highlight] XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies
https://research.nvidia.com/labs/toronto-ai/xcube/

Sanitize pytorch extension name #12

Closed yanis-falaki closed 4 months ago

yanis-falaki commented 4 months ago

https://github.com/nv-tlabs/XCube/blob/01ac938746c5e2f8ac627730eb9c0dc7f46aea80/ext/__init__.py#L28

The line above caused errors when I ran sample_objaverse.py (I assume the other scripts are affected too, but I only confirmed it with objaverse). The way the name argument is built allows illegal characters ('+') into the extension name, which breaks compilation and stops execution. This trivial error cost me quite a bit of time, but it was easily fixed by sanitizing the name before passing it as an argument, so the code block becomes:

    # Sanitize the name to avoid special characters
    sanitized_name = name.replace(".", "_").replace("+", "_")

    # Constructing the extension name, removing special characters
    extension_name = f"xcube_torch_{torch.__version__}".replace(".", "_").replace("+", "_") + "_" + sanitized_name

    return load(
        name=extension_name,
        sources=list(cpp_files) + list(cu_files) + [base_path / t for t in additional_files],
        verbose='COMPILE_VERBOSE' in os.environ.keys(),
        **kwargs
    )
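A more general variant of the same idea, as a minimal sketch rather than the repository's actual fix: instead of replacing only "." and "+", map every character that is not valid in a C identifier to an underscore. The helper name `sanitize_extension_name` and the hard-coded version string are hypothetical and used only for illustration.

```python
import re


def sanitize_extension_name(raw: str) -> str:
    """Map an arbitrary string to a name usable as a C/Python identifier."""
    # Replace every character outside [0-9A-Za-z_] with an underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", raw)
    # Identifiers cannot be empty or start with a digit, so prefix an underscore.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


# Hypothetical version string of the kind that appears in the traceback.
torch_version = "2.3.1+cu121"
extension_name = sanitize_extension_name(f"xcube_torch_{torch_version}_common")
print(extension_name)  # xcube_torch_2_3_1_cu121_common
```

The result could then be passed as `name=` to `load(...)` in place of the hand-built string. A whitelist like this also covers other characters that may show up in local version labels, such as "-".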

Side note: the instructions given in #2 for adding fvdb should probably be added to the README, along with the fix for #include <fvdb/GridBatch.h> not being recognized, as I couldn't get this repo working without them.

Below is the traceback.


    (xcube) paperspace@ps591xs0g71q:~/XCube$ python inference/sample_objaverse.py none --batch_len 1 --ema --use_ddim --ddim_step 100 --extract_mesh
    /home/paperspace/miniconda3/envs/xcube/lib/python3.10/site-packages/pytorch_lightning/utilities/distributed.py:258: LightningDeprecationWarning: pytorch_lightning.utilities.distributed.rank_zero_only has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from pytorch_lightning.utilities instead.
      rank_zero_deprecation(
    Traceback (most recent call last):
      File "/home/paperspace/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2107, in _run_ninja_build
        subprocess.run(
      File "/home/paperspace/miniconda3/envs/xcube/lib/python3.10/subprocess.py", line 526, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/home/paperspace/XCube/inference/sample_objaverse.py", line 77, in <module>
        net_model = create_model_from_args(config_coarse, ckpt_coarse).cuda()
      File "/home/paperspace/XCube/inference/sample_objaverse.py", line 47, in create_model_from_args
        net_model = net_module.load_from_checkpoint(args_ckpt, hparams=model_args, strict=strict)
      File "/home/paperspace/miniconda3/envs/xcube/lib/python3.10/site-packages/pytorch_lightning/core/saving.py", line 139, in load_from_checkpoint
        return _load_from_checkpoint(
      File "/home/paperspace/miniconda3/envs/xcube/lib/python3.10/site-packages/pytorch_lightning/core/saving.py", line 188, in _load_from_checkpoint
        return _load_state(cls, checkpoint, strict=strict, **kwargs)
      File "/home/paperspace/miniconda3/envs/xcube/lib/python3.10/site-packages/pytorch_lightning/core/saving.py", line 234, in _load_state
        obj = cls(**_cls_kwargs)
      File "/home/paperspace/XCube/xcube/models/diffusion.py", line 84, in __init__
        self.vae = self.load_first_stage_from_pretrained().eval()
      File "/home/paperspace/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "/home/paperspace/XCube/xcube/models/diffusion.py", line 261, in load_first_stage_from_pretrained
        net_module = importlib.import_module("xcube.models." + model_args.model).Model
      File "/home/paperspace/miniconda3/envs/xcube/lib/python3.10/importlib/__init__.py", line 126, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
      File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "/home/paperspace/XCube/xcube/models/autoencoder.py", line 24, in <module>
        from xcube.modules.autoencoding.losses.base_loss import Loss
      File "/home/paperspace/XCube/xcube/modules/autoencoding/losses/base_loss.py", line 16, in <module>
        from xcube.utils.color_util import color_from_points, semantic_from_points
      File "/home/paperspace/XCube/xcube/utils/color_util.py", line 1, in <module>
        from ext import common
      File "/home/paperspace/XCube/ext/__init__.py", line 34, in <module>
        common = load_torch_extension(
      File "/home/paperspace/XCube/ext/__init__.py", line 27, in load_torch_extension
        return load(
      File "/home/paperspace/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1309, in load
        return _jit_compile(
      File "/home/paperspace/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1719, in _jit_compile
        _write_ninja_file_and_build_library(
      File "/home/paperspace/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1832, in _write_ninja_file_and_build_library
        _run_ninja_build(
      File "/home/paperspace/.local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2123, in _run_ninja_build
        raise RuntimeError(message) from e
    RuntimeError: Error building extension 'xcube_torch_2_3_1+cu121common': [1/2] /home/paperspace/miniconda3/envs/xcube/bin/x86_64-conda-linux-gnu-c++ -MMD -MF bind.o.d -DTORCH_EXTENSION_NAME=xcube_torch_2_3_1+cu121common -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include/TH -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include/THC -isystem /home/paperspace/miniconda3/envs/xcube/include -isystem /home/paperspace/miniconda3/envs/xcube/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O2 -c /home/paperspace/XCube/ext/common/bind.cpp -o bind.o
    FAILED: bind.o
    /home/paperspace/miniconda3/envs/xcube/bin/x86_64-conda-linux-gnu-c++ -MMD -MF bind.o.d -DTORCH_EXTENSION_NAME=xcube_torch_2_3_1+cu121common -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include/TH -isystem /home/paperspace/.local/lib/python3.10/site-packages/torch/include/THC -isystem /home/paperspace/miniconda3/envs/xcube/include -isystem /home/paperspace/miniconda3/envs/xcube/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O2 -c /home/paperspace/XCube/ext/common/bind.cpp -o bind.o
    In file included from /home/paperspace/.local/lib/python3.10/site-packages/torch/include/pybind11/attr.h:13,
                     from /home/paperspace/.local/lib/python3.10/site-packages/torch/include/pybind11/detail/class.h:12,
                     from /home/paperspace/.local/lib/python3.10/site-packages/torch/include/pybind11/pybind11.h:13,
                     from /home/paperspace/.local/lib/python3.10/site-packages/torch/include/torch/csrc/Exceptions.h:12,
                     from /home/paperspace/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/python.h:11,
                     from /home/paperspace/.local/lib/python3.10/site-packages/torch/include/torch/extension.h:9,
                     from /home/paperspace/XCube/ext/common/bind.cpp:1:
    <command-line>: error: expected initializer before '+' token
    /home/paperspace/XCube/ext/common/bind.cpp:16:17: note: in expansion of macro 'TORCH_EXTENSION_NAME'
       16 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
          |                 ^~~~~~~~~~~~~~~~~~~~
    <command-line>: error: expected initializer before '+' token
    /home/paperspace/XCube/ext/common/bind.cpp:16:17: note: in expansion of macro 'TORCH_EXTENSION_NAME'
       16 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
          |                 ^~~~~~~~~~~~~~~~~~~~
    <command-line>: error: expected initializer before '+' token
    /home/paperspace/XCube/ext/common/bind.cpp:16:17: note: in expansion of macro 'TORCH_EXTENSION_NAME'
       16 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
          |                 ^~~~~~~~~~~~~~~~~~~~
    <command-line>: error: expected initializer before '+' token
    /home/paperspace/XCube/ext/common/bind.cpp:16:17: note: in expansion of macro 'TORCH_EXTENSION_NAME'
       16 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
          |                 ^~~~~~~~~~~~~~~~~~~~
    <command-line>: error: expected initializer before '+' token
    /home/paperspace/XCube/ext/common/bind.cpp:16:17: note: in expansion of macro 'TORCH_EXTENSION_NAME'
       16 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
          |                 ^~~~~~~~~~~~~~~~~~~~
    ninja: build stopped: subcommand failed.

xrenaa commented 4 months ago

Hi, thanks for your question and suggestion. I have updated the files to fix the issue.