nerfstudio-project / gsplat

CUDA accelerated rasterization of gaussian splatting
https://docs.gsplat.studio/
Apache License 2.0

Simple trainer example fails to run #241

Open sleeplessai opened 1 month ago

sleeplessai commented 1 month ago

(splat) PS C:\Users\tomo\Desktop\gsplat\examples> python .\simple_trainer.py
Aborted at 1719492847 (unix time) try "date -d @1719492847" if you are using GNU date
  @ 0x7ffa39464172 log2f
  @ 0x7ff7b5e61f58 OPENSSL_Applink
  @ 0x7ffa347aecc0 C_specific_handler
  @ 0x7ffa3bc7504f chkstk
  @ 0x7ffa3bbee866 RtlFindCharInUnicodeString
  @ 0x7ffa3bc7403e KiUserExceptionDispatcher
  @ 0x7ffa196c25ee void cdecl ExceptionPtrRethrow(void const ptr64)
  @ 0x7ff8b9198ebe public: void cdecl c10::ivalue::Future::markCompleted(void) __ptr64
  @ 0x7ff8b95283c2 struct _object ptr64 cdecl THPGenerator_initDefaultGenerator(struct at::Generator)
  @ 0x7ff8b173dea5 (unknown)
  @ 0x7ff8b1c96b29 PyInit_pycolmap
  @ 0x7ffa332c1080 (unknown)
  @ 0x7ffa332c26a5 __NLG_Return2
  @ 0x7ffa3bc74896 RtlCaptureContext2
  @ 0x7ff8b17409b9 (unknown)
  @ 0x7ff9b7bf82f6 PyCFunction_GetFlags
  @ 0x7ff9b7bb554c _PyObject_MakeTpCall
  @ 0x7ff9b7bbe4ab PyComplex_AsCComplex
  @ 0x7ff9b7bfcbb4 _PyObject_GenericGetAttrWithDict
  @ 0x7ff9b7bfc438 PyObject_GetAttr
  @ 0x7ff9b7bfc089 PyObject_GetAttrString
  @ 0x7ff8b177609e PyInit_pycolmap
  @ 0x7ff8b17cdc6d PyInit_pycolmap
  @ 0x7ff8b17a651d PyInit_pycolmap
  @ 0x7ff8b1764170 (unknown)
  @ 0x7ff8b1770a48 PyInit_pycolmap
  @ 0x7ff8b177054e PyInit_pycolmap
  @ 0x7ff9b7cded88 _PyImport_GetModuleAttrString
  @ 0x7ff9b7cde57c PyImport_Import
  @ 0x7ff9b7bf7f88 PyCFunction_GetFlags
  @ 0x7ff9b7cae9cc PyEval_GetFuncDesc
  @ 0x7ff9b7ca9d19 _PyEval_EvalFrameDefault

The error may be caused by the pycolmap package. I successfully ran the other example, fitting_image, on the same Windows machine, which shows that the gsplat core works as expected. I also confirmed that the error disappears when I comment out the line that imports pycolmap. Replacing the pycolmap-based data parsing with a plain self-contained implementation might solve this. I suggest reusing the data parser from the original 3DGS and putting the code in the utils folder. In any case, I can't start new work based on the simple_trainer example right now. It's a pity that it doesn't run on my Windows workstation.

Atticuszz commented 1 month ago

Try WSL? It runs Ubuntu as a subsystem inside Windows.

sleeplessai commented 4 weeks ago

@Atticuszz Thanks for the reply. I expect that would work as you suggest. However, I am working with a native Windows pipeline, so WSL2 is not an option for my project. Do you have any other ideas for solving this error (ideally without switching codebases)?

Atticuszz commented 4 weeks ago

Hello, how did you install pycolmap? I remember that requirements.txt points to a git link; switching to the regular version with pip install might work.

The first line of your log suggests there may be a problem with the Windows system time. Is the system time correct? My Windows clock went wrong once after I set up a dual-boot system on one machine.

Or try searching the pycolmap issues for your log output.

Or replace pycolmap with an alternative and then modify the interface in class Parser to fit the new one.

Hope these suggestions help!

sleeplessai commented 3 weeks ago

@Atticuszz I reinstalled pycolmap using the accompanying requirements.txt file. The sudden abort is gone, but another bug appeared: reading the COLMAP binary data file fails.

For instance:

  File "<MyWindowsPath>\gsplat\examples\datasets\colmap.py", line 50, in __init__
    manager.load_cameras()
  File "<MyWindowsPath>\miniconda3\envs\splat\lib\site-packages\pycolmap\scene_manager.py", line 90, in load_cameras
    self._load_cameras_bin(input_file)
  File "<MyWindowsPath>\miniconda3\envs\splat\lib\site-packages\pycolmap\scene_manager.py", line 103, in _load_cameras_bin
    num_cameras = struct.unpack('L', f.read(8))[0]
struct.error: unpack requires a buffer of 4 bytes

The same issue is reported in: https://github.com/nerfstudio-project/nerfacc/issues/104

To fix this, I had to change three lines in scene_manager.py:

https://github.com/rmbrualla/pycolmap/blob/master/pycolmap/scene_manager.py#L102
https://github.com/rmbrualla/pycolmap/blob/master/pycolmap/scene_manager.py#L143
https://github.com/rmbrualla/pycolmap/blob/master/pycolmap/scene_manager.py#L231

In each one I replaced the 'L' format character with 'Q', for example from num_images = struct.unpack('L', f.read(8))[0] to num_images = struct.unpack('Q', f.read(8))[0]
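
A minimal sketch of why swapping 'L' for 'Q' helps, assuming the count at the start of the file is stored as an 8-byte little-endian unsigned integer (which is how COLMAP's read_write_model.py reads it). In Python's struct module the native 'L' maps to C unsigned long, which is 4 bytes on Windows but 8 bytes on most 64-bit Linux builds, while 'Q' is 8 bytes on both. The helper name read_count and the in-memory fake file are made up for illustration and are not part of pycolmap.

```python
import io
import struct


def read_count(f):
    """Read one 8-byte unsigned count (hypothetical helper, not pycolmap API)."""
    # '<Q' = little-endian unsigned long long, 8 bytes on every platform.
    return struct.unpack("<Q", f.read(8))[0]


# Native sizes: 'L' depends on the platform, 'Q' does not.
print("native 'L':", struct.calcsize("L"), "bytes")
print("native 'Q':", struct.calcsize("Q"), "bytes")

# In-memory stand-in for the first 8 bytes of a binary file holding a count of 3.
fake_file = io.BytesIO(struct.pack("<Q", 3))
num_cameras = read_count(fake_file)
print("count:", num_cameras)
```

Using the explicit '<Q' rather than the native 'Q' also pins the byte order, so the read behaves the same regardless of the machine it runs on.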

After that, it finally ran and produced normal results with decent quantitative metrics.


I suggest adding this tiny fix as a tip in the documentation to help developers working on Windows. Thanks for your patience.

animesh-77 commented 2 weeks ago

@sleeplessai

Thanks for your suggestion. I was facing the same issue on my Windows system and was reluctant to switch to WSL. In addition to your three-line change, I had to change the following line in scene_manager.py as well to finally get it to run:

  1. https://github.com/rmbrualla/pycolmap/blob/master/pycolmap/scene_manager.py#L105: change 'IiLL' to 'IiQQ' (the first character is a capital "i")
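
A small sketch of what the 'IiQQ' change does to the record size. It assumes the line reads a 24-byte camera header made of two 4-byte integers followed by two 8-byte integers; the field names below (camera id, model id, width, height) and the sample values are illustrative assumptions based on COLMAP's read_write_model.py, not something shown in this thread. With 4-byte 'L' fields, as on Windows, the same format covers only 16 bytes, so it no longer matches the data read from the file.

```python
import struct

# Assumed layout of one camera header: id, model id, width, height.
# Standard little-endian sizes: I=4, i=4, Q=8, Q=8.
HEADER_FMT = "<IiQQ"
fixed_size = struct.calcsize(HEADER_FMT)
print("IiQQ:", fixed_size, "bytes")

# In standard ('<') mode 'L' is 4 bytes, the same as its native size on Windows.
broken_size = struct.calcsize("<IiLL")
print("IiLL:", broken_size, "bytes")

# Round-trip a made-up record to show the fixed format unpacks cleanly.
record = struct.pack(HEADER_FMT, 1, 0, 1920, 1080)
camera_id, model_id, width, height = struct.unpack(HEADER_FMT, record)
print(camera_id, model_id, width, height)
```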

The original source file at https://github.com/colmap/colmap/blob/main/scripts/python/read_write_model.py helped as well.

Thank you.