google-research / multinerf

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF
Apache License 2.0
3.56k stars 338 forks

Render Image - Recursion Depth Limits? #122

Closed (nackjaylor closed 1 year ago)

nackjaylor commented 1 year ago

Hi team,

It seems that some update to either JAX or Linux is, in some way, shape, or form, playing havoc with the render code provided in MultiNeRF. As it stands, I am able to train, but can no longer render or evaluate NeRFs using this codebase.

Is anyone else facing similar issues? I've been able to reproduce this on 2 different machines and from code running in a container.

Any chance you could take a look into the inference code and see what might be the probable cause?

Traceback (most recent call last):
  File "/home/jack/miniconda3/envs/multinerf_scratch/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/jack/miniconda3/envs/multinerf_scratch/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/jack/code/testing/multinerf/eval.py", line 275, in <module>
    app.run(main)
  File "/home/jack/miniconda3/envs/multinerf_scratch/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/jack/miniconda3/envs/multinerf_scratch/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/jack/code/testing/multinerf/eval.py", line 110, in main
    rendering = models.render_image(
  File "/home/jack/code/testing/multinerf/internal/models.py", line 738, in render_image
    chunk_renderings, _ = render_fn(rng, chunk_rays)
  File "/home/jack/code/testing/multinerf/internal/train_utils.py", line 499, in render_eval_fn
    return jax.lax.all_gather(
  File "/home/jack/miniconda3/envs/multinerf_scratch/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/home/jack/miniconda3/envs/multinerf_scratch/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/home/jack/miniconda3/envs/multinerf_scratch/lib/python3.9/site-packages/jax/_src/config.py", line 613, in update_thread_local_jit_state
    tls.extra_jit_context = _thread_local_state_cache.canonicalize(tmp)
RecursionError: maximum recursion depth exceeded

Cheers,

Jack

nackjaylor commented 1 year ago

This seems to have fixed itself... I shall close this issue, but it appears that the eval code may have some robustness issues (which is a strange thing to say, considering it has worked fine for months with no problems... very odd).
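
In case this comes back for anyone, here is a minimal sketch for recording the environment so that a working run and a failing run can be compared. It does not diagnose the RecursionError itself. The function name `environment_snapshot` and the package list (jax, jaxlib, flax, numpy) are my own assumptions for illustration, not part of the MultiNeRF codebase.

```python
import sys
from importlib import metadata


def environment_snapshot(packages=("jax", "jaxlib", "flax", "numpy")):
    """Collect interpreter and package versions for comparison between runs.

    The default package list is an assumption; adjust it to your environment.
    """
    snapshot = {
        "python": sys.version.split()[0],
        "recursion_limit": sys.getrecursionlimit(),
    }
    for name in packages:
        try:
            snapshot[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            snapshot[name] = None  # package is not installed here
    return snapshot


if __name__ == "__main__":
    for key, value in environment_snapshot().items():
        print(f"{key}: {value}")
```

Saving this output from each machine or container makes it easier to see whether a package version changed between the run that failed and the run that worked.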