NVIDIA / modulus-sym

Framework providing pythonic APIs, algorithms and utilities to be used with Modulus core to physics inform model training as well as higher level abstraction for domain experts
https://developer.nvidia.com/modulus
Apache License 2.0

🐛[BUG]: Google Colab & Hydra: MissingConfigException: Primary config directory not found. Check that the config directory '/tmp/ipykernel_3309' exists and readable #194

Open josealvarez97 opened 1 month ago

josealvarez97 commented 1 month ago

Version

1.5.0

On which installation method(s) does this occur?

No response

Describe the issue

The following code used to work on Google Colab for months, then it just stopped working. Not just for me, but for some colleagues as well. Did Google Colab (or the authors of Hydra or Nvidia Modulus) release an update that broke something?

It doesn't make sense that the library searches for /tmp/ipykernel_3309. That's not the current working directory, I already checked that in multiple ways. The default CWD in Google Colab is /content (always).

import modulus.sym
from modulus.sym.hydra import to_yaml
from modulus.sym.hydra.utils import compose
from modulus.sym.hydra.config import ModulusConfig

cfg = compose(config_path="./", config_name="config")
cfg.network_dir = "outputs" # Set the network directory for checkpoints
print(to_yaml(cfg))
WARNING:modulus.sym.hydra.config:TorchScript default is being turned off due to PyTorch version mismatch.
---------------------------------------------------------------------------
MissingConfigException                    Traceback (most recent call last)
/tmp/ipykernel_3309/3114539013.py in <cell line: 6>()
      4 from modulus.sym.hydra.config import ModulusConfig
      5 
----> 6 cfg = compose(config_path="./", config_name="config")
      7 cfg.network_dir = "outputs" # Set the network directory for checkpoints
      8 print(to_yaml(cfg))

6 frames
/usr/local/lib/python3.10/dist-packages/hydra/_internal/config_loader_impl.py in _missing_config_error(self, config_name, msg, with_search_path)
    100                 return msg
    101 
--> 102         raise MissingConfigException(
    103             missing_cfg_file=config_name, message=add_search_path()
    104         )

MissingConfigException: Primary config directory not found.
Check that the config directory '/tmp/ipykernel_3309' exists and readable

As I mentioned, I already checked the CWD on Google Colab and that's indeed /content.

For example,

!pwd

outputs /content

and

import os
print(os.getcwd())

outputs /content

I would be happy to just set an absolute path and keep working, but Hydra doesn't allow that:

cfg = compose(config_path="/content/", config_name="config")
---------------------------------------------------------------------------
HydraException                            Traceback (most recent call last)
/tmp/ipykernel_3309/3091310842.py in <cell line: 6>()
      4 from modulus.sym.hydra.config import ModulusConfig
      5 
----> 6 cfg = compose(config_path="/content/", config_name="config")
      7 cfg.network_dir = "outputs" # Set the network directory for checkpoints
      8 print(to_yaml(cfg))

1 frames
/usr/local/lib/python3.10/dist-packages/hydra/initialize.py in __init__(self, config_path, job_name, caller_stack_depth, version_base)
     80 
     81         if config_path is not None and os.path.isabs(config_path):
---> 82             raise HydraException("config_path in initialize() must be relative")
     83         calling_file, calling_module = detect_calling_file_or_module_from_stack_frame(
     84             caller_stack_depth + 1

HydraException: config_path in initialize() must be relative
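The check at line 82 of the traceback only rejects absolute paths, and the call to detect_calling_file_or_module_from_stack_frame right after it suggests that a relative config_path is resolved against the directory of the calling file rather than the CWD. If that reading is right (it is an assumption drawn from the two error messages above, not something confirmed here), a relative path that climbs out of /tmp/ipykernel_3309 and back down into /content might get past both errors. A minimal sketch of computing such a path; relative_config_path is a hypothetical helper, and the kernel directory is copied from the error message and will differ per session:

```python
import posixpath  # Colab paths are POSIX-style


def relative_config_path(config_dir: str, caller_dir: str) -> str:
    """Express config_dir relative to caller_dir (hypothetical helper)."""
    return posixpath.relpath(config_dir, start=caller_dir)


# Directory taken from the MissingConfigException message; the numeric
# suffix changes with every kernel session, so this value is illustrative.
caller_dir = "/tmp/ipykernel_3309"

rel = relative_config_path("/content", caller_dir)
print(rel)                    # ../../content
print(posixpath.isabs(rel))   # False, so the isabs() check would not reject it
```

The result could then be tried as compose(config_path=rel, config_name="config"). Whether that works on Colab has not been confirmed.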

The following Modulus Guide for Jupyter Notebooks is therefore no longer applicable: https://docs.nvidia.com/deeplearning/modulus/modulus-sym-v110/notebook.nbconvert.html. There needs to be a fix or workaround to keep using the library in the way they recommend when working on Google Colab.

Minimum reproducible example

Here is a Google Colab Notebook that can serve as an example. It's a Gist that used to work (I created it months ago), but as of September 2024, it doesn't work anymore if one runs it normally because of this issue: https://colab.research.google.com/gist/josealvarez97/28c4a8ab0e6dbccd4cb5e9b27641858d/nvidia-modulus-navier-stokes-tutorial-physics-informed-neural-networks.ipynb

Relevant log output

No response

Environment details

No response

Other/Misc.

Has anyone come across this issue recently? Any workarounds? Is there actually a way to force Hydra to take an absolute path in the compose function? Any Google Colab experts who understand why Hydra could be pointing to the /tmp folder and not the current working directory (which is /content)?
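On the absolute-path question: Hydra itself (not Modulus) ships initialize_config_dir, which requires an absolute directory, alongside its own compose. The sketch below calls Hydra directly and bypasses modulus.sym.hydra.utils.compose, so it is an assumption that the resulting config is equivalent to what the Modulus wrapper returns; any setup the wrapper performs is skipped. compose_from_dir is a hypothetical helper name, and the top-level hydra imports assume hydra-core 1.1 or newer:

```python
import os


def compose_from_dir(config_dir, config_name, overrides=None):
    """Compose a Hydra config from a directory given by absolute path.

    Hypothetical helper: it calls Hydra directly rather than the
    Modulus Sym wrapper, so Modulus-specific setup is not performed.
    """
    # Imported lazily so the module loads even where Hydra is absent.
    from hydra import compose, initialize_config_dir

    # initialize_config_dir() rejects relative paths, so normalize first.
    abs_dir = os.path.abspath(config_dir)

    # As a context manager, Hydra's global state is restored on exit.
    with initialize_config_dir(config_dir=abs_dir, job_name="notebook"):
        return compose(config_name=config_name, overrides=overrides or [])
```

On Colab this would be called as cfg = compose_from_dir("/content", "config"), after importing modulus.sym so that its config schemas are registered (also an assumption about how Modulus registers them).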

pantaprince commented 1 month ago

Had the same issue while using version 1.7.0, but it works fine with the 1.6.0 in google colab.

Can install modulus sym as: !pip install nvidia-modulus nvidia-modulus-sym==1.6.0

wxu0102 commented 4 weeks ago

> Had the same issue while using version 1.7.0, but it works fine with the 1.6.0 in google colab.
>
> Can install modulus sym as: !pip install nvidia-modulus nvidia-modulus-sym==1.6.0

Thank you very much, I have been troubled by this problem for a few days.

josealvarez97 commented 4 weeks ago

> Had the same issue while using version 1.7.0, but it works fine with the 1.6.0 in google colab.
>
> Can install modulus sym as: !pip install nvidia-modulus nvidia-modulus-sym==1.6.0

Yeah, I forgot to reply. Thanks for the insight @pantaprince!