facebookresearch / hydra

Hydra is a framework for elegantly configuring complex applications
https://hydra.cc
MIT License

[Bug] Google Colab: MissingConfigException: Primary config directory not found. Check that the config directory '/tmp/ipykernel_3309' exists and readable #2961

Open josealvarez97 opened 1 month ago

josealvarez97 commented 1 month ago

All the context is from the question I just posted on Stack Overflow: https://stackoverflow.com/questions/79035403/colab-hydra-missingconfigexception-primary-config-directory-not-found-check

The following code used to work on Google Colab for months, then it just stopped working. Not just for me, but for some colleagues as well. Did Google Colab (or the authors of Hydra or Nvidia Modulus) release an update that broke something?

It doesn't make sense that the library searches for /tmp/ipykernel_3309. That's not the current working directory; I already checked that in multiple ways. The default CWD in Google Colab is always /content.

import modulus.sym
from modulus.sym.hydra import to_yaml
from modulus.sym.hydra.utils import compose
from modulus.sym.hydra.config import ModulusConfig

cfg = compose(config_path="./", config_name="config")
cfg.network_dir = "outputs" # Set the network directory for checkpoints
print(to_yaml(cfg))

WARNING:modulus.sym.hydra.config:TorchScript default is being turned off due to PyTorch version mismatch.
---------------------------------------------------------------------------
MissingConfigException                    Traceback (most recent call last)
/tmp/ipykernel_3309/3114539013.py in <cell line: 6>()
      4 from modulus.sym.hydra.config import ModulusConfig
      5 
----> 6 cfg = compose(config_path="./", config_name="config")
      7 cfg.network_dir = "outputs" # Set the network directory for checkpoints
      8 print(to_yaml(cfg))

6 frames
/usr/local/lib/python3.10/dist-packages/hydra/_internal/config_loader_impl.py in _missing_config_error(self, config_name, msg, with_search_path)
    100                 return msg
    101 
--> 102         raise MissingConfigException(
    103             missing_cfg_file=config_name, message=add_search_path()
    104         )

MissingConfigException: Primary config directory not found.
Check that the config directory '/tmp/ipykernel_3309' exists and readable

As I mentioned, I already checked the CWD on Google Colab and that's indeed /content.

For example,

!pwd

outputs /content

and

import os
print(os.getcwd())

outputs /content
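
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the traceback above reports the notebook cell as the file /tmp/ipykernel_3309/3114539013.py. If a relative config_path is resolved against the directory of the calling file instead of the CWD, then "./" would end up in /tmp/ipykernel_3309, which matches the error message. The sketch below only illustrates that path arithmetic with the standard library; resolve_config_dir is a hypothetical helper, not Hydra's actual code.

```python
import posixpath


def resolve_config_dir(calling_file: str, config_path: str) -> str:
    """Hypothetical helper: resolve config_path against the calling file's directory."""
    base_dir = posixpath.dirname(calling_file)
    return posixpath.normpath(posixpath.join(base_dir, config_path))


# The notebook cell shows up in the traceback as this temporary file.
cell_file = "/tmp/ipykernel_3309/3114539013.py"

# Under the assumption above, "./" points at the kernel's temp directory, not /content.
print(resolve_config_dir(cell_file, "./"))
```

If this is what happens, the CWD reported by !pwd and os.getcwd() would not affect the lookup at all.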

I would be happy to just set an absolute path and keep working, but Hydra doesn't allow that:

cfg = compose(config_path="/content/", config_name="config")
---------------------------------------------------------------------------
HydraException                            Traceback (most recent call last)
/tmp/ipykernel_3309/3091310842.py in <cell line: 6>()
      4 from modulus.sym.hydra.config import ModulusConfig
      5 
----> 6 cfg = compose(config_path="/content/", config_name="config")
      7 cfg.network_dir = "outputs" # Set the network directory for checkpoints
      8 print(to_yaml(cfg))

1 frames
/usr/local/lib/python3.10/dist-packages/hydra/initialize.py in __init__(self, config_path, job_name, caller_stack_depth, version_base)
     80 
     81         if config_path is not None and os.path.isabs(config_path):
---> 82             raise HydraException("config_path in initialize() must be relative")
     83         calling_file, calling_module = detect_calling_file_or_module_from_stack_frame(
     84             caller_stack_depth + 1

HydraException: config_path in initialize() must be relative
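
Since absolute paths are rejected, one untested workaround would be to pass a relative path that climbs from the directory named in the error message up to /content. The sketch below computes such a path with the standard library. It assumes the lookup starts at /tmp/ipykernel_3309 (the numeric suffix changes between sessions), and relative_config_path is a hypothetical helper name.

```python
import posixpath


def relative_config_path(target_dir: str, base_dir: str) -> str:
    """Hypothetical helper: express target_dir relative to base_dir."""
    return posixpath.relpath(target_dir, start=base_dir)


# base_dir is copied from the error message; it is an assumption that the
# config lookup starts there, and the suffix differs per Colab session.
rel = relative_config_path("/content", "/tmp/ipykernel_3309")
print(rel)

# Assumed usage with the call from this issue (not verified here):
# cfg = compose(config_path=rel, config_name="config")
```

Hydra itself also provides initialize_config_dir, which is documented as taking an absolute config directory. Whether it can be combined with the Modulus compose wrapper is not something I have confirmed.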

The following Modulus Guide for Jupyter Notebooks is therefore no longer applicable: https://docs.nvidia.com/deeplearning/modulus/modulus-sym-v110/notebook.nbconvert.html. There needs to be a fix or workaround to keep using the library in the way they recommend when working on Google Colab.

Here is a Google Colab Notebook that can serve as an example. It's a Gist that used to work (I created it months ago), but as of September 2024, it doesn't work anymore if one runs it normally because of this issue: https://colab.research.google.com/gist/josealvarez97/28c4a8ab0e6dbccd4cb5e9b27641858d/nvidia-modulus-navier-stokes-tutorial-physics-informed-neural-networks.ipynb

Has anyone come across this issue recently? Are there any workarounds? Is there a way to force Hydra to accept an absolute path in the compose function? Can any Google Colab experts explain why Hydra would be pointing to the /tmp folder and not the current working directory (which is /content)?