facebookresearch / hydra

Hydra is a framework for elegantly configuring complex applications
https://hydra.cc
MIT License

[Bug] #2779

Open manan-cashfree opened 1 year ago

manan-cashfree commented 1 year ago

🐛 Bug

Description

Search path issues during instantiation.

To reproduce

Minimal Code/Config snippet to reproduce directory structure:

├── src
│   ├── data
│   │   ├── components
│   │   │   ├── transforms.py
│   │   │   └── __init__.py
│   │   ├── __init__.py
│   │   ├── documents_datamodule.py
├── configs
│   ├── __init__.py
├── paths
│   │   └── default.yaml
├── data
│   │   ├── document.yaml
# The __init__.py file of components imports both train_transforms and val_transforms
...
from components import train_transforms, val_transforms

class DocumentsDataModule(LightningDataModule):
    def __init__(
            self,
            data_dir: str = "data/",
            train_val_test_split_ratio: tuple = (0.8, 0.2),
            batch_size: int = 8,
            sampler: str = "random",
            num_workers: int = 0,
            pin_memory: bool = False,
    ) -> None:

        super().__init__()
        self.save_hyperparameters(logger=False)
        self.data_train: Optional[ImageFolder] = None
        self.data_val: Optional[ImageFolder] = None
        self.data_test: Optional[ImageFolder] = None
        self.data_predict: Optional[ImageFolder] = None
        self.batch_size_per_device = batch_size
        self.train_transforms = train_transforms
        self.val_transforms = val_transforms

@hydra.main(version_base="1.3", config_path="../configs", config_name="train.yaml")
def view_model(cfg: DictConfig):
    datamodule: LightningDataModule = hydra.utils.instantiate(cfg.data)

Hydra config:

_target_: src.data.documents_datamodule.DocumentsDataModule
_convert_: all
data_dir: ${paths.data_dir}
batch_size: 8 # Needs to be divisible by the number of devices (e.g., if in a distributed setup)
sampler: "random" # imbalanced or random
train_val_test_split_ratio: [0.8, 0.2] # if length is 2 then only train, val splits are created
num_workers: 7 # set according to cpu cores
pin_memory: False

src and all paths are configured correctly, so there are no issues there. But as soon as I call hydra.utils.instantiate, I get the path-related error below. This doesn't happen otherwise.

Stack trace/error message

hydra.errors.InstantiationException: Error locating target 'src.data.documents_datamodule.DocumentsDataModule'
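An "Error locating target" message usually hides a chained exception raised while the target's module was being imported; setting the environment variable HYDRA_FULL_ERROR=1 makes Hydra print it. As a rough alternative, the sketch below imports the dotted path one package at a time using only the standard library and reports the first step that fails. `diagnose_target` is a hypothetical helper written for illustration, not part of Hydra.

```python
import importlib


def diagnose_target(target: str) -> str:
    """Report which step of resolving a dotted target fails.

    Hypothetical helper for illustration only; not a Hydra API.
    """
    # Split "pkg.module.ClassName" into the module path and the attribute name.
    module_path, _, attr = target.rpartition(".")
    parts = module_path.split(".")
    module = None
    # Import "pkg", then "pkg.module", and so on, stopping at the first failure.
    for end in range(1, len(parts) + 1):
        prefix = ".".join(parts[:end])
        try:
            module = importlib.import_module(prefix)
        except Exception as exc:
            return f"importing {prefix!r} failed: {exc!r}"
    # Every module imported; check that the final attribute exists.
    if not hasattr(module, attr):
        return f"{module_path!r} imported, but has no attribute {attr!r}"
    return "ok"
```

Running `print(diagnose_target("src.data.documents_datamodule.DocumentsDataModule"))` from the project root should show the underlying ImportError, if there is one, rather than only the wrapped message.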

Expected Behavior

Imports should be handled correctly. After removing the train_transforms import, I am able to instantiate the datamodule.

System information

Additional context

Hydra sucks in a lot of stuff. Or perhaps it is too complex for me. I just tried using a pytorch lightning template and found that it doesn't handle such basic stuff.

odelalleau commented 1 year ago

Can you please make sure your directory structure is correct? For instance I don't see train_transforms.py in it. I also don't see any __init__.py under src which normally should prevent any instantiation from that folder since it's not a proper Python module.

cs-mshah commented 1 year ago

Finally fixed the issue. In the __init__.py file of components, it should be from .transforms import *. Thanks for the support.
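The difference between the two import styles can be reproduced without Hydra. The sketch below builds a throwaway package in a temporary directory and imports its `components` subpackage twice, once with the relative import from the fix and once with an absolute one. The original failing line in `components/__init__.py` is not shown in this thread, so `from transforms import *` is an assumption, and the package names `demo_rel` and `demo_abs` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_package(root: Path, pkg_name: str, init_line: str) -> None:
    """Create <root>/<pkg_name>/components with a transforms module (hypothetical layout)."""
    components = root / pkg_name / "components"
    components.mkdir(parents=True)
    (root / pkg_name / "__init__.py").write_text("")
    (components / "transforms.py").write_text(
        "train_transforms = 'train'\nval_transforms = 'val'\n"
    )
    # The line under test: how components/__init__.py imports its sibling module.
    (components / "__init__.py").write_text(init_line + "\n")


def try_import(init_line: str, pkg_name: str) -> str:
    """Import <pkg_name>.components and report whether the import succeeds."""
    with tempfile.TemporaryDirectory() as tmp:
        build_package(Path(tmp), pkg_name, init_line)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(f"{pkg_name}.components")
            return f"ok: {module.train_transforms}"
        except ImportError as exc:
            return f"failed: {type(exc).__name__}"
        finally:
            # Undo the sys.path change and drop the temporary modules from the cache.
            sys.path.remove(tmp)
            for name in list(sys.modules):
                if name == pkg_name or name.startswith(pkg_name + "."):
                    del sys.modules[name]


if __name__ == "__main__":
    print(try_import("from .transforms import *", "demo_rel"))
    print(try_import("from transforms import *", "demo_abs"))
```

The relative form resolves `transforms` against the package that contains `__init__.py`, so it works no matter how the package is reached. The absolute form only works if a top-level module named `transforms` happens to be on `sys.path`, which is typically not the case when the package is imported as `src.data.components`.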

odelalleau commented 1 year ago

Glad you found it! :)