TorchSpatiotemporal / tsl

tsl: a PyTorch library for processing spatiotemporal data.
https://torch-spatiotemporal.readthedocs.io/
MIT License

Trouble with Hydra, perhaps other way to run the examples? #14

Status: Closed (StefanBloemheuvel closed this issue 1 year ago)

StefanBloemheuvel commented 1 year ago

Hi TSL team,

I have been playing around with your package a lot lately, and it works great! However, I run into issues when trying to adapt your code to my specific needs. For example, I would like to loop over several different settings (e.g., defined in a .yaml file). However, after the first iteration finishes and a new one starts, I get the following error:

Could not override 'dataset.name'.
To append to your config use +dataset.name=bay
Key 'dataset' is not in struct
    full_key: dataset
    object_type=dict

I think it has something to do with the get_hydra_cli_arg function:

def get_hydra_cli_arg(key: str, delete: bool = False):
    try:
        key_idx = [arg.split("=")[0] for arg in sys.argv].index(key)
        arg = sys.argv[key_idx].split("=")[1]
        if delete:
            del sys.argv[key_idx]
        return arg
    except ValueError:
        return None

which seems to remove some of the sys.argv arguments.

My question therefore is, could I prevent this behavior easily? If not, could I then circumvent using Hydra entirely? I have seen the "A Gentle Introduction to tsl" Jupyter Notebook with a different way of running it, but there I do not get all the important settings of the experiment anymore, and I want to be sure that the settings are correct.
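One way to guard against this kind of sys.argv mutation, whether or not it is the actual cause of the error above, is to snapshot sys.argv before each iteration and restore it afterwards. The following is a minimal sketch under stated assumptions: the command-line values are made up for illustration, and preserved_argv is a hypothetical helper, not part of tsl or Hydra.

```python
import sys
from contextlib import contextmanager


def get_hydra_cli_arg(key: str, delete: bool = False):
    """Same logic as the function quoted above."""
    try:
        key_idx = [arg.split("=")[0] for arg in sys.argv].index(key)
        arg = sys.argv[key_idx].split("=")[1]
        if delete:
            del sys.argv[key_idx]
        return arg
    except ValueError:
        return None


@contextmanager
def preserved_argv():
    """Hypothetical helper: put sys.argv back as it was when the block exits."""
    saved = list(sys.argv)
    try:
        yield
    finally:
        sys.argv[:] = saved


# Made-up command line, for illustration only.
ARGV = ["run_traffic_experiment.py", "config=default", "dataset.name=bay"]

# Without a guard, the first lookup deletes the entry,
# so the second lookup no longer finds it.
sys.argv = list(ARGV)
unguarded = [get_hydra_cli_arg("config", delete=True) for _ in range(2)]

# With the guard, every iteration starts from the same arguments.
sys.argv = list(ARGV)
guarded = []
for _ in range(2):
    with preserved_argv():
        guarded.append(get_hydra_cli_arg("config", delete=True))

print(unguarded)
print(guarded)
```

If the repeated runs happen inside one Python process, wrapping each iteration in such a guard keeps later iterations from seeing a shortened argument list. It does not address other state that Hydra may keep between runs.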

Ideally, I would have a single script that runs from top to bottom in sequential order (as in the "A Gentle Introduction to tsl" notebook) but uses the settings from run_traffic_experiment.py and its configs. The notebook provides these settings for the timethenspace model:

model_kwargs = {
    'input_size': dm.n_channels,  # 1 channel
    'horizon': dm.horizon,  # 12, the number of steps ahead to forecast
    'hidden_size': 16,
    'rnn_layers': 1,
    'gcn_layers': 2
}

But not for the other models, right?

I hope I explained my problems clearly; otherwise, please tell me if something is not clear! If you could point me in the right direction, that would be really great!

Thanks in advance!

marshka commented 1 year ago

Hi, the function you are referring to (get_hydra_cli_arg()) is only used to access two specific args ('config_path' and 'config'), so it cannot be responsible for the error you get. I suggest you check Hydra's documentation for how to run experiments with YAML configs.

In principle, you can call the run functions of our example scripts directly, bypassing Hydra and building a configuration dictionary with your favorite method. Unfortunately, the "gentle introduction" notebook has not yet been updated to the latest version; we'll fix it soon!
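As a rough illustration of building the configuration by hand and looping over settings, here is a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the keys in BASE_CFG, the "la" dataset name, the "tts_model" model name, and the run_experiment stand-in, which should be replaced by the actual run function of the example script. The real keys and values should be copied from the YAML configs shipped with the examples.

```python
import copy

# Hypothetical base settings; copy the real keys from the example YAML configs.
BASE_CFG = {
    "dataset": {"name": "la"},  # placeholder dataset name
    "model": {
        "name": "tts_model",  # placeholder model name
        "hparams": {"hidden_size": 16, "rnn_layers": 1, "gcn_layers": 2},
    },
}


def with_overrides(base: dict, overrides: dict) -> dict:
    """Return a deep copy of `base` with dotted-key overrides applied."""
    cfg = copy.deepcopy(base)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return cfg


def run_experiment(cfg: dict) -> str:
    """Stand-in for the run function of an example script (placeholder)."""
    return f"{cfg['model']['name']} on {cfg['dataset']['name']}"


# Each iteration gets a fresh config, so nothing leaks between runs.
results = [
    run_experiment(with_overrides(BASE_CFG, {"dataset.name": name}))
    for name in ("la", "bay")
]
print(results)
```

If the run function reads fields as attributes (for example cfg.dataset.name), a plain dict will not be enough; wrapping it with OmegaConf.create(cfg) before the call is one option, assuming omegaconf is installed.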