UCSD-E4E / acoustic-multiclass-training

Data processing and training pipeline for classifying bird species by sound
GNU General Public License v3.0

Torch dataloader workers not inheriting sweep config #129

Closed benjamin-cates closed 1 year ago

benjamin-cates commented 1 year ago

While working with the chunking sweep, I noticed that the dataloaders get the wrong config arguments when jobs is not set to zero. When a new worker process is spawned, it regenerates the config variable and does not inherit the overrides from the wandb sweep. For example, changing chunk_length_s in the wandb sweep has no effect because the dataloader workers use the values from config.yml, not from the sweep.

Proposed solution is pasted below:

import torch
import wandb

def worker_init(_) -> None:
    """ Set the sweep config on each torch dataloader worker """
    torch.multiprocessing.set_sharing_strategy("file_system")
    # Re-attach to the current wandb run so wandb.config holds the sweep values
    wandb.init()
    # Copy the sweep overrides onto the module-level config object
    for key, val in dict(wandb.config).items():
        setattr(cfg, key, val)

...

val_dataloader = DataLoader(
    val_dataset,
    cfg.validation_batch_size,
    shuffle=False,
    num_workers=cfg.jobs,
    worker_init_fn=worker_init  # runs once in each spawned worker process
)

The problem with this is that wandb.init() prints the whole "logged in as..." banner, and it will do this in every worker each time we start a new epoch.
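
If we stick with calling wandb.init() in the workers, one workaround might be to silence wandb's console output first. I believe the WANDB_SILENT environment variable controls that banner, though I haven't verified it in this pipeline; rough untested sketch (worker_init_quiet is just a placeholder name, cfg is the same config object as above):

import os
import torch
import wandb

def worker_init_quiet(_) -> None:
    """ Same as worker_init above, but suppresses wandb's console banner """
    torch.multiprocessing.set_sharing_strategy("file_system")
    # WANDB_SILENT should stop wandb from printing the "logged in as..." message
    os.environ["WANDB_SILENT"] = "true"
    wandb.init()
    for key, val in dict(wandb.config).items():
        setattr(cfg, key, val)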

Any other suggestions, like somehow passing the config into the dataloaders (not sure if it's possible), would be appreciated.
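
On passing the config into the dataloaders directly: worker_init_fn only receives the worker id, but we could bind the sweep values to it in the main process with functools.partial, so the workers never have to call wandb.init() at all. Rough, untested sketch (reusing cfg and val_dataset from the snippets above):

import functools
import torch
import wandb
from torch.utils.data import DataLoader

def worker_init(_, sweep_overrides) -> None:
    """ Apply sweep overrides captured in the main process to this worker's cfg """
    torch.multiprocessing.set_sharing_strategy("file_system")
    for key, val in sweep_overrides.items():
        setattr(cfg, key, val)

# Main process: wandb.init() has already merged the sweep values into wandb.config
sweep_overrides = dict(wandb.config)

val_dataloader = DataLoader(
    val_dataset,
    cfg.validation_batch_size,
    shuffle=False,
    num_workers=cfg.jobs,
    # partial() bakes the overrides into the callable; the DataLoader pickles it
    # and calls it once in every worker process
    worker_init_fn=functools.partial(worker_init, sweep_overrides=sweep_overrides),
)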

benjamin-cates commented 1 year ago

Currently affected config variables:

sample_rate
n_mels
n_fft
chunk_length_s
max_offset

Changing these in a sweep when jobs!=0 will not actually vary these variables 😭

Sean1572 commented 1 year ago

https://medium.com/analytics-vidhya/how-to-create-a-thread-safe-singleton-class-in-python-822e1170a7f6

Locking when generating the singleton class may fix the issue
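
For reference, the pattern in that article is double-checked locking around the singleton constructor; a minimal sketch of how it might look for the config class (class and attribute names here are hypothetical):

import threading

class Config:
    """ Process-wide config singleton; the lock guards the first concurrent access """
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                # Double-checked locking: re-test inside the lock so only one
                # thread ever constructs the instance
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance

One thing worth checking is whether this helps for the dataloader workers, since those are separate processes rather than threads and would still start from their own copy of the config.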