Is your feature request related to a problem? Please describe.
We currently don't set num_workers for the data-loading subprocesses described in the torch documentation, which results in console warnings that neither validation nor training has any workers available.
Adding that variable to both and setting it to 4 for each results in a jump from about 1.40 it/s to 2.50 it/s, at least in my testing.
From the torch DataLoader documentation (https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader):
num_workers (int, optional) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
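For reference, a minimal sketch of what enabling workers looks like on a plain DataLoader (the dataset, batch size, and worker count here are placeholders, not the project's actual code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset; the real project would pass its own Dataset here.
dataset = TensorDataset(torch.randn(256, 3, 64, 64))

if __name__ == "__main__":  # required when workers are spawned (e.g. on Windows/macOS)
    # num_workers=4 spawns four loader subprocesses instead of loading in the main process.
    loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=4)
    for (batch,) in loader:
        pass  # training / validation step would run here
```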
Describe the solution you'd like
Implement support for num_workers, perhaps as extra fields for both the validator and the trainer in the config.
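As a rough illustration only (the config keys and the build_dataloader helper below are hypothetical, not existing code in this repo):

```python
from torch.utils.data import DataLoader

# Hypothetical config fields; actual key names would follow the project's config schema.
config = {
    "trainer": {"num_workers": 4},
    "validator": {"num_workers": 4},
}

def build_dataloader(dataset, batch_size, num_workers):
    """Hypothetical helper that forwards the configured worker count to the DataLoader."""
    return DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)

# e.g. train_loader = build_dataloader(train_dataset, 16, config["trainer"]["num_workers"])
```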
Additional context
There is currently one downside:
Doing so will, at least with the current PyTorch version, print multiple warnings in the console (and logs) about TypedStorage being deprecated.
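If those messages get too noisy, one possible workaround (assuming the warning text begins with "TypedStorage is deprecated"; silencing it should probably stay opt-in rather than the default) is a standard warnings filter:

```python
import warnings

# Suppress only the TypedStorage deprecation warning; other UserWarnings still show.
warnings.filterwarnings(
    "ignore",
    message="TypedStorage is deprecated",
    category=UserWarning,
)
```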
I have also noticed that with this change VRAM usage stays steady (14.8 GB with batch size 16, around 16.8 GB with batch size 20).
It drops to a lower usage whenever checkpoints are being saved, but that's fine.
I assume this is also part of where the improvement comes from: data no longer has to be constantly reloaded through the main thread/process, and new batches can be loaded in the background.