Feature Description
Support using IBM DDL (Distributed Deep Learning) to distribute training across multiple machines, each with multiple GPUs.
Reason
This would allow for much faster Neural Architecture Search. On Nimbix, for example, you could train on 12 machines, each with 4 P100 GPUs (48 GPUs total).
Solution
Add support for the DDL framework so that AutoKeras can distribute training across all of the machines provided. A rough sketch of what this could look like from the user's side is below.
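To make the request concrete, here is a minimal sketch of a DDL-backed AutoKeras run. The `ddl` module and its `rank()`/`size()` calls are assumptions modeled on the usual Horovod-style data-parallel pattern, not the confirmed DDL Python API, and the actual gradient synchronization would have to live inside the AutoKeras/DDL integration; only the AutoKeras calls follow its standard interface.

```python
# Hypothetical sketch only: the `ddl` calls below are assumed
# (Horovod-style rank/size placeholders); the real PowerAI DDL
# Python binding may expose a different interface.
from tensorflow.keras.datasets import mnist
import autokeras as ak
import ddl  # assumed IBM DDL Python module

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Plain data parallelism: each worker process trains on its own shard.
rank, size = ddl.rank(), ddl.size()  # assumed API
x_shard = x_train[rank::size]
y_shard = y_train[rank::size]

clf = ak.ImageClassifier(max_trials=10, overwrite=True)
clf.fit(x_shard, y_shard, epochs=5)

# Only rank 0 evaluates and exports, to avoid duplicate output.
if rank == 0:
    print(clf.evaluate(x_test, y_test))
    clf.export_model().save("best_model")
```

On PowerAI the script would presumably be launched with `ddlrun` (one process per GPU); how the launcher and the NAS search loop interact is exactly the part this feature would need to work out.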
Alternative Solutions
N/A
Additional Context
I'd be glad to help out with implementing DDL support or providing access to DDL/PowerAI machines.