allenai / allenact

An open source framework for research in Embodied-AI from AI2.
https://www.allenact.org

Multinode #304

Closed by jordis-ai2 3 years ago

jordis-ai2 commented 3 years ago

Summary:

Two changes are needed in the machine_params functions to run across multiple nodes:

  1. We need to specify all the devices across all machines (and the respective number of processes/samplers).
  2. We pass a local_workers_ids list specifying which of all the devices run on the current machine, identified through a new machine_id keyword argument.
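
To make the two points above concrete, here is a minimal standalone sketch of the bookkeeping involved. It does not use the allenact API: the helper name `multinode_layout`, its parameters, and the assumption that every machine has the same number of GPUs (indexed locally from 0) are all hypothetical and only for illustration.

```python
from typing import Dict, List


def multinode_layout(
    num_machines: int,
    gpus_per_machine: int,
    procs_per_gpu: int,
    machine_id: int,
) -> Dict[str, List[int]]:
    """Hypothetical helper: describe workers on ALL machines, then pick the local ones."""
    if not 0 <= machine_id < num_machines:
        raise ValueError(f"machine_id must be in [0, {num_machines}), got {machine_id}")

    # Point 1: list the devices of every machine. Assumption: each machine
    # exposes its GPUs under the same local indices 0..gpus_per_machine-1.
    devices = list(range(gpus_per_machine)) * num_machines
    nprocesses = [procs_per_gpu] * len(devices)

    # Point 2: positions (in the global list) of the workers that run on
    # the machine identified by machine_id.
    start = machine_id * gpus_per_machine
    local_workers_ids = list(range(start, start + gpus_per_machine))

    return {
        "devices": devices,
        "nprocesses": nprocesses,
        "local_workers_ids": local_workers_ids,
    }


layout = multinode_layout(num_machines=2, gpus_per_machine=4, procs_per_gpu=10, machine_id=1)
print(layout["local_workers_ids"])
```

In an actual experiment config, values like these would be computed inside machine_params and returned in whatever structure the framework expects; the dictionary here is only a stand-in.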

When invoking main.py, we need to specify --distributed_ip_port (format xx.xx.xx.xx:portnum) and --machine_id.
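
As a rough illustration of the two flags, the sketch below validates an address in the xx.xx.xx.xx:portnum format and assembles the extra arguments for one machine. The helpers `parse_ip_port` and `multinode_args` are hypothetical and not part of the repository; only the flag names come from the description above.

```python
import ipaddress
from typing import List, Tuple


def parse_ip_port(value: str) -> Tuple[str, int]:
    """Split 'xx.xx.xx.xx:portnum' into (ip, port), raising ValueError if malformed."""
    host, sep, port_text = value.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected 'ip:port', got {value!r}")
    ipaddress.IPv4Address(host)  # raises a ValueError subclass on a bad address
    port = int(port_text)  # raises ValueError if the port is not an integer
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


def multinode_args(ip_port: str, machine_id: int) -> List[str]:
    """Hypothetical helper: extra main.py arguments for one machine."""
    if machine_id < 0:
        raise ValueError("machine_id must be non-negative")
    host, port = parse_ip_port(ip_port)
    return ["--distributed_ip_port", f"{host}:{port}", "--machine_id", str(machine_id)]


# Every machine gets the same address and its own machine_id.
for machine_id in range(2):
    print(" ".join(multinode_args("10.0.0.1:6060", machine_id)))
```

The remaining main.py arguments (experiment name and so on) are unchanged and are omitted here.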

To run everything without relying on an installed manager such as Slurm, I wrote a couple of scripts (under the scripts folder).

Note that every machine logs only its local stats, except for 'approx_fps', which uses the global count across all machines.

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 8059db1484e5dcf2bbb3ee957581d6fc0ab80f9e into c9485aa39ec3eec89b74984a089119bf407c84d1 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging 53267c0a6fb0b1109cbdc7d044232a1aae20a88e into e3ba00faeb6114777c8c0b3da2fd6a752fcc1a55 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging eb761de37faeebfc3477336fff2a4672b4faedc1 into e3ba00faeb6114777c8c0b3da2fd6a752fcc1a55 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging 2b3c56b079beaa395b071e3ab81ec097d2574b28 into e3ba00faeb6114777c8c0b3da2fd6a752fcc1a55 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging ed51158714409d68bb5b57090d84a063642248eb into e3ba00faeb6114777c8c0b3da2fd6a752fcc1a55 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging 31c136247b3dd68296f7fb7c504f752a550a0d28 into e3ba00faeb6114777c8c0b3da2fd6a752fcc1a55 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 9e6b2b82692f2d40cd9fef72b98435a8ef2f927c into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging cf84f29394c6e763900d0c086a1bd7f2075387a7 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 4c498c33c8f37ff3545c126e90debc1250cbf70f into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging d7030bcdcaa7eb6b39746cbb9f44205a62d8b288 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 77990692c36769e4006d997459719b7774c89820 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 3a220a6eb65269e50c5aea627220f8d92dc9f19b into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging b3beec62c6bf2e3ace2f8c5d0152e5a156db09b9 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging d6b16239e5f9b47c35375c7bd6eb787dca97d459 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 78c9a45142fe4c3cb7141648246856c1f40e0e33 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging b3effe4e5203707d8807bf496387d021b2fb5fed into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging cee2c34706cdb5f96c8299be6f5c58468c8442cd into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 23839d915f35365e668d22f1b1ef6f1146de7818 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 96b6ccdeeb1221f0e58f18ba716d1fb769a767bf into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging e31de9964bab330a3ac4d933e2a1c2b5530cca57 into 0978361d9314a7499069da092f5442f3739ceb16 - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging c9d06830070ae99fae4ebbac6e7cb7a4181b1b73 into a26b3f074658f730f4de4cd946cd417114ae387b - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts and fixes 1 when merging 376a47f3f9114bd7fd09c591ff25020dd462de56 into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 3 alerts and fixes 1 when merging 15f234bb38936cc2ac71f2bcb7da820a07760bad into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 3 alerts and fixes 1 when merging e0be4f1bb26f740e2b6493415f941aadcfe879c1 into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 3 alerts and fixes 1 when merging 555df74194311302fcbb68797ce32e47bdaab464 into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 1 alert and fixes 1 when merging ca9751a5824a1bde79552a7d59a0fb83f8a4fcad into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

new alerts:

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging 4dc0550776d794ba74007a34cb4e81cf7c8e305e into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging 2b340864bce360424024020de1a121508ada0bce into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 3 years ago

This pull request fixes 1 alert when merging 21062cfb6c17d7f372afd66bccdd6e0d1caaebc2 into af0f2cccb23e6914823cb8cf644abd9f4a50fb2f - view on LGTM.com

fixed alerts: