For each architecture, there should be a set of files (the size of the set corresponds to the number of nodes) for each of the following cluster sizes:
1 node
2 nodes
5 nodes
10 nodes
There are no performance constraints on this script; it can take as long as it needs. It does, however, need to be able to process Deep1B on the machines available to us, which means it probably shouldn't load all vectors into memory at once.
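One way to meet the memory requirement is to stream the input in bounded batches and append each vector to its output file as it is read. The sketch below is only an illustration under several assumptions that the text above does not state: that the vectors are stored in the fvecs layout (each record is a little-endian int32 dimension followed by that many float32 values), that every vector has the same dimension, and that vectors are dealt out to nodes round-robin. The function names and the output file naming scheme are hypothetical, and the real partitioning rule will depend on the architecture.

```python
import struct
from pathlib import Path

NODE_COUNTS = (1, 2, 5, 10)  # the cluster sizes listed above


def iter_fvecs_records(path, batch=4096):
    """Yield raw fvecs records one at a time, reading at most `batch` records per read.

    Assumes every record has the same dimension as the first one.
    """
    with open(path, "rb") as f:
        header = f.read(4)
        if not header:
            return  # empty file, nothing to yield
        (dim,) = struct.unpack("<i", header)
        record_size = 4 + 4 * dim  # int32 header + dim float32 values
        f.seek(0)
        while True:
            chunk = f.read(record_size * batch)
            if not chunk:
                break
            if len(chunk) % record_size:
                raise ValueError("truncated fvecs file")
            for off in range(0, len(chunk), record_size):
                yield chunk[off:off + record_size]


def shard_round_robin(src, out_dir, nodes):
    """Write `nodes` output files, assigning vector i to file i % nodes.

    Round-robin assignment is an assumption made for illustration only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: nodes<N>_part<i>.fvecs
    paths = [out_dir / f"nodes{nodes}_part{i}.fvecs" for i in range(nodes)]
    outs = [open(p, "wb") for p in paths]
    try:
        for n, record in enumerate(iter_fvecs_records(src)):
            outs[n % nodes].write(record)
    finally:
        for o in outs:
            o.close()
    return paths
```

Calling `shard_round_robin(src, out_dir, n)` once for each `n` in `NODE_COUNTS` would produce the four sets of files. Peak memory is bounded by the batch size rather than by the size of the dataset, at the cost of re-reading the input once per cluster size.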