In a parameter server scenario, I think network delay is always an important issue. However, each BigMatrix push and pull requires a tuple of size 3. The row indices and column indices could be merged into a single index, which would reduce network consumption by about 1/3. The requirement that the row index be a Long is also a limitation, since a Long is not always needed.
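To illustrate the merged-index idea, here is a minimal sketch of row-major packing. It assumes the number of columns is known to both client and server. The names `encode_index` and `decode_index` are hypothetical and are not part of the BigMatrix API.

```python
from typing import Tuple


def encode_index(row: int, col: int, num_cols: int) -> int:
    """Pack a (row, col) pair into one row-major linear index (hypothetical helper)."""
    if row < 0 or not (0 <= col < num_cols):
        raise ValueError("row or col out of range")
    return row * num_cols + col


def decode_index(index: int, num_cols: int) -> Tuple[int, int]:
    """Recover (row, col) from a linear index produced by encode_index."""
    return divmod(index, num_cols)


# Example: three matrix entries in a matrix with 4 columns (made-up data).
rows = [0, 2, 5]
cols = [1, 3, 0]
num_cols = 4

# One index per entry is sent instead of a separate row and column index.
packed = [encode_index(r, c, num_cols) for r, c in zip(rows, cols)]
unpacked = [decode_index(i, num_cols) for i in packed]
```

The actual saving depends on the width of each field, so the 1/3 figure holds when the row index, column index, and value take the same number of bytes.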
I agree, there is a lot of room for optimizations here. I also think range selectors would be interesting where only start and end indices are provided.
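A rough sketch of what a range selector could look like, assuming a half-open range `[start, end)` over a flat list of values. The function `select_range` is hypothetical and only illustrates that a request shrinks to two numbers regardless of how many entries it covers.

```python
from typing import List


def select_range(values: List[float], start: int, end: int) -> List[float]:
    """Server side: return the values in the half-open range [start, end)."""
    if not (0 <= start <= end <= len(values)):
        raise ValueError("range out of bounds")
    return values[start:end]


# Made-up stored values for illustration.
stored = [0.5, 1.5, 2.5, 3.5, 4.5]

# Explicit request: one index per requested entry.
explicit_request = [1, 2, 3]
by_index = [stored[i] for i in explicit_request]

# Range request: only start and end are sent.
range_request = (1, 4)
by_range = select_range(stored, *range_request)
```

Both requests return the same values, but the range request stays at two integers however large the selection grows.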