radical-cybertools / radical.repex.at

This is the GitHub location for RepEx, developed by the RADICAL team in conjunction with the York Lab.

TUU redundant file transfer #44

Closed haoyuanchen closed 8 years ago

haoyuanchen commented 8 years ago

In the initial stage of TUU, when all the different starting coordinates are transferred to the remote cluster, there are some redundancies that lead to non-negligible overhead.

First, even if we specify "same_coordinates" = True, it still transfers the same file to the remote cluster N times (where N is the total number of replicas). In principle, it should transfer it just once.

Also, in TUU the total number of replicas is N(T) x N(U1) x N(U2). However, if "same_coordinates" = False, the number of distinct starting coordinates is N(U1) x N(U2), because two replicas that have the same U1 and U2 but different T use the same starting coordinates. In practice, it still performs N(T) x N(U1) x N(U2) transfers, which means each unique starting coordinate file is transferred N(T) times.
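To make the counting concrete, here is a minimal sketch (not RepEx code) that compares the number of transfers done per replica with the number of distinct files actually needed. The function name `count_transfers` and the file naming scheme are hypothetical and only serve the illustration.

```python
from itertools import product


def count_transfers(n_t, n_u1, n_u2, same_coordinates):
    """Return (per-replica transfers, distinct coordinate files)."""
    # One replica per (T, U1, U2) combination.
    replicas = list(product(range(n_t), range(n_u1), range(n_u2)))
    if same_coordinates:
        # Every replica starts from the same (hypothetical) file.
        files = ["shared.inpcrd" for _ in replicas]
    else:
        # The starting file depends on U1 and U2 only, not on T.
        files = ["u1_{}_u2_{}.inpcrd".format(u1, u2) for (_t, u1, u2) in replicas]
    return len(files), len(set(files))


naive, unique = count_transfers(n_t=4, n_u1=3, n_u2=2, same_coordinates=False)
print(naive, unique)  # 24 6
```

With 4 temperatures and 3 x 2 umbrella windows, transferring per replica moves 24 files while only 6 are distinct, so each file is sent N(T) = 4 times.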

If we could fix those two issues, the performance of TUU would improve. A quick solution would be to zip all the coordinate files in the "amber_coordinates_folder", transfer the archive to the remote cluster, and unzip it there.
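A rough sketch of the zip, transfer, and unzip idea, assuming Python 3 and using only the standard library. The helper names are hypothetical, and the actual remote transfer is replaced by a local `shutil.copy` as a stand-in, since how RepEx stages files is not shown here.

```python
import os
import shutil
import tempfile


def bundle_coordinates(coord_dir, work_dir):
    """Pack every file in coord_dir into one zip archive; return its path."""
    base = os.path.join(work_dir, "amber_coordinates")
    return shutil.make_archive(base, "zip", root_dir=coord_dir)


def unpack_coordinates(archive_path, target_dir):
    """Unpack the archive into target_dir and list the extracted files."""
    shutil.unpack_archive(archive_path, target_dir)
    return sorted(os.listdir(target_dir))


def demo():
    root = tempfile.mkdtemp()
    local = os.path.join(root, "amber_coordinates_folder")
    remote = os.path.join(root, "remote")  # stand-in for the remote cluster
    os.makedirs(local)
    os.makedirs(remote)
    for name in ("a.inpcrd", "b.inpcrd"):  # placeholder coordinate files
        with open(os.path.join(local, name), "w") as fh:
            fh.write("placeholder\n")
    archive = bundle_coordinates(local, root)
    staged = shutil.copy(archive, remote)  # a single transfer instead of N
    unpacked = unpack_coordinates(staged, os.path.join(remote, "coords"))
    shutil.rmtree(root)
    return unpacked


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the number of transfers no longer depends on the number of replicas: one archive goes over, and each replica then picks its starting file from the unpacked folder.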

antonst commented 8 years ago

> First, even if we specify "same_coordinates" = True, it still transfers the same file to the remote cluster N times (where N is the total number of replicas). In principle, it should transfer it just once.

Correct, but I remember fixing this issue already. I will check whether it is still present in the tuu-opt5 branch.

> Also, in TUU the total number of replicas is N(T) x N(U1) x N(U2). However, if "same_coordinates" = False, the number of distinct starting coordinates is N(U1) x N(U2), because two replicas that have the same U1 and U2 but different T use the same starting coordinates. In practice, it still performs N(T) x N(U1) x N(U2) transfers, which means each unique starting coordinate file is transferred N(T) times.

Good point. I will fix this.

> A quick solution would be to zip all the coordinate files in the "amber_coordinates_folder", transfer the archive to the remote cluster, and unzip it there.

Will do. Thanks for looking into this.

antonst commented 8 years ago

This is now fixed in devel. Can you please verify?

antonst commented 8 years ago

Is this issue still relevant?

haoyuanchen commented 8 years ago

This issue is not relevant now. Thanks!