Open yzx2337 opened 1 week ago
I am not 100% sure if I understand your question, but we use the following experiment runner: https://github.com/mjasny/distexprunner. The repo describes how to run experiments with it.
Thanks for your answer! Yes, I've seen the distexprunner example, but I'm still a little confused. Below is the code of /distexprunner/experiments/config.py in your repo. I'm not sure what its fields mean (e.g. the second column, 'c08.lab.dm.informatik.tu-darmstadt.de'), and I would appreciate an explanation.
server_list = ServerList(
# fill in
Server('node08', 'c08.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT, ibIp='172.18.94.80', sibIP='172.18.94.81', ssdPath="/dev/md0"),
Server('node07', 'c07.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT, ibIp='172.18.94.70', sibIP='172.18.94.71', ssdPath="/dev/md0"),
Server('node06', 'c06.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT, ibIp='172.18.94.60', sibIP='172.18.94.61', ssdPath="/dev/md0"),
Server('node04', 'c04.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT, ibIp='172.18.94.40', sibIP='172.18.94.41', ssdPath="/dev/md127"),
Server('node05', 'c05.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT, ibIp='172.18.94.50', sibIP='172.18.94.51', ssdPath="/dev/md0"),
Server('node02', 'c02.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT, ibIp='172.18.94.20', sibIP='172.18.94.21', ssdPath="/dev/md0"),
Server('node01', 'c01.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT, ibIp='172.18.94.10', sibIP='172.18.94.11', ssdPath="/dev/md0"),
)
My point is that it differs from /distexprunner/examples/config.py (see below).
from distexprunner import ServerList, Server
SERVER_PORT = 20000
server_list = ServerList(
Server('node01', '127.0.0.1', SERVER_PORT),
Server('node02', '127.0.0.1', SERVER_PORT),
Server('node03', '127.0.0.1', SERVER_PORT),
Server('node04', '127.0.0.1', SERVER_PORT),
Server('node05', '127.0.0.1', SERVER_PORT),
#Server('node0x', '192.168.94.2x', SERVER_PORT),
)
The Server class is extensible, which is what I used in my code to make more information available in the experiment scripts, e.g. the InfiniBand IP (ibIp), a second InfiniBand interface (sibIP), and the SSD path (ssdPath).
If I recall correctly, those are mostly legacy fields from other experiments, and you can ignore them, but it is better to double-check in the experiment scripts.
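To illustrate what "extensible" could mean here, below is a minimal sketch. It assumes the constructor takes an id, an address, and a port, and stores any extra keyword arguments as attributes. SketchServer is a hypothetical stand-in written for this example, not distexprunner's actual Server class, so please check the real definition in the repo. In both configs above, the second positional argument appears to be the address used to reach the node (a hostname in one, 127.0.0.1 in the other).

```python
class SketchServer:
    """Hypothetical stand-in for an extensible Server class (an assumption,
    not distexprunner's real implementation)."""

    def __init__(self, id, ip, port, **extra):
        self.id = id      # logical node name, e.g. 'node08'
        self.ip = ip      # hostname or IP address used to reach the node
        self.port = port  # port the server process listens on
        # Any additional keyword arguments become plain attributes,
        # so experiment scripts can read them as server.ibIp, etc.
        for name, value in extra.items():
            setattr(self, name, value)


SERVER_PORT = 20000

# Entry with extra fields, mirroring experiments/config.py.
node08 = SketchServer(
    'node08', 'c08.lab.dm.informatik.tu-darmstadt.de', SERVER_PORT,
    ibIp='172.18.94.80', sibIP='172.18.94.81', ssdPath='/dev/md0',
)

# Entry without extra fields, mirroring examples/config.py.
plain = SketchServer('node01', '127.0.0.1', SERVER_PORT)

print(node08.ibIp)                   # extra field is available as an attribute
print(getattr(plain, 'ibIp', None))  # missing on entries that did not set it
```

Under this assumption, an experiment script that reads a field such as ibIp only works with config entries that define it, which is why it is worth checking which fields the alignment script actually accesses.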
Thank you very much for open-sourcing this! I now want to run the alignment experiment with the code you provided, but the 'Server' class used in alignment.py seems to differ from the definition of the 'Server' class under the 'distexprunner' folder, and I could not find any other file that defines 'Server'. So, to run the alignment experiments, do I need to modify the implementation of the 'Server' class myself? I hope to hear from you soon. Thanks again for your paper and code!