maeri-project / mRNA

mRNA
https://synergy.ece.gatech.edu/tools/mrna

Distribution Network is not built correctly #5

Open francisco-munoz opened 5 years ago

francisco-munoz commented 5 years ago

Dear @zzy82518996 @hyoukjun @tushar-krishna ,

I have been reviewing the code in DSNetwork.cpp, specifically the part that builds the distribution network, and I think the tree being built is not correct. Please let me know if I am wrong or have misunderstood something. I know this is hard to follow in text, but I would really appreciate it if you could take a few minutes to look at it: if I am right, the tree construction in the code is wrong, and if I am wrong, I do not understand exactly how it works and would like to learn.

In the previous issue #4, @zzy82518996 told me that you use a distribution tree with the same number of DSwitches per level in order to support multicast and unicast communication efficiently. Assuming 4 multiplier switches (N=4) for simplicity, I think this is what you are trying to build (again, let me know if I am wrong, because this is just what I have worked out after studying your papers):

image

Note that each circle represents a DSwitch and that there are log2(N)+1 levels. Is this what you were trying to build?
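As a quick check of the level count above, here is a minimal sketch. It assumes N is a power of two and that "the same DSwitches per level" means N switches on every level. The function name `expected_ds_tree_shape` is hypothetical and does not come from the mRNA code.

```python
def expected_ds_tree_shape(num_ms: int) -> tuple[int, int]:
    """Return (levels, switches_per_level) for the tree described in the post.

    Assumption (not confirmed by the mRNA source): every level holds
    num_ms DSwitches and there are log2(num_ms) + 1 levels.
    """
    if num_ms < 2 or num_ms & (num_ms - 1):
        raise ValueError("num_ms must be a power of two, at least 2")
    levels = (num_ms.bit_length() - 1) + 1  # log2(num_ms) + 1
    return levels, num_ms


# Worked case from the post: N = 4 multiplier switches.
levels, per_level = expected_ds_tree_shape(4)
total_switches = levels * per_level
```

Under these assumptions, N=4 gives 3 levels of 4 DSwitches each, 12 in total.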

If so, there are mistakes in the code. I have carefully read the functions DSNetwork::DSNetwork(double bw, int pe_size) and void DSNetwork::setPhysicalConnection(int levelnum, int pe_size), and I believe this is what is actually being built:

image

I verified this by modifying the source code to print the indexes. For instance, this is the result for a tree with N=4 (levels and DSwitches are numbered from 0):

Level 0
Connected (0, 0) --> (1, 0) and (1, 2)
Connected (0, 1) --> (1, 1) and (1, 3)
Connected (0, 2) --> (1, 1) and (1, 2)
Connected (0, 3) --> (1, 0) and (1, 3)

(You can check that these connections match the ones in the figure.)
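The printed connections can also be checked mechanically. The sketch below copies the N=4 output from this post into a dictionary and counts fan-out and fan-in per switch. The helper names are hypothetical and are not part of DSNetwork.cpp.

```python
from collections import Counter

# Connections copied from the printed output in this post for N=4.
# Each node is a (level, index) pair.
CONNECTIONS = {
    (0, 0): [(1, 0), (1, 2)],
    (0, 1): [(1, 1), (1, 3)],
    (0, 2): [(1, 1), (1, 2)],
    (0, 3): [(1, 0), (1, 3)],
}


def fan_out(connections):
    """Number of children per source switch."""
    return {src: len(children) for src, children in connections.items()}


def fan_in(connections):
    """Number of parents per destination switch."""
    return Counter(
        child for children in connections.values() for child in children
    )


def sources_reaching_all(connections):
    """Source switches whose direct children include every destination switch."""
    all_children = {c for children in connections.values() for c in children}
    return [
        src
        for src, children in connections.items()
        if set(children) == all_children
    ]
```

With this data, every level-0 switch has two children, every level-1 switch has two parents, and `sources_reaching_all(CONNECTIONS)` is empty, so no single level-0 switch is wired to all four level-1 switches.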

I think this is what is going wrong (in case you want to build the first tree I showed):

Again, thank you very much for taking a few minutes to read this. I really appreciate it and I hope it helps.

Best regards, Francisco Muñoz email: francisco.munoz2@um.es

zzy82518996 commented 5 years ago

Hi Francisco:

I think you can use the -show_maeri=1 option to see the topology of DSNetwork. The attachment is the MAERI Network with 8 MSes. The green circles represent DS. Thanks.

francisco-munoz commented 5 years ago

Hello @zzy82518996

I have generated the architecture for 4 MSes, which is:

salida4-1

If you put the DS nodes in order, you get exactly the architecture I described previously.

image

I think this is wrong, since I don't see how to broadcast efficiently in such a network. As I said, isn't another level needed in the tree? Shouldn't it be an architecture like this instead?

image

Thank you very much, Francisco.

zzy82518996 commented 5 years ago

Yes, the concrete design of the DSNetwork can actually be considered an open research area. One thing to note is that we let each DS have only two fan-outs. You could use some high fan-out DSes to make the DSNetwork more efficient, but this is not the case in MAERI. The principle of the DSNetwork is to use tiny DSes, which perform at most 1-to-2 multicast, not 1-to-3 or more.

francisco-munoz commented 5 years ago

Dear @zzy82518996

Thank you very much for your answer. I know that the main principle is to use tiny switches. However, what I am not sure about is how broadcast traffic can be supported efficiently by the network as implemented. Could you help me understand this? What I see in that network (the figure I showed previously) is that at least two messages are needed to broadcast a piece of data. This does not match the MAERI paper.
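The "at least two messages" observation can be reproduced with a small brute-force search over the N=4 connections printed earlier in this thread. This is only a sketch under two assumptions that the thread does not confirm: level-1 switch i delivers to multiplier i, and a message injected at a level-0 switch reaches only that switch's two children. The name `min_injections` is hypothetical.

```python
from itertools import combinations

# Level-1 switches reachable from each level-0 switch (N=4), taken from the
# connection list printed earlier in this thread.
REACH = {
    0: {0, 2},
    1: {1, 3},
    2: {1, 2},
    3: {0, 3},
}


def min_injections(reach, targets):
    """Smallest set of injection points whose children cover all targets.

    Brute force over subsets, which is fine for the tiny N used here.
    Returns (count, chosen_level0_switches).
    """
    targets = set(targets)
    sources = sorted(reach)
    for count in range(1, len(sources) + 1):
        for chosen in combinations(sources, count):
            covered = set().union(*(reach[s] for s in chosen))
            if targets <= covered:
                return count, chosen
    raise ValueError("targets cannot be covered by any set of sources")


count, chosen = min_injections(REACH, {0, 1, 2, 3})
```

Under these assumptions the search finds no single injection point that covers all four outputs; two are needed, for example level-0 switches 0 and 1.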

Yours sincerely, Francisco.

tushar-krishna commented 5 years ago

Hi Francisco, Thanks for your interest in MAERI and for raising an interesting set of questions. In MAERI, we try to use a Fat-Tree (https://en.wikipedia.org/wiki/Fat_tree), which is basically equivalent to a Benes network (https://www.slideshare.net/imsf/benes-6265840) (that's non-blocking*).

While Benes provides non-blocking unicast traffic, it is true that it may not be the most efficient for broadcast traffic; this is part of our ongoing research.

I’m curious why you believe that the tree you have drawn is more efficient for broadcasts? Can you send an example?

Do note though, that the tree you have drawn has switches with one and three output ports, while our design in the mRNA codebase has two output ports everywhere for uniformity.

Also - I wanted to point out that the topology in the NOCS paper is not what we use in the MAERI design.

Thanks, Tushar

*Things become a little tricky since we have chubby links (that could be 2x or 1x), so it may not be a full Benes network.

francisco-munoz commented 5 years ago

Hi @tushar-krishna,

Thank you very much for your answer. I am trying to code a cycle-accurate simulator of MAERI and that's why I want to know exactly how it works.

I believed that because, even though a Benes network provides non-blocking functionality for unicasts, as explained here https://www.slideshare.net/imsf/benes-6265840, it looks like you have to replicate the message if you want to broadcast or multicast some data. Don't you? I believe the Benes network is perfect when you need fully non-blocking communication and the sending source is not known a priori (more or less random), so you have to provide communication across all paths. However, in DNNs you know exactly what the source and the destination are at every moment. Why not set up a simple pair-to-pair path for unicasts and just use the suitable input at each moment? I am not an expert on NoCs, and I would really appreciate it if you could tell me whether I am right or give me some additional references to study.

I think that in the network I described, you would only need to send one message through the second node of level 0, and it would then reach all the outputs by following the tree (links in red). For unicast messages, you would just need to send a message pair-to-pair. There are probably more low-level details I am not taking into account.

Anyway, I think the use of the fat-tree in MAERI makes sense and might provide non-blocking functionality without replicating messages. I just misunderstood it when I read the code of mRNA.

Best regards, Francisco.

tushar-krishna commented 5 years ago

Hi Francisco, There can be multiple possible implementations of MAERI's distribution tree. The "Fat-tree" view of the topology sends one message that gets replicated within the tree. You are right that the "Benes" view of the Fat-tree (which effectively unrolls an N-wide fat switch into N smaller switches) may be equivalent for unicast, but for multicasts it may require replicating data across multiple ports during injection, which is not what we want.
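To illustrate the "one message replicated within the tree" behaviour, here is a minimal sketch. It models a plain binary tree with single-width links, which is an assumption made for illustration: it ignores the chubby links and is not the mRNA implementation. The function name `multicast_links` is hypothetical.

```python
def multicast_links(num_leaves, dests):
    """Links used when ONE message enters at the root and is copied downward.

    Nodes are (level, index) pairs: the root is (0, 0) and leaf i sits at
    (log2(num_leaves), i). Each switch forwards to a child only if that
    child's subtree contains a destination, so fan-out is at most two.
    """
    if num_leaves < 2 or num_leaves & (num_leaves - 1):
        raise ValueError("num_leaves must be a power of two, at least 2")
    dests = set(dests)
    links = []

    def replicate(level, index, span):
        if span == 1:  # reached a leaf, nothing left to forward
            return
        half = span // 2
        for child in (2 * index, 2 * index + 1):
            first_leaf = child * half  # leftmost leaf under this child
            if any(first_leaf <= d < first_leaf + half for d in dests):
                links.append(((level, index), (level + 1, child)))
                replicate(level + 1, child, half)

    replicate(0, 0, num_leaves)
    return links


broadcast = multicast_links(4, range(4))  # all four leaves
multicast = multicast_links(4, {0, 3})    # two leaves in different halves
```

In this model a broadcast to four leaves needs a single injection at the root and uses six links, with every copy made inside the tree rather than at the injection point.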

I will be very interested in checking out your cycle-accurate simulator once it's ready.

Thanks, Tushar
