RENCI-NRIG / orca5

ORCA5 Software
Eclipse Public License 1.0

Modify add core network (inter-domain) breaks manifest #74

Closed ibaldin closed 8 years ago

ibaldin commented 8 years ago

This works fine in intra-domain case, but breaks in inter-domain case.

Start with a single node (bound). As a second step, add three more nodes with two broadcast links in between, each node bound to a different site. This should result in 3 nodes connected with 2 inter-domain links. The end manifest is broken, and there is an exception in Flukes:

screen shot 2016-09-28 at 1 56 45 pm

The manifest (controller-modify-manifest.txt) ends up looking broken:

screen shot 2016-09-28 at 1 56 53 pm

@anriban

ibaldin commented 8 years ago

Possibly related to #71

ibaldin commented 8 years ago

This does not happen if I use a point-to-point link between the nodes being added. However, in this case the result should be the same, as the 'broadcast link' connects only two sites at a time.

ibaldin commented 8 years ago

Apparently this is not a modify issue; it happens with new requests as well. It is an artifact of dealing with two-domain broadcast links.

mcevik0 commented 8 years ago

I tested this as below and it worked OK, but I wanted to verify whether this is the correct case.

screen shot 2016-10-04 at 16 42 59

YufengXin commented 8 years ago

Hi, Mert,

Thanks for the testing. This test case is different, as shown in my attached screenshots:

There will be two broadcast links connecting three nodes in three different sites. Because each broadcast link has only two nodes, it is expected to actually form two P2P paths, without going through the NLR site.

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org

On Oct 4, 2016, at 4:44 PM, mcevik0 notifications@github.com wrote:

Create a node on BBN. After the VM is active, create two nodes on TAMU and PSC. Create a broadcast link, and connect all three nodes to the broadcast link. Submit modify request. The slice is modified as below, all components active, traffic can be exchanged.

https://cloud.githubusercontent.com/assets/18286650/19091689/a39ad8ac-8a51-11e6-9c69-c96927a2c84c.png

mcevik0 commented 8 years ago

Test Slice:

screen shot 2016-10-04 at 20 48 02

YufengXin commented 8 years ago

Thanks, Mert,

Somehow I could not duplicate this exception in my emulator. We can let Anirban use and test it for now.

I’ll discuss this with you on Thursday.

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org

On Oct 4, 2016, at 8:49 PM, mcevik0 notifications@github.com wrote:

Create node1 on PSC. After node1 is active, create node2 on BBN, node3 on UH. Connect node1 and node2 with broadcast link1, connect node1 and node3 with broadcast link2. Submit modify request. Exception received as below.

https://cloud.githubusercontent.com/assets/18286650/19097601/ffbeca68-8a73-11e6-8907-fe455656deb6.png

mcevik0 commented 8 years ago

Hello Yufeng, Anirban,

I added the information you sent over email to the issue below:

Message 1 from Anirban:

From: Anirban Mandal anirban@renci.org
Subject: Re: [RENCI-NRIG/orca5] Modify add core network (inter-domain) breaks manifest (#74)
Date: October 5, 2016 at 09:57:30 EDT
To: Yufeng Xin yxin@renci.org
Cc: Mert Cevik mcevik@renci.org, Ilya Baldin ibaldin@renci.org

Yufeng and Mert,

I tested my slice this morning. However, there seems to be a problem with the VMs coming up when they are connected inter-domain, i.e. those attached to inter-domain broadcast links. There is no problem with the image. The other VMs attached to the intra-domain broadcast links come up fine. The error is the following. I am attaching the Flukes snapshot of the manifest.

    INFO | jvm 1 | 2016/10/05 12:37:16 | join:
    INFO | jvm 1 | 2016/10/05 12:37:16 | [echo] EC2 HANDLER: JOIN on 10/05/2016 12:37:16 UTC
    INFO | jvm 1 | 2016/10/05 12:37:16 | [echo] Cloud Type: nova-essex
    INFO | jvm 1 | 2016/10/05 12:37:16 | [echo] Controller did not pass image proxy properties ${config.image.url} or ${config.image.guid}, using default emi ami-00000009
    INFO | jvm 1 | 2016/10/05 12:37:16 | [echo] Error encountered by image proxy, exiting ...
    INFO | jvm 1 | 2016/10/05 12:37:16 | [echo] join exit code: 1

Regards,

  • Anirban

screen shot 2016-10-05 at 12 51 11

Message 2 from Anirban:

From: Anirban Mandal anirban@renci.org
Subject: Re: [RENCI-NRIG/orca5] Modify add core network (inter-domain) breaks manifest (#74)
Date: October 5, 2016 at 11:14:50 EDT
To: Mert Cevik mcevik@renci.org
Cc: Yufeng Xin yxin@renci.org, Ilya Baldin ibaldin@renci.org

Mert, there is no problem with the image, since that image worked on the same sites last week. It is because the image proxy properties (the imageproxy URL, GUID, etc.) are not passed by the controller/SM to the AM for the VMs that sit between two different domains, so the AM doesn't know how to contact the image proxy service to get the image AMI. Those properties are passed fine for the other VMs in the slice, which were hanging off the respective intra-domain broadcast links.
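
A minimal sketch of the failure mode described here, with the property names (config.image.url, config.image.guid) and the default EMI taken from the join log above; the class, method, and image proxy call are hypothetical and not ORCA's actual handler code:

    // Hypothetical sketch only: illustrates the branch the inter-domain VMs hit
    // when the controller/SM does not hand the image proxy properties to the AM.
    // Property names are taken from the join log; the plumbing is assumed.
    import java.util.Properties;

    public class ImageProxyCheck {
        static final String DEFAULT_EMI = "ami-00000009"; // default EMI seen in the log

        public static String resolveImage(Properties unitProps) {
            String url = unitProps.getProperty("config.image.url");
            String guid = unitProps.getProperty("config.image.guid");
            if (url == null || guid == null) {
                // Inter-domain VMs end up here: no image proxy properties were
                // passed, so the handler falls back to the default EMI and the
                // image proxy step subsequently fails.
                return DEFAULT_EMI;
            }
            // Otherwise the image proxy service would be asked to resolve the
            // (url, guid) pair into a site-local EMI/AMI.
            return lookupViaImageProxy(url, guid);
        }

        private static String lookupViaImageProxy(String url, String guid) {
            // Placeholder for the real image proxy call.
            return "emi-for-" + guid;
        }
    }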

These are the image url and hash for the VMs that showed this error.

URL = http://geni-images.renci.org/images/pruth/SDN/Centos6.7-SDN.v0.1/Centos6.7-SDN.v0.1.xml ImageHash = 77ec2959ff3333f7f7e89be9ad4320c600aa6d77

There is not a single request file because the slice is built up programmatically with a sequence of modify actions. Thanks.

Regards,

  • Anirban

Message 3 from Anirban:

From: Anirban Mandal anirban@renci.org
Subject: Re: [RENCI-NRIG/orca5] Modify add core network (inter-domain) breaks manifest (#74)
Date: October 5, 2016 at 11:44:56 EDT
To: Yufeng Xin yxin@renci.org
Cc: Mert Cevik mcevik@renci.org, Ilya Baldin ibaldin@renci.org

The sequence is the following.

  1. One single node on PSC; submit request
  2. Add 3 VMs at three different sites (not connected to the first VM in any way) (PSC, WVN, UFL). With the PSC VM as root, create two 2-point inter-domain broadcast links to the two other VMs, i.e. connecting PSC-WVN and PSC-UFL each with a broadcast link. Connect an additional broadcast link to each of the three VMs. Submit modify request.

You will see the problem at this point already. In the original test, I also added two VMs to each of the three intra-domain broadcast links before submitting the request in step 2.

Regards,

  • Anirban

Message from Yufeng:

From: Yufeng Xin yxin@renci.org
Subject: Re: [RENCI-NRIG/orca5] Modify add core network (inter-domain) breaks manifest (#74)
Date: October 5, 2016 at 11:57:11 EDT
To: Anirban Mandal anirban@renci.org
Cc: Mert Cevik mcevik@renci.org, Ilya Baldin ibaldin@renci.org

thanks.

I'll refer to the slice created using broadcast links this way as FCS (Fxxx crooked slice) :-)

Looks like I'll have to work on it over the weekend. For the time being, can you add the last broadcast link in a separate modify step?

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org

paul-ruth commented 8 years ago

This needs to be re-opened and tested before it is closed. This fix broke some basic functionality that wasn't tested before pushing to production.

A basic inter-domain dumbbell using a bcast network does not work anymore. For some reason the VMs no longer have the information about their images.

ibaldin commented 8 years ago

Mert can you roll back the controller on ExoSM to the previous version?

mcevik0 commented 8 years ago
YufengXin commented 8 years ago

added the image info in commit #2087c9a843ac60f57f099b5edf1663c2cfde002a

Hi, Mert,

Please rebuild and statefully restart the controller when you get a chance. Thanks.

mcevik0 commented 8 years ago
screen shot 2016-10-07 at 18 29 59
mcevik0 commented 8 years ago

I am reopening the issue.

mcevik0 commented 8 years ago

Hello Yufeng, I think either the previous version of the ORCA RPMs needs to be deployed or the issue has to be fixed. I wanted to ask what I should do at this point. I tested by creating a P2P slice first, which succeeded, then an MP slice, which returned the manifest above.

YufengXin commented 8 years ago

Thanks,

Leave it as is for now; I will look into it tonight.


ibaldin commented 8 years ago

Mert, after Yufeng diagnoses the issue we need to go back to the stable version unless he thinks the fix is quick. I don't want to leave the testbed in this state.

YufengXin commented 8 years ago

Mert, you can roll it back if users scream and you have time. I'll look into it more today in the emulator when I get a chance.

This appears to be a manifest display issue; the slice works. Actually, if this MP slice is created with node groups (even 1 node in each group) instead of single nodes, it works fine.

Ilya, when you get a chance, please take a look at the two manifest RDF files attached here: mp-manifest-old.rdf is the one showing correctly; mp-manifest.rdf shows a broken 3rd branch. The difference I can tell now is that in the incorrect one, the node (like Node0) and interface names are the same as in the request; in the correct one, a new node resource is generated with its name appended with a "/number", like "Node0/1".

As a note, the recent rounds of problems, though very minor in nature, were caused by two things: (1) the back and forth of making a single node behave like a node group, i.e., able to add/delete or not; (2) the back and forth of converting 2-branch MP to P2P, which apparently left something not cleaned up completely, i.e., a couple of string names for the new entities generated by the code.

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org


ibaldin commented 8 years ago

I'm driving most of the day today. I'll do my best.

YufengXin commented 8 years ago

Don't worry about it; it is not a critical problem at all. We'll discuss it on Monday if I don't fix it.

Safe driving!

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org


mcevik0 commented 8 years ago

Hello Ilya and Yufeng, I just rolled back to the previous version:

ibaldin commented 8 years ago

@YufengXin I don't think email attachments automatically get attached to the issue responses. Can you attach the manifests as .txt files to the issue, not in an email? (It won't take other extensions.)

YufengXin commented 8 years ago

I ran into three manifest scenarios in the emulator when creating the same inter-rack MP with three branches: the right display, one branch not connected, and all three branches not connected. The three manifest RDF files are attached here. Thanks.

mp-manifest-right.txt mp-manifest.txt mp-manifest-wrong.txt

ibaldin commented 8 years ago

The problem may be in my path parsing code. I'm investigating.

ibaldin commented 8 years ago

It appears that the VLAN0-NodeX interface is attached to all xxNet domain VLANs, and that messes with the pathfinding code. Here is an excerpt from mp-manifest-right.txt above:

BBN (Node0):

  <rdf:Description rdf:about="http://geni-orca.renci.org/owl/bbnNet.rdf#bbnNet/Domain/vlan/144566ed-8bc4-4cec-b076-c2e1c9a0d17f/vlan">
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/topology.owl#CrossConnect"/>
    <j.8:inDomain rdf:resource="http://geni-orca.renci.org/owl/bbnNet.rdf#bbnNet/Domain/vlan"/>
    <j.8:message>Reservation f0bdf35b-4bee-4a51-9842-77f9b81cbfb4 (Slice t1) is in state [Active,None]
</j.8:message>
    <j.8:hasReservationState rdf:resource="http://geni-orca.renci.org/owl/request.owl#Active"/>
    <j.13:hasInterface rdf:resource="http://geni-orca.renci.org/owl/4a21d7a8-7cf3-4eca-94da-0c3cfe5e1240#VLAN0-Node0"/>
    <j.13:hasInterface rdf:resource="http://geni-orca.renci.org/owl/bbnNet.rdf#BbnNet/IBM/G8052/TenGigabitEthernet/1/1/ethernet"/>
    <j.13:hasInterface rdf:resource="http://geni-orca.renci.org/owl/bbnNet.rdf#BbnNet/IBM/G8052/GigabitEthernet/1/0/ethernet"/>
    <j.6:bandwidth rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">10000000</j.6:bandwidth>
    <j.10:hasResourceType rdf:resource="http://geni-orca.renci.org/owl/domain.owl#VLAN"/>
    <j.13:hasURL>http://geni-orca.renci.org/owl/bbnNet.rdf#bbnNet/Domain/vlan/144566ed-8bc4-4cec-b076-c2e1c9a0d17f/vlan</j.13:hasURL>
    <j.1:inRequestNetworkConnection rdf:resource="http://geni-orca.renci.org/owl/4a21d7a8-7cf3-4eca-94da-0c3cfe5e1240#VLAN0"/>
    <j.13:inConnection rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</j.13:inConnection>
    <rdfs:label>2601</rdfs:label>
  </rdf:Description>

FIU (Node1):

  <rdf:Description rdf:about="http://geni-orca.renci.org/owl/4a21d7a8-7cf3-4eca-94da-0c3cfe5e1240#Node1">
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/compute.owl#ComputeElement"/>
    <j.8:inDomain rdf:resource="http://geni-orca.renci.org/owl/fiuvmsite.rdf#fiuvmsite/Domain/vm"/>
    <j.8:message>Reservation 072c3fe7-8f1f-4e60-ac0d-bab99bb72ec1 (Slice t1) is in state [Active,None]
</j.8:message>
    <j.8:hasReservationState rdf:resource="http://geni-orca.renci.org/owl/request.owl#Active"/>
    <j.13:hasInterface rdf:resource="http://geni-orca.renci.org/owl/4a21d7a8-7cf3-4eca-94da-0c3cfe5e1240#VLAN0-Node1"/>
    <j.13:hasGUID>2f897072-91ee-4b65-91d5-a18a0dab2ea2</j.13:hasGUID>
    <j.1:diskImage rdf:resource="http://geni-orca.renci.org/owl/4a21d7a8-7cf3-4eca-94da-0c3cfe5e1240#Centos+6.7+v1.0.0"/>
    <j.1:specificCE rdf:resource="http://geni-orca.renci.org/owl/exogeni.owl#XOMedium"/>
    <j.10:hasResourceType rdf:resource="http://geni-orca.renci.org/owl/compute.owl#VM"/>
    <j.13:hasURL>http://geni-orca.renci.org/owl/fiuvmsite.rdf#fiuvmsite/Domain/vm</j.13:hasURL>
    <j.4:hasParent rdf:resource="http://geni-orca.renci.org/owl/fiuNet.rdf#fiuNet/Domain/vlan/Node1"/>
    <j.10:hasService rdf:resource="http://geni-orca.renci.org/owl/4a21d7a8-7cf3-4eca-94da-0c3cfe5e1240#Node1/Service"/>
  </rdf:Description>

It appears the issue is random because of the order in which the NDL is parsed. If my code picks up the VLAN0-NodeX interface as the outgoing interface from the Net crossconnect, the pathfinding stops and the full path is not displayed.
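
To make the non-determinism concrete, a toy Jena sketch (not the actual Flukes parser; modern Jena package names, and hasInterface assumed to live in topology.owl, as the j.13 prefix above suggests) that walks the interfaces of the BBN crossconnect from the manifest; the iteration order of listProperties() is unspecified, so a walker that takes the first interface it sees may pick VLAN0-Node0 instead of the trunk interface and stop there:

    // Toy illustration (not Flukes' code) of the non-deterministic interface order.
    import org.apache.jena.rdf.model.*;
    import org.apache.jena.riot.Lang;
    import org.apache.jena.riot.RDFDataMgr;

    public class CrossConnectWalk {
        // Assumption: hasInterface is defined in topology.owl.
        static final String TOPO = "http://geni-orca.renci.org/owl/topology.owl#";

        public static void main(String[] args) {
            Model m = RDFDataMgr.loadModel("mp-manifest-right.txt", Lang.RDFXML); // attached above
            Property hasInterface = m.createProperty(TOPO, "hasInterface");
            Resource xconn = m.createResource(
                    "http://geni-orca.renci.org/owl/bbnNet.rdf#bbnNet/Domain/vlan/"
                    + "144566ed-8bc4-4cec-b076-c2e1c9a0d17f/vlan");

            // The order of these statements is unspecified: sometimes the
            // VLAN0-Node0 client interface comes first, sometimes the G8052
            // trunk interface does, which is why the displayed path varies.
            StmtIterator it = xconn.listProperties(hasInterface);
            while (it.hasNext()) {
                System.out.println(it.next().getObject());
            }
        }
    }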

YufengXin commented 8 years ago

Simply creating a new interface with a different URL won't work, because the node(s) would still carry the original interfaces from the request. So a new node resource with a different URL has to be created.

If Flukes links a node and its upstream xNet VLAN first, is it possible and simple for you to mark that interface as already used, so that it won't be used by the broadcast link node?

I am looking into a safer way to create the node(s).

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org

ibaldin commented 8 years ago

This is what you want me to deal with. Just documenting it (the diagram doesn't show link connection objects):

slide1

ibaldin commented 8 years ago

To answer the question: linking the node to the site net is a completely separate process from finding a path from the site net to the VLAN0 root. In fact, the pathfinding code doesn't even see the Node (above).

One possibility is to filter out the orange interface (usually named VLAN0-Node) by the presence of a MAC address statement on it. All other interfaces between the site net, ION, VLAN0 and potentially BEN wouldn't have that, at least for now.

Still looking at RDF model.
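
A rough sketch of the MAC-address filter idea, assuming Jena; the MAC property URI below is a placeholder, not necessarily the property name the ORCA schemas actually use, and the class and method names are made up for illustration:

    // Sketch of "drop interfaces that carry a MAC address before pathfinding".
    // MAC_PROP is a placeholder URI; substitute the real ORCA/NDL property.
    import org.apache.jena.rdf.model.*;
    import java.util.ArrayList;
    import java.util.List;

    public class InterfaceFilter {
        static final String TOPO = "http://geni-orca.renci.org/owl/topology.owl#";
        static final String MAC_PROP = "http://geni-orca.renci.org/owl/layer.owl#macAddress"; // placeholder

        /** Interfaces of a crossconnect that do NOT carry a MAC statement,
         *  i.e. everything except the VLAN0-NodeX client interfaces. */
        public static List<Resource> trunkInterfaces(Model m, Resource crossConnect) {
            Property hasInterface = m.createProperty(TOPO, "hasInterface");
            Property mac = m.createProperty(MAC_PROP);
            List<Resource> kept = new ArrayList<>();
            StmtIterator it = crossConnect.listProperties(hasInterface);
            while (it.hasNext()) {
                Resource intf = it.next().getObject().asResource();
                if (!intf.hasProperty(mac)) {
                    kept.add(intf); // no MAC statement: treat as a transit/trunk interface
                }
            }
            return kept;
        }
    }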

ibaldin commented 8 years ago

The other alternative is to change the code to look for multiple paths between site net and VLAN0 and pick the longest.
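
For comparison, a rough sketch of the "enumerate paths and keep the longest" alternative over a plain adjacency map; this is purely illustrative and not tied to the NDL model or the Flukes code:

    // Illustrative only: depth-first enumeration of simple paths between two
    // elements, keeping the longest one found (e.g. site net -> ... -> VLAN0).
    import java.util.*;

    public class LongestPath {
        public static List<String> longest(Map<String, List<String>> adj, String from, String to) {
            List<String> best = new ArrayList<>();
            dfs(adj, from, to, new ArrayDeque<>(), new HashSet<>(), best);
            return best;
        }

        private static void dfs(Map<String, List<String>> adj, String cur, String to,
                                Deque<String> path, Set<String> seen, List<String> best) {
            path.addLast(cur);
            seen.add(cur);
            if (cur.equals(to) && path.size() > best.size()) {
                best.clear();
                best.addAll(path); // remember the longest path seen so far
            } else if (!cur.equals(to)) {
                for (String next : adj.getOrDefault(cur, List.of())) {
                    if (!seen.contains(next)) {
                        dfs(adj, next, to, path, seen, best);
                    }
                }
            }
            seen.remove(cur);
            path.removeLast();
        }
    }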

YufengXin commented 8 years ago

Sounds too complicated and too much work. Let me see what I can do then.

Thanks.

-Yufeng

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org

On Oct 11, 2016, at 9:59 AM, Ilya Baldin notifications@github.com wrote:

The other alternative is to change the code to look for multiple paths between site net and VLAN0 and pick the longest.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/RENCI-NRIG/orca5/issues/74#issuecomment-252924946, or mute the thread https://github.com/notifications/unsubscribe-auth/AHPA5i0IAgujArHDqp53-IWlb-G1B17Gks5qy5ZWgaJpZM4KJH4U.

ibaldin commented 8 years ago

The problem with the stitchport is that now the VLAN0 element has a hasInterface statement pointing to the interface belonging to the link connection that leads to the stitching domain (bypassing the BEN domain). The interface is

http://geni-orca.renci.org/owl/ben-6509.rdf#Renci/Cisco/6509/GigabitEthernet/1/4/ethernet/Hydroshare/1496

The VLAN0 definition (partial):

  <rdf:Description rdf:about="http://geni-orca.renci.org/owl/71bd1e87-6b90-439d-b774-49fa400c5ea8#VLAN0">
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/topology.owl#BroadcastConnection"/>
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/topology.owl#NetworkConnection"/>
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/topology.owl#Device"/>
    <j.8:inDomain rdf:resource="http://geni-orca.renci.org/owl/nlr.rdf#nlr/Domain/vlan"/>
    <j.8:message>Reservation ba53c576-8619-48c1-bb14-911c9bf11de3 (Slice t1) is in state [Active,None]
</j.8:message>
    <j.8:hasReservationState rdf:resource="http://geni-orca.renci.org/owl/request.owl#Active"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/71bd1e87-6b90-439d-b774-49fa400c5ea8#VLAN0-Node0"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/71bd1e87-6b90-439d-b774-49fa400c5ea8#VLAN0-Node2"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/71bd1e87-6b90-439d-b774-49fa400c5ea8#VLAN0-Node1"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/ben-6509.rdf#Renci/Cisco/6509/GigabitEthernet/1/4/ethernet/Hydroshare/1496"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/nlr.rdf#NLR/DD/Juniper/QFX3500/xe/0/0/1/ethernet"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/nlr.rdf#NLR/DD/Juniper/QFX3500/xe/0/0/0/fiber"/>

Then there is a link connection that has that same interface:

  <rdf:Description rdf:about="http://geni-orca.renci.org/owl/ben-6509.rdf#Renci/Cisco/6509/GigabitEthernet/1/4/ethernet/Hydroshare/1496-5e89ad01-0769-4ce9-a582-fc00b87eb23f/Stitching/Domain/intf/01b4e524-c0bc-4f5e-97c5-3190cf23c39e">
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/topology.owl#LinkConnection"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/ben-6509.rdf#Renci/Cisco/6509/GigabitEthernet/1/4/ethernet/Hydroshare/1496"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/orca.rdf#5e89ad01-0769-4ce9-a582-fc00b87eb23f/Stitching/Domain/intf"/>
  </rdf:Description>

Then the stitching domain itself:

  <rdf:Description rdf:about="http://geni-orca.renci.org/owl/orca.rdf#5e89ad01-0769-4ce9-a582-fc00b87eb23f/Stitching/Domain">
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/topology.owl#Device"/>
    <j.8:inDomain rdf:resource="http://geni-orca.renci.org/owl/orca.rdf#5e89ad01-0769-4ce9-a582-fc00b87eb23f/Stitching/Domain"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/orca.rdf#5e89ad01-0769-4ce9-a582-fc00b87eb23f/Stitching/Domain/intf"/>
  </rdf:Description>

The BEN domain also has the same interface, but because of the random order in which interfaces are picked, it may sometimes be skipped and left isolated in the manifest visualization:

  <rdf:Description rdf:about="http://geni-orca.renci.org/owl/ben.rdf#ben/Domain/vlan/786b8fdf-ef6c-4a5d-90be-3d0044045c2a/vlan">
    <rdf:type rdf:resource="http://geni-orca.renci.org/owl/topology.owl#CrossConnect"/>
    <j.8:inDomain rdf:resource="http://geni-orca.renci.org/owl/ben.rdf#ben/Domain/vlan"/>
    <j.8:message>Reservation 0e774a5a-19ef-458e-a217-7f7c70dd7c50 (Slice t1) is in state [Active,None]
</j.8:message>
    <j.8:hasReservationState rdf:resource="http://geni-orca.renci.org/owl/request.owl#Active"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/ben-6509.rdf#Renci/Cisco/6509/GigabitEthernet/1/4/ethernet/Hydroshare/1496"/>
    <j.14:hasInterface rdf:resource="http://geni-orca.renci.org/owl/ben-6509.rdf#Renci/Cisco/6509/TenGigabitEthernet/3/2/fiber"/>
    <j.6:bandwidth rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">10000000</j.6:bandwidth>
    <j.11:hasResourceType rdf:resource="http://geni-orca.renci.org/owl/domain.owl#VLAN"/>
    <j.14:hasURL>http://geni-orca.renci.org/owl/ben.rdf#ben/Domain/vlan/786b8fdf-ef6c-4a5d-90be-3d0044045c2a/vlan</j.14:hasURL>
    <j.1:inRequestNetworkConnection rdf:resource="http://geni-orca.renci.org/owl/71bd1e87-6b90-439d-b774-49fa400c5ea8#VLAN0"/>
    <j.14:inConnection rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</j.14:inConnection>
    <rdfs:label>3800</rdfs:label>
  </rdf:Description>
ibaldin commented 8 years ago

slide1

ibaldin commented 8 years ago

I can see if I can isolate this interface by virtue of it being on a stitchport.

YufengXin commented 8 years ago

Same reason, it is the stitching interface from the request, which is used in the manifest.


ibaldin commented 8 years ago

Actually the picture is more like the one below, and a simple test for attachment to a stitchport breaks pathfinding for the stitchport completely, because the code doesn't know how to exit from the stitchport (the only outgoing interface is ignored).

slide1

ibaldin commented 8 years ago

OK, I have a version that properly displays the manifests you've given me so far.

ibaldin commented 8 years ago

I can't guarantee this won't create some artifacts for RSpec conversion though - it uses a different method of linking things together and may produce incorrect RSpec manifests. I'd need to try it.

ibaldin commented 8 years ago

The NDL conversion doesn't really work for multipoint connections, but I don't recall if it ever has (Paul @paul-ruth can you remember?). Point to point seems fine. No exception is thrown in any of the cases, just the RSpec doesn't look complete.

ibaldin commented 8 years ago

Mert will update the RPMs on ExoSM and RCI. I've uploaded a new version of flukes to

http://geni-images.renci.org/webstart/0.7-SNAPSHOT/flukes.jnlp

Note that this is not the stock version; you need to wget/curl this jnlp file to get the properly functioning Flukes. @anriban

paul-ruth commented 8 years ago

I think it used to work. I think I remember showing Niky. I don’t really use GENI/rspec that often so I don’t really know for sure.

Paul


ibaldin commented 8 years ago

Well, I will have to look at it then. I need to figure out how RSpec supports multi-point connections - it didn't used to, but maybe my memory is failing. It's been at least a year since I've looked at the NDL-RSpec converter at any depth.

mcevik0 commented 8 years ago

  • PostBootScripts are not processed when slices with inter-domain (P2P) broadcast links are created. (This actually works when inter-domain slices without broadcast links are created.)
  • PostBootScript information exists in the manifests returned; however, it is not submitted into the VM.
anriban commented 8 years ago

I had the same issue when I tried to create the SC demo slice. The VMs connecting two domains with a broadcast link were missing their postboot scripts, i.e. neuca-user-data was showing the [scripts] section empty. This prevents configuration of the SDX switches, and hence the edge VMs can't communicate with edge VMs in other domains. The other VMs that are hanging off intra-domain broadcast links get postboot scripts.

Regards,


YufengXin commented 8 years ago

I forgot to add it to the conversion, please give it a try now.

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org


mcevik0 commented 8 years ago

Redeployed ORCA on exo-sm, tested, postbootscripts are processed.

mcevik0 commented 8 years ago

Some problems exist with MP slices. I created three MP slices connecting three racks with a broadcast link. Two of them failed. Attached below is one of the manifests returned for a failure. I'm not sure what caused the failure; I'll check and try to find out the condition.

screen shot 2016-10-13 at 14 27 34
ibaldin commented 8 years ago

Are you using a new version of Flukes or stock?

ibaldin commented 8 years ago

You should be using this

http://geni-images.renci.org/webstart/0.7-SNAPSHOT/flukes.jnlp

mcevik0 commented 8 years ago

I switched to the new version. Sorry, I forgot. I've been creating slices and have not had any problems.

ibaldin commented 8 years ago

Can we close this?