RENCI-NRIG / orca5

ORCA5 Software
Eclipse Public License 1.0

Modify Slice: General Discussion #13

Closed paul-ruth closed 9 years ago

paul-ruth commented 9 years ago

This "issue" is for discussion of the modify slice development

YufengXin commented 9 years ago

Sorry, my mistake: I forgot to copy the updated mod.rdf out of my VM before I sent it to you from my laptop.

Now please try the attached new mod.rdf. The mod-ori.rdf is the same.

ibaldin commented 9 years ago

NPE(s) when trying to delete a node.

INFO | jvm 1 | 2015/09/18 14:35:50 | java.lang.NullPointerException
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.embed.cloudembed.controller.ModifyHandler.removeElement(ModifyHandler.java:569)
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.embed.cloudembed.controller.ModifyHandler.modifySlice(ModifyHandler.java:127)
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.embed.workflow.RequestWorkflow.modify(RequestWorkflow.java:232)
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.controllers.xmlrpc.OrcaXmlrpcHandler.modifySlice(OrcaXmlrpcHandler.java:566)

INFO | jvm 1 | 2015/09/18 14:35:50 | java.lang.NullPointerException
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.embed.cloudembed.controller.ModifyHandler.createManifest(ModifyHandler.java:627)
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.embed.cloudembed.controller.ModifyHandler.modifySlice(ModifyHandler.java:145)
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.embed.workflow.RequestWorkflow.modify(RequestWorkflow.java:232)
INFO | jvm 1 | 2015/09/18 14:35:50 |     at orca.controllers.xmlrpc.OrcaXmlrpcHandler.modifySlice(OrcaXmlrpcHandler.java:566)

ibaldin commented 9 years ago

I still see the NPE with the above stack trace, using tag a453dcffa92e7e262ee11820901c968ab9ef5e10, which includes tag 996e9ad60ae0858191b7dea6baf638a6b7ffe4ab.

YufengXin commented 9 years ago

My bad, the push didn't go through.

I just pushed it; please try again.

ibaldin commented 9 years ago

That helped. In addition to the problem we discussed (deleted nodes reappearing after add), there is this ArrayIndexOutOfBoundsException in the controller that I noticed while looking through the logs:

INFO | jvm 1 | 2015/09/21 14:28:58 | java.lang.ArrayIndexOutOfBoundsException: 1
INFO | jvm 1 | 2015/09/21 14:28:58 |     at orca.controllers.xmlrpc.OrcaXmlrpcHandler.modifySlice(OrcaXmlrpcHandler.java:698)
INFO | jvm 1 | 2015/09/21 14:28:58 |     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
INFO | jvm 1 | 2015/09/21 14:28:58 |     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

ibaldin commented 9 years ago

I'm also seeing several of these (always in threes). Remains to be seen what is causing them, since I didn't catch them when I was doing modifies:

2015-09-21 15:06:55,476 [qtp981488976-29] ERROR controller.orca.controllers.xmlrpc.SliceDeferThread - Exception, failed to demand reservationjava.lang.Exception: Could not demand resources: Status: code=-17200 (See OrcaConstants for details)
java.lang.Exception: Could not demand resources: Status: code=-17200 (See OrcaConstants for details)
    at orca.controllers.xmlrpc.SliceDeferThread.demandSlice(SliceDeferThread.java:231)
    at orca.controllers.xmlrpc.SliceDeferThread.processSlice(SliceDeferThread.java:158)
    at orca.controllers.xmlrpc.OrcaXmlrpcHandler.modifySlice(OrcaXmlrpcHandler.java:817)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.xmlrpc.server.ReflectiveXmlRpcHandler.invoke(ReflectiveXmlRpcHandler.java:115)
    at org.apache.xmlrpc.server.ReflectiveXmlRpcHandler.execute(ReflectiveXmlRpcHandler.java:106)
    at org.apache.xmlrpc.server.XmlRpcServerWorker.execute(XmlRpcServerWorker.java:46)
    at org.apache.xmlrpc.server.XmlRpcServer.execute(XmlRpcServer.java:86)
    at org.apache.xmlrpc.server.XmlRpcStreamServer.execute(XmlRpcStreamServer.java:200)
    at org.apache.xmlrpc.webserver.XmlRpcServletServer.execute(XmlRpcServletServer.java:112)
    at org.apache.xmlrpc.webserver.XmlRpcServlet.doPost(XmlRpcServlet.java:196)
    at orca.controllers.OrcaXmlrpcServlet.doPost(OrcaXmlrpcServlet.java:151)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:527)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:423)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:493)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:926)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:358)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:860)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
    at org.eclipse.jetty.server.Server.handle(Server.java:335)
    at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:588)
    at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:1046)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:764)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:217)
    at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:418)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:476)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
    at java.lang.Thread.run(Thread.java:745)

ibaldin commented 9 years ago

I think they match up with these on the SM:

2015-09-21 14:37:23,334-[qtp559803229-19]-{ERROR}-orca.test-test-topology-embedding-SM-(ActorManagementObject.java:604)-closeReservation
java.lang.RuntimeException: reservation not found
    at orca.shirako.kernel.Kernel.error(Kernel.java:268)
    at orca.shirako.kernel.Kernel.validate(Kernel.java:1228)
    at orca.shirako.kernel.KernelWrapper.close(KernelWrapper.java:165)
    at orca.shirako.core.Actor.close(Actor.java:285)
    at orca.manage.internal.ActorManagementObject$5.run(ActorManagementObject.java:599)
    at orca.shirako.core.Actor$3.process(Actor.java:977)
    at orca.shirako.core.Actor.actorMain(Actor.java:377)
    at orca.shirako.core.Actor$4.run(Actor.java:1018)
    at java.lang.Thread.run(Thread.java:745)

YufengXin commented 9 years ago

Hopefully the latest check-in fixed these problems; please resume the test.

ibaldin commented 9 years ago

Test 1: PASS

ibaldin commented 9 years ago

Test 2: PARTIAL

See exceptions in controller after adding back the node with the same name (i.e. Node2 was first removed, then re-added). Exceptions appear to be for the old links and node that were previously removed.

2015-09-24 13:53:53,937 [qtp981488976-49] DEBUG controller.orca.controllers.xmlrpc.SliceDeferThread - demandSlice(): Issuing demand for reservation: 51351bfa-8647-4478-a001-0b6a27276060
2015-09-24 13:53:53,939 [qtp981488976-49] DEBUG org.springframework.ws.client.MessageTracing.sent - Sent request [SaajSoapMessage {http://www.nicl.duke.edu/orca/manage/services/clientactor}DemandReservationRequest]
2015-09-24 13:53:53,948 [qtp981488976-49] DEBUG org.springframework.ws.client.MessageTracing.received - Received response [SaajSoapMessage {http://www.nicl.duke.edu/orca/manage/services/clientactor}DemandReservationResponse] for request [SaajSoapMessage {http://www.nicl.duke.edu/orca/manage/services/clientactor}DemandReservationRequest]
2015-09-24 13:53:53,949 [qtp981488976-49] ERROR controller.orca.controllers.xmlrpc.SliceDeferThread - Exception, failed to demand reservationjava.lang.Exception: Could not demand resources: Status: code=-17200 (See OrcaConstants for details)
java.lang.Exception: Could not demand resources: Status: code=-17200 (See OrcaConstants for details)
    at orca.controllers.xmlrpc.SliceDeferThread.demandSlice(SliceDeferThread.java:231)
    at orca.controllers.xmlrpc.SliceDeferThread.processSlice(SliceDeferThread.java:158)
    at orca.controllers.xmlrpc.OrcaXmlrpcHandler.modifySlice(OrcaXmlrpcHandler.java:826)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.xmlrpc.server.ReflectiveXmlRpcHandler.invoke(ReflectiveXmlRpcHandler.java:115)
    at org.apache.xmlrpc.server.ReflectiveXmlRpcHandler.execute(ReflectiveXmlRpcHandler.java:106)
    at org.apache.xmlrpc.server.XmlRpcServerWorker.execute(XmlRpcServerWorker.java:46)
    at org.apache.xmlrpc.server.XmlRpcServer.execute(XmlRpcServer.java:86)
    at org.apache.xmlrpc.server.XmlRpcStreamServer.execute(XmlRpcStreamServer.java:200)
    at org.apache.xmlrpc.webserver.XmlRpcServletServer.execute(XmlRpcServletServer.java:112)
    at org.apache.xmlrpc.webserver.XmlRpcServlet.doPost(XmlRpcServlet.java:196)
    at orca.controllers.OrcaXmlrpcServlet.doPost(OrcaXmlrpcServlet.java:151)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:527)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:423)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:493)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:926)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:358)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:860)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
    at org.eclipse.jetty.server.Server.handle(Server.java:335)
    at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:588)
    at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:1046)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:764)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:217)
    at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:418)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:476)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
    at java.lang.Thread.run(Thread.java:745)

ibaldin commented 9 years ago

Test 1a: PASS

Test 1b: Once (12 nodes), the bcast link failed to appear in the manifest after the re-add. PARTIAL on the second try (5 nodes): the manifest was correct, but modify was not invoked on the nodes to add the interface.

ibaldin commented 9 years ago

Test 3: Upon the removal of a path, modify remove interface is not being called.

YufengXin commented 9 years ago

The latest check-in fixed a few things affecting inter-domain and broadcast modifies, mostly URL string-matching confusion caused by my earlier use of the GUID.

ibaldin commented 9 years ago

I tried test 1b (3 nodes with a broadcast link: remove link, add link again, remove link again). After adding the previously deleted link, instead of 3 addiface modify calls I only saw 2. Then, when I deleted the link, I only saw 2 removes.

Probably best Paul tests it for real.

ibaldin commented 9 years ago

After trying from a clean start, this didn't happen. I think it needs more testing.

ibaldin commented 9 years ago

Doing a similar test as above, except starting with a nodegroup of 3 nodes, I again saw that after re-adding the previously deleted broadcast link, only two addiface actions were invoked, leaving one of the nodes without an interface.

paul-ruth commented 9 years ago

We need to start with simpler requests.

I tried creating one node, then modifying it by adding another node with a link. This fails. Yufeng knows about it.

Also, some of the more complicated modifies that involve adding links to an existing node fail because the controller and handler do not agree on property names.

ibaldin commented 9 years ago

OK. My starting point was a dumbbell, but this is even better. Work on this; I will test inter-domain in emulation.

YufengXin commented 9 years ago

For this single node case, the exception was thrown by Flukes, a NumberFormatException, see the screenshot:

-Yufeng

[screenshot attachment not preserved]

ibaldin commented 9 years ago

I'll take care of it

ibaldin commented 9 years ago

This is a controller exception, Flukes just passes it through:

INFO | jvm 1 | 2015/09/29 14:39:00 | java.lang.NumberFormatException: null
INFO | jvm 1 | 2015/09/29 14:39:00 |     at java.lang.Integer.parseInt(Integer.java:454)
INFO | jvm 1 | 2015/09/29 14:39:00 |     at java.lang.Integer.valueOf(Integer.java:582)
INFO | jvm 1 | 2015/09/29 14:39:00 |     at orca.controllers.xmlrpc.ReservationConverter.modifyReservations(ReservationConverter.java:1643)
INFO | jvm 1 | 2015/09/29 14:39:00 |     at orca.controllers.xmlrpc.ReservationConverter.modifyReservations(ReservationConverter.java:1589)
INFO | jvm 1 | 2015/09/29 14:39:00 |     at orca.controllers.xmlrpc.OrcaXmlrpcHandler.modifySlice(OrcaXmlrpcHandler.java:625)
INFO | jvm 1 | 2015/09/29 14:39:00 |     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
INFO | jvm 1 | 2015/09/29 14:39:00 |     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
INFO | jvm 1 | 2015/09/29 14:39:00 |     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

YufengXin commented 9 years ago

yes. fixed

YufengXin commented 9 years ago

And also revised the property names according to Paul's new convention.

Please check out and test

paul-ruth commented 9 years ago

It looks like I'm not getting any properties having to do with the modify addiface. Are you sure they are being added?

YufengXin commented 9 years ago

Could you check with Pequod? I saw them in Pequod in the emulator. I am coming to the office shortly.

Yufeng

paul-ruth commented 9 years ago

I used echoproperties to print all the properties that are passed to the handler. None of the properties that are passed are related to the new interface.

paul-ruth commented 9 years ago

I used ""....

paul-ruth commented 9 years ago

It keeps interpreting the command I used as an instruction to be interpreted by the issue...

I used an XML command that includes the text "echoproperties" in my handler to print all the properties.

ibaldin commented 9 years ago

I think it is worth looking at pequod, because this will show all properties available on the reservation, not just the ones sent to the handler. It is possible that the properties are there but of the wrong type.

paul-ruth commented 9 years ago

Then that sounds like a great debugging tool for Yufeng to use to figure out why the properties are not being set up correctly.

YufengXin commented 9 years ago

I’ve been using it.

Could you try modifying the dumbbell? The fix was probably not enough for the single-node case; I'm looking into it.

-Yufeng

YufengXin commented 9 years ago

I fixed another NPE for the single-node case; it seems the Java implementation of String operations is still not 100%. I didn't see it last night. Please give it a try.

I just tried the single-node case, adding another node to it, and the property list in Pequod contains the modify properties:

CONFIG:
config.ssh.user1.login = root
modify.1.interface = 1
unit.ec2.instance.type = xo.medium
unit.domain = http://geni-orca.renci.org/owl/76654e75-6d89-4696-8bff-602a803e5abb#Node0
element.GUID = 21f85058-3186-48bb-b94d-03bf5187a13c
unit.url = http://geni-orca.renci.org/owl/76654e75-6d89-4696-8bff-602a803e5abb#Node0
xmlrpc.user.dn = EMAILADDRESS=yxin@renci.org, CN=yufengxin, OU=Yufeng Xin, O=BEN@RENCI, L=Chapel Hill, ST=NC, C=US
config.ssh.numlogins = 1
config.end_time = 86394
config.duration = 86400
modify.subcommand.1 = modify.addiface
modify.1.vlan.tag = 2
unit.slice.name = mod1
local.isVM = 1
config.image.url = http://geni-images.renci.org/images/standard/centos/centos6.3-neuca-v1.0.7.xml
unit.hostname.url = http://geni-orca.renci.org/owl/76654e75-6d89-4696-8bff-602a803e5abb#Node0
unit.hostname = Node0
config.image.guid = ea7a22549aee7ae8a164a45e663e822e77464bbb
config.ssh.user1.sudo = no
config.start_time = 0
config.ssh.user1.keys =

paul-ruth commented 9 years ago

I was using the dumbbell. I didn't think the single node example was fixed.

paul-ruth commented 9 years ago

You are right, I do see a couple of modify properties:

INFO | jvm 1 | 2015/09/30 13:29:02 | [echoproperties] modify.1.interface=2
INFO | jvm 1 | 2015/09/30 13:29:02 | [echoproperties] modify.1.vlan.tag=2201
INFO | jvm 1 | 2015/09/30 13:29:02 | [echoproperties] modify.subcommand.1=modify.addiface

The rest are missing.
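
To see at a glance which modify properties actually reach the handler, a small filter over the property set may help. This is only a sketch: the class `ModifyPropertyDump` and the method `groupByIndex` are hypothetical and not part of ORCA. It assumes nothing beyond the naming pattern visible in the output above (`modify.subcommand.N` and `modify.N.<field>`).

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not ORCA code: groups modify.* properties by their index N.
class ModifyPropertyDump {
    // Matches "modify.subcommand.N" (group 1) or "modify.N.field" (groups 2 and 3).
    private static final Pattern KEY =
        Pattern.compile("modify\\.(?:subcommand\\.(\\d+)|(\\d+)\\.(.+))");

    static Map<Integer, Map<String, String>> groupByIndex(Properties props) {
        Map<Integer, Map<String, String>> byIndex = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            Matcher m = KEY.matcher(name);
            if (!m.matches()) {
                continue; // not a modify property, e.g. unit.hostname
            }
            boolean isSubcommand = m.group(1) != null;
            int index = Integer.parseInt(isSubcommand ? m.group(1) : m.group(2));
            String field = isSubcommand ? "subcommand" : m.group(3);
            byIndex.computeIfAbsent(index, k -> new TreeMap<>())
                   .put(field, props.getProperty(name));
        }
        return byIndex;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("modify.subcommand.1", "modify.addiface");
        p.setProperty("modify.1.interface", "2");
        p.setProperty("modify.1.vlan.tag", "2201");
        p.setProperty("unit.hostname", "Node0"); // ignored by the filter
        System.out.println(groupByIndex(p));
    }
}
```

Grouping by index makes it easier to spot an operation that has a subcommand but is missing some of its fields.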

ibaldin commented 9 years ago

@YufengXin can you comment about Java String - what the issue is you refer to above?

YufengXin commented 9 years ago

Paul, I re-did the property names; please check out and give it a try.

Ilya, I was referring to the method "Integer.valueOf(str)" when "str" is null (the single-node case), which did not seem to throw an NPE when I tested it last night.

-Yufeng
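
For reference, in the standard JDK `Integer.valueOf(String)` delegates to `Integer.parseInt`, which throws `NumberFormatException` (not `NullPointerException`) for a null argument; that matches the controller trace earlier in this thread. A minimal sketch of a guard follows. `NullSafeParse` and its methods are hypothetical names, not ORCA code, and falling back to a default value is an assumption about what the caller would want.

```java
// Hypothetical helper, not ORCA code.
class NullSafeParse {
    // Returns the parsed value, or fallback when s is null or not a decimal integer.
    static int parseOrDefault(String s, int fallback) {
        if (s == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Reports which exception class Integer.valueOf throws for the input, or "none".
    static String exceptionFor(String s) {
        try {
            Integer.valueOf(s);
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(exceptionFor(null));
        System.out.println(parseOrDefault(null, 0));
        System.out.println(parseOrDefault("2201", 0));
    }
}
```

A catch block written for `NullPointerException` would therefore miss this case; either catching `NumberFormatException` or checking for null first covers it.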

ibaldin commented 9 years ago

@YufengXin please do a pull - I did a commit that removes redeclarations of a number of strings and replaces them with definitions from UnitProperties class - let's avoid defining strings (like unit.eth) in the future, this way lies madness.

I believe I was careful not to break anything. You can see here 123929d7d3c4aa939b1c345e8e25e535f8a0addd
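
As an illustration of the idea (not the actual contents of UnitProperties, which are not shown in this thread), shared constants and key builders could look like the sketch below. The class name `PropertyNames` and all member names are hypothetical; only the key shapes `modify.subcommand.N`, `modify.N.interface` and `modify.N.vlan.tag` come from the property dumps above.

```java
// Hypothetical illustration, not the real UnitProperties class.
final class PropertyNames {
    private PropertyNames() {} // constants holder, never instantiated

    static final String MODIFY_PREFIX = "modify.";
    static final String MODIFY_SUBCOMMAND_PREFIX = MODIFY_PREFIX + "subcommand.";
    static final String INTERFACE_SUFFIX = ".interface";
    static final String VLAN_TAG_SUFFIX = ".vlan.tag";

    // Key naming the action for modify operation number index.
    static String subcommandKey(int index) {
        return MODIFY_SUBCOMMAND_PREFIX + index;
    }

    // Key holding the interface number for modify operation number index.
    static String interfaceKey(int index) {
        return MODIFY_PREFIX + index + INTERFACE_SUFFIX;
    }

    // Key holding the VLAN tag for modify operation number index.
    static String vlanTagKey(int index) {
        return MODIFY_PREFIX + index + VLAN_TAG_SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(subcommandKey(1));
        System.out.println(interfaceKey(1));
        System.out.println(vlanTagKey(1));
    }
}
```

If both the controller and the handler-side code build keys through one such class, a rename happens in a single place and the two sides cannot drift apart.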

paul-ruth commented 9 years ago

That seemed to work (at least for the initial cases). I'll try some more complicated ones

YufengXin commented 9 years ago

Hi, Ilya,

The problem with the broadcast case that you described is related to the modify dependency thread. When the broadcast link is added, the three nodes should wait for the VLAN to become active before being modified. Somehow not all three nodes wait until the VLAN becomes active, even though they all registered with that thread.

This is just a quick observation as I have not debugged it yet. I'll debug it more later today.

Just letting you know; maybe you have a quick thought on possible errors.

-Yufeng

ibaldin commented 9 years ago

I need to look. It is not impossible that one gets lost and the thought had occurred to me. I will look at it tomorrow. Need to look at logs more to determine if three get inserted, but only two fire off the queue. I suggest you look at other things if any and leave this one to me for now.

-ilya
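
One way to check "three inserted, two fired" without reading raw logs is a small audit object called at both points. This is a hypothetical sketch: `WaiterAudit` and its methods are not ORCA code, and where the register and fire hooks would go in the dependency thread is an assumption.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical instrumentation, not ORCA code: tracks which waiters
// registered with a dependency queue and which of them later fired.
class WaiterAudit {
    private final Set<String> registered = new TreeSet<>();
    private final Set<String> fired = new TreeSet<>();

    // Call when a reservation is inserted into the queue.
    synchronized void onRegister(String reservationId) {
        registered.add(reservationId);
    }

    // Call when the queued modify for a reservation is actually issued.
    synchronized void onFire(String reservationId) {
        fired.add(reservationId);
    }

    // Reservations that registered but never fired.
    synchronized Set<String> lost() {
        Set<String> missing = new TreeSet<>(registered);
        missing.removeAll(fired);
        return missing;
    }

    public static void main(String[] args) {
        WaiterAudit audit = new WaiterAudit();
        audit.onRegister("node0");
        audit.onRegister("node1");
        audit.onRegister("node2");
        audit.onFire("node0");
        audit.onFire("node2");
        System.out.println("registered but never fired: " + audit.lost());
    }
}
```

An empty result from `lost()` after the VLAN becomes active would point away from the queue and toward the registration step instead.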

YufengXin commented 9 years ago

Could be the case that only one, first or last, gets lost.

Yufeng

paul-ruth commented 9 years ago

Should adding/removing a storage node work? Right now it hangs on submit in Flukes. The controller gets wedged and needs to be restarted.

ibaldin commented 9 years ago

I cannot replicate missing 'addinterface' operations. I also no longer see missing remove interface operations, so I think this part is fixed (unless you can tell me how to replicate missing addiface).

ibaldin commented 9 years ago

I did several consecutive remove/add cycles of a broadcast link with 3 nodes, and that seemed to work. Then I added a fourth node and connected it to a link (that I thought had already been created). The resulting topology ended up looking like this:

[screen shot 2015-10-02 at 5 02 39 pm: https://cloud.githubusercontent.com/assets/12304033/10257864/daf45068-6927-11e5-803f-9c3ac568d312.png]

There really were two network links instead of one, as confirmed in pequod. I then deleted one of the links (the one connecting 3 nodes) and got a topology of four nodes and a link (seemingly properly connected).

Then I added one more node and connected it to the link, and the resulting topology had 5 nodes, 4 of which were connected and one was not (similar to what Paul described for the situation when you add a node to an existing link).

[screen shot 2015-10-02 at 5 08 40 pm]

In short, I think there is something fishy when you add a node to an already existing link.

YufengXin commented 9 years ago

The problem now with modifying/adding is mostly related to the need to provide a list of attached VM IP addresses to the storage reservation, which the current modify code does not handle.

So I just want to confirm that the IP address list is still needed by the current iSCSI handler, and then I'll work on it.

-Yufeng

paul-ruth commented 9 years ago

Yes, the storage device needs the list of nodes that attach to it.

ibaldin commented 9 years ago

@YufengXin please take a look at issue #14. I don't know whether it makes sense to:

  1. Continue doing it the same way for new slice create and modify-add-storage
  2. Fix it for modify-add-storage, and leave it alone in new slice create
  3. Fix it for both

ibaldin commented 9 years ago

After talking to Paul some more - there is a question here, whether we need to implement modify.addstorageinterface action for EC2 and XCAT handlers, or whether it is really two parts - standard modify.addinterface and modify.addstorage. We need to discuss this a little further, but I suspect the outcome doesn't affect the properties that controller needs to generate - only how many and in what order modify operations need to be invoked.

YufengXin commented 9 years ago

I just took a quick look at this. One issue is that the modify request generated by Flukes does not contain the broadcast link; it only says to add a VM.

-Yufeng

ibaldin commented 9 years ago

OK, Flukes should be fixed in 7bd89ad6d08f06b7b8da27f7c4317f9c45de86dc (flukes repo). The latest version is also deployed.