RENCI-NRIG / orca5

ORCA5 Software
Eclipse Public License 1.0

Exception when creating "Broadcast Links" #68

Closed mcevik0 closed 7 years ago

mcevik0 commented 8 years ago

Exceptions received for broadcast links. (Attached image: screen shot 2016-08-30 at 10 36 05)

On geni2.renci.org:/etc/orca/am+broker-12080/logs/orca-stdout.log

INFO   | jvm 1    | 2016/08/30 10:29:42 | Used label in mapper:{}
INFO   | jvm 1    | 2016/08/30 10:29:43 | Candidate path length=2
INFO   | jvm 1    | 2016/08/30 10:29:43 | Intf list size<2:start=http://geni-orca.renci.org/owl/nlr.rdf#NLR/DD/Juniper/QFX3500/DD;next=http://geni-orca.renci.org/owl/nlr.rdf#nlr/Domain/vlan/up/2
INFO   | jvm 1    | 2016/08/30 10:29:43 | Candidate path length=3
INFO   | jvm 1    | 2016/08/30 10:29:43 | Intf list size<2:start=http://geni-orca.renci.org/owl/nlr.rdf#NLR/DD/Juniper/QFX3500/DD;next=http://geni-orca.renci.org/owl/nlr.rdf#nlr/Domain/vlan/up/1
INFO   | jvm 1    | 2016/08/30 10:29:43 | Error:2 (Path does NOT exist for:null.
INFO   | jvm 1    | 2016/08/30 10:29:43 |  Please see https://geni-orca.renci.org/trac/wiki/orca-errors for possible solutions.)
INFO   | jvm 1    | 2016/08/30 10:29:43 | java.lang.RuntimeException: Missing connection
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.embed.cloudembed.NetworkHandler.getConnectionTeardownActions(NetworkHandler.java:346)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.embed.cloudembed.MultiPointNetworkHandler.runEmbedding(MultiPointNetworkHandler.java:129)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.embed.cloudembed.NetworkHandler.handleRequest(NetworkHandler.java:107)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.plugins.ben.control.BenNdlControl.handleRequest(BenNdlControl.java:164)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.plugins.ben.control.NdlMPControl.formResourceSet(NdlMPControl.java:87)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.plugins.ben.control.NdlMPControl.assign(NdlMPControl.java:73)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.policy.core.AuthorityCalendarPolicy.assign(AuthorityCalendarPolicy.java:502)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.policy.core.AuthorityCalendarPolicy.map(AuthorityCalendarPolicy.java:472)
INFO   | jvm 1    | 2016/08/30 10:29:43 |       at orca.policy.core.AuthorityCalendarPolicy.mapGrowing(AuthorityCalendarPolicy.java:455)

@ibaldin @mcevik0

YufengXin commented 8 years ago

Hi Mert,

I pushed the fix; it actually rolls back the previous change for the stitchingport on BEN, which appeared problematic.

So please use the new code to redeploy the Exo controller. In my emulator, I have tested creating and deleting p2p and MP inter-rack connections multiple times.

I'll fix the BEN stitchingport later.

Thanks.

ibaldin commented 8 years ago

BEN stitchports were working, and I think they are used quite actively now...

YufengXin commented 8 years ago

Yes, the BEN stitchingport works well, but the change breaks general inter-rack MP connections (when releasing), because I had to modify the way the controller generates the sub-requests to BEN and DD.

My work with the FREEDM center will not use the testbed for a while, as we are working on a new topic that is still in the theory and simulation stage.

Are there other users on BEN stitchingport?

Yufeng Xin, PhD RENCI UNC at Chapel Hill 1-919-445-9633 yxin@renci.org

ibaldin commented 8 years ago

Fan uses it for integrating RADII with Hydroshare.

YufengXin commented 8 years ago

OK, let’s redeploy anyway to restore correct operation of ExoSM.

I’ll try to fix the stitching later this week...

mcevik0 commented 8 years ago

Hello Ilya, Yufeng,

I understand that the RPMs on geni.renci.org should be updated and that a clean restart is needed. Is that correct? If so, I have not announced such maintenance. Should I wait, or do it anyway, maybe with a stateful restart?

YufengXin commented 8 years ago

Ilya, it’s your call.

If nobody is complaining about the testbed now, we can hold off on redeploying the controller until I have fixed the problem properly.

ibaldin commented 8 years ago

I was only anticipating ION maintenance today.

mcevik0 commented 8 years ago

To clarify my previous comment:

I planned the maintenance to update (RPMs, RDFs) geni2.renci.org and geni-ben.renci.org for ION, BEN, NLR actors. I announced that inter-rack slices should be deleted and that intra-rack slices would not be affected. However, if exo-sm is to be updated, then slices created through exo-sm will be affected.

As to the users, I heard from Fan last week that he was waiting for this issue to be fixed for his project.

ibaldin commented 8 years ago

Which issue are you referring to? Yufeng is talking about breaking the BEN stitchports that Fan is using?

ibaldin commented 8 years ago

I guess it is issue 68; sorry, I thought we were on the BEN/ION 100G upgrade issue.

At any rate, I don’t want to fix one by breaking the other. It is important that we re-establish connectivity via the new AL2S port today, because we lose our connectivity tomorrow.

I don’t want to change 5 things at once and then try to figure out which one of them broke something that worked previously. Today is about restoring ION connectivity.

-ilya

mcevik0 commented 8 years ago

The slice he wants to create is defined at the very beginning of this issue (there is a screen shot). I also attached the logs that I found on geni2.renci.org after he reported the problem.

mcevik0 commented 8 years ago

OK, I will go ahead with ION.

ibaldin commented 8 years ago

Controller updates can be done statefully and painlessly, without disturbing existing slices. There is no connection to today's more intrusive ION update.

Yufeng, please instruct Mert on the ION update issue (not this one) as to which actors need to be restarted today with updated RDFs to re-establish our connectivity to AL2S.

YufengXin commented 8 years ago

Ok, so please just proceed with what you planned.

We'll do a stateful restart of the Exo controller later after my next check in.

Thanks

YufengXin commented 8 years ago

OK, only the ION-related update for today. Mert had it right:

"I planned the maintenance to update (RPMs, RDFs) geni2.renci.org and geni-ben.renci.org for ION, BEN, NLR actors."

Thanks

ibaldin commented 8 years ago

Good enough.

ibaldin commented 8 years ago

Resurfaced during the migration to 100G for BEN (RENCI-NRIG/exogeni#52) and was solved by re-enabling the label sync thread. Watching the performance; closing for now.

ibaldin commented 7 years ago

Reopening this issue - Mert saw this exception when there were TWO active connections to RCI followed by a multi-point connection. Needs further investigation.

YufengXin commented 7 years ago

The latest commit, which fixes #67 and the NPE when deleting a slice into RCI, may fix this. I tried it in the emulator with 2 active P2P slices into RCI followed by an MP slice with RCI, and there was no exception.

Mert, please rebuild the RPM with the latest code and redeploy DD and BEN.

There are also a couple of commits fixing issues in the controller, including #69 and #70, so it's a good time to restart the controller altogether.

mcevik0 commented 7 years ago

ORCA rebuilt. geni.renci.org, geni2.renci.org, and geni-ben.renci.org updated with the latest build.

ibaldin commented 7 years ago

Appears fixed.