Closed mcevik0 closed 7 years ago
Hi, Mert,
I pushed the fix; it actually rolls back the previous change for the stitchingport on BEN, which appeared problematic.
So please use the new code to redeploy the Exo controller. In my emulator, I've tested creating and deleting p2p and MP inter-rack connections multiple times.
I'll fix the BEN stitchingport later.
Thanks.
BEN stitchports were working and I think are used quite actively now...
Yes, the BEN stitchingport works well, but it breaks the general inter-rack MP (when releasing), because I had to modify the way the sub-request is generated in the controller to BEN and DD.
The work I do with the FREEDM center will not use the testbed for a while, as we’re working on a new topic that is still in the theory and simulation stage.
Are there other users on BEN stitchingport?
Yufeng Xin, PhD, RENCI, UNC at Chapel Hill, 1-919-445-9633, yxin@renci.org
Fan uses it for integrating RADII with Hydroshare.
OK, let’s redeploy anyway to resume the correct operation of ExoSM.
I’ll try to fix the stitching later this week...
Hello Ilya, Yufeng,
I understand that RPMs on geni.renci.org should be updated and a clean-restart is needed. Is that correct? If that is the case, I did not announce such a maintenance. Should I wait for this, or do it anyway maybe with a stateful restart?
Ilya, it’s your call.
If nobody is complaining about the testbed now, we can hold off on redeploying the controller until I fix the problem properly.
I was only anticipating ION maintenance today.
To clarify my previous comment:
I planned the maintenance to update (RPMs, RDFs) geni2.renci.org and geni-ben.renci.org for ION, BEN, NLR actors. And I made such an announcement that inter-rack slices should be deleted and intra-rack slices would not be affected. However, if exo-sm is to be updated, then slices created through exo-sm will be affected.
As to the users, I heard from Fan last week that he was waiting for this issue to be fixed for his project.
Which issue are you referring to? Yufeng is talking about breaking the BEN stitchports that Fan is using?
I guess it is issue 68, sorry I thought we were on the BEN/ION 100G upgrade issue.
At any rate, I don’t want to fix one by breaking the other. It is important that we re-establish connectivity via the new AL2S port today, because we lose our current connectivity tomorrow.
I don’t want to change 5 things at once and then try to figure out which one of them broke something that worked previously. Today is about restoring ION connectivity.
-ilya
The slice he wants to create is defined at the very beginning of this issue (there is a screen shot). I also attached the logs that I found on geni2.renci.org after he reported the problem.
OK, I'll go ahead with ION.
Controller updates can be painlessly done statefully without disturbing existing slices. There is no connection to today's more intrusive ION update.
Yufeng, please instruct Mert on the ION update issue (not this one) as to which actors need to be restarted today with updated RDFs to re-establish our connectivity to AL2S.
Ok, so please just proceed with what you planned.
We'll do a stateful restart of the Exo controller later after my next check in.
Thanks
OK, only the ION update work for today. Mert had it right:
"I planned the maintenance to update (RPMs, RDFs) geni2.renci.org and geni-ben.renci.org for ION, BEN, NLR actors."
Thanks
Good enough.
Resurfaced during migration to 100G for BEN RENCI-NRIG/exogeni#52 and was solved by re-enabling the label sync thread. Watching the performance, closing for now.
Reopening this issue - Mert saw this exception when there were TWO active connections to RCI followed by a multi-point connection. Needs further investigation.
The latest commit, which fixes #67 and the NPE when deleting a slice into RCI, may fix this. I tried it in the emulator with 2 active P2P slices into RCI followed by an MP slice with RCI, and there was no exception.
Mert, please rebuild the RPM w/ the latest code and redeploy DD and BEN.
There are a couple of commits fixing issues in the controller, including #69 and #70, so it's a good time to restart the controller altogether.
ORCA rebuilt. geni.renci.org, geni2.renci.org, geni-ben.renci.org updated with the latest build.
Appears fixed.
Exceptions received for broadcast links.
On geni2.renci.org:/etc/orca/am+broker-12080/logs/orca-stdout.log
@ibaldin @mcevik0