FRRouting / topotests

Moved to frrouting/frr
22 stars 26 forks source link

lib: speed up router shutdown #48

Closed louberger closed 6 years ago

louberger commented 6 years ago
 stopRouter: report when a process is being killed
         only sleep if actually killed a process
     add option to not sleep or conduct -7 kill
 topogen stop: add wait parameter (defaults to old/wait)
               when stopping routers, first stop all without waiting
       then do a second pass where will wait if needed

Signed-off-by: Lou Berger lberger@labn.net

rzalamena commented 6 years ago

Hi Lou,

I found out the same thing you did, but I solved it differently: the router check is implemented incorrectly, after the daemon is killed no pid files are being removed, so I just removed the pid files and then checked if they exist. It is still not optimal, but I think that if we are going to the correct way we might need something more sophisticated.

Checkout my fix: https://github.com/opensourcerouting/topotests/commit/fed5b2bbdf6fc57ff98abfe4d9e49e6f7a2641ed

Your fix seems to add one more button to the stop method and I'm trying to figure out if it will be ever used (or just called with the default).

Please give me some time to think about this and I'll merge or create a PR accordingly.

louberger commented 6 years ago

My objective was to remove the sleep between routers and speed things up in general.  Note that care must be taken when removing the pid files only after the process is really dead as there are times when a simple kill is insufficient.  BTW my changes cut out about 25secs on what was a 92sec run!

Lou

On 12/7/2017 2:03 PM, Rafael Zalamena wrote:

Hi Lou,

I found out the same thing you did, but I solved it differently: the router check is implemented incorrectly, after the daemon is killed no pid files are being removed, so I just removed the pid files and then checked if they exist. It is still not optimal, but I think that if we are going to the correct way we might need something more sophisticated.

Checkout my fix: opensourcerouting/topotests@fed5b2b https://github.com/opensourcerouting/topotests/commit/fed5b2bbdf6fc57ff98abfe4d9e49e6f7a2641ed

Your fix seems to add one more button to the |stop| method and I'm trying to figure out if it will be ever used (or just called with the default).

Please give me some time to think about this and I'll merge or create a PR accordingly.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/FRRouting/topotests/pull/48#issuecomment-350063016, or mute the thread https://github.com/notifications/unsubscribe-auth/AGRWyMR0xTSEJJmaSBIIrPOwOkl-uKOQks5s-DZ5gaJpZM4Q6BQF.

NetDEF-CI commented 6 years ago

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-TOPOPR-62/

This is a comment from an EXPERIMENTAL automated CI system. For questions and feedback in regards to this CI system, please feel free to email Martin Winter - mwinter (at) opensourcerouting.org.

louberger commented 6 years ago

@rzalamena I think you'll find that my change is a bit zippier as it skips any sleeps in the normal case -- but will sleep and do the kill -7 when needed. Also you are doing a remove of the pid file without checking that the process is really dead

rzalamena commented 6 years ago

Thanks Lou,

Your fix definitely makes topotest teardowns faster. There are some other minor changes that I want to make, but I think your PR should go in so we can already start benefiting from it.

louberger commented 6 years ago

Okay, I'm happy to review your pr on top of mine...

On December 8, 2017 8:39:26 AM Rafael Zalamena notifications@github.com wrote:

Thanks Lou,

Your fix definitely makes topotest teardowns faster. There are some other minor changes that I want to make, but I think your PR should go in so we can already start benefiting from it.

-- You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/FRRouting/topotests/pull/48#issuecomment-350265190