tomaae / homeassistant-mikrotik_router

Mikrotik router integration for Home Assistant
Apache License 2.0

[Bug] Hang with huge route table on ROS 7.1.3 #183

Closed bdn76 closed 2 years ago

bdn76 commented 2 years ago

Describe the issue

The plugin hangs while updating the route table when it has a huge number of entries (10000+). It works fine after commenting out the route update:

        if self.api.connected():
            await self.hass.async_add_executor_job(self.get_system_health)

        #if self.api.connected():
        #    await self.hass.async_add_executor_job(self.get_route)

        if self.api.connected():
            await self.hass.async_add_executor_job(self.get_interface)

Consider moving route table loading to a config option.
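A minimal sketch of what such an option could look like. Everything here is an assumption for illustration: `load_routes`, `Controller`, and `StubApi` are hypothetical names, not the integration's real code, and `loop.run_in_executor` stands in for `hass.async_add_executor_job`.

```python
import asyncio


class StubApi:
    """Hypothetical stand-in for the router API wrapper."""

    def connected(self):
        return True


class Controller:
    """Hypothetical controller; load_routes is an assumed config option."""

    def __init__(self, api, load_routes=False):
        self.api = api
        self.load_routes = load_routes
        self.calls = []  # records which blocking fetches actually ran

    def get_system_health(self):
        self.calls.append("health")

    def get_route(self):
        # On a router with 10000+ routes this is the expensive call.
        self.calls.append("route")

    def get_interface(self):
        self.calls.append("interface")

    async def async_update(self):
        loop = asyncio.get_running_loop()

        if self.api.connected():
            await loop.run_in_executor(None, self.get_system_health)

        # Only touch the route table when the user opted in.
        if self.load_routes and self.api.connected():
            await loop.run_in_executor(None, self.get_route)

        if self.api.connected():
            await loop.run_in_executor(None, self.get_interface)


if __name__ == "__main__":
    controller = Controller(StubApi(), load_routes=False)
    asyncio.run(controller.async_update())
    print(controller.calls)
```

With the flag off, the route fetch is skipped entirely, which matches the effect of commenting it out by hand.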

How to reproduce the issue

Connect the plugin to ROS 7.1.3+ with a huge routing table (BGP, for example).

Expected behavior

Do not sync the routing table, or provide an optional switch for loading it.

Screenshots

I found high CPU consumption on the MikroTik device:

image

After commenting out route table loading, CPU utilization went back down to normal.

Software versions

All fields in this section are required.

tomaae commented 2 years ago

10000+ entries? I don't have an environment with anything like that to test on. Did it really hang, or did it just take a long time?

bdn76 commented 2 years ago

Yes.

image

It took a long time, about 5-6 minutes.

tomaae commented 2 years ago

I'm not authorized to use BGP, so I'm not sure how to test and work around this issue at the moment.

bdn76 commented 2 years ago

What about an option for route loading, like this:

image

?

tomaae commented 2 years ago

Not in this case, because of the way it is used. I may be able to find a way, since I only need to find the interface for the default route. There may be a more efficient way to do that.

tomaae commented 2 years ago

What do you get when you type this in the CLI? /ip route print terse where dst-address=0.0.0.0/0

bdn76 commented 2 years ago

[admin@router] > /ip route print terse where dst-address=0.0.0.0/0
   DAv   dst-address=0.0.0.0/0 routing-table=main pref-src="" gateway=pppoe-out1 immediate-gw=pppoe-out1 distance=10 scope=30 target-scope=10 suppress-hw-offload=no 
   D d   dst-address=0.0.0.0/0 routing-table=main pref-src="" gateway=192.168.8.1 immediate-gw=192.168.8.1%lte1 distance=20 scope=30 target-scope=10 vrf-interface=lte1 suppress-hw-offload=no 

[admin@router] >

tomaae commented 2 years ago

Assuming lte1 is your default internet gateway, this works. Was there any noticeable load on the router side with your setup?

This should show only the default gateway interface: /ip route print terse where dst-address=0.0.0.0/0 and vrf-interface

bdn76 commented 2 years ago

> Assuming lte1 is your default internet gateway, this works. Was there any noticeable load on the router side with your setup?

No, the default gateway is pppoe (letter "A" in "DAv"); lte1 is the backup (no letter "A" in "D d"). CPU utilization without route table loading is around 1-2%.

> This should show only the default gateway interface: /ip route print terse where dst-address=0.0.0.0/0 and vrf-interface

[admin@router] > /ip route print terse where dst-address=0.0.0.0/0 and vrf-interface
   D d   dst-address=0.0.0.0/0 routing-table=main pref-src="" gateway=192.168.8.1 immediate-gw=192.168.8.1%lte1 distance=20 scope=30 target-scope=10 vrf-interface=lte1 suppress-hw-offload=no

Incorrect.
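The flag-based distinction described above (the active default route carries "A" in its flags) can be sketched as a small parser over the terse output. This is a hedged illustration only: `parse_terse_route` and `active_default_gateway` are hypothetical helpers, the parser assumes it is fed route lines only (no prompt lines), and it has been checked against nothing beyond the two lines shown in this thread.

```python
import re
import shlex

# The two route lines posted earlier in this thread.
SAMPLE = (
    '   DAv   dst-address=0.0.0.0/0 routing-table=main pref-src="" '
    'gateway=pppoe-out1 immediate-gw=pppoe-out1 distance=10 scope=30 '
    'target-scope=10 suppress-hw-offload=no \n'
    '   D d   dst-address=0.0.0.0/0 routing-table=main pref-src="" '
    'gateway=192.168.8.1 immediate-gw=192.168.8.1%lte1 distance=20 scope=30 '
    'target-scope=10 vrf-interface=lte1 suppress-hw-offload=no \n'
)


def parse_terse_route(line):
    """Split one terse route line into (flags, fields), or None if no fields."""
    match = re.search(r"[\w-]+=", line)
    if match is None:
        return None
    # Everything before the first key=value pair is treated as the flag column;
    # only letters are kept, so spaces or an index number are ignored.
    flags = "".join(ch for ch in line[: match.start()] if ch.isalpha())
    # shlex handles quoted values such as pref-src="".
    tokens = shlex.split(line[match.start():])
    fields = dict(tok.split("=", 1) for tok in tokens if "=" in tok)
    return flags, fields


def active_default_gateway(output):
    """Return the gateway of the first active (flag A) 0.0.0.0/0 route."""
    for line in output.splitlines():
        parsed = parse_terse_route(line)
        if parsed is None:
            continue
        flags, fields = parsed
        if "A" in flags and fields.get("dst-address") == "0.0.0.0/0":
            return fields.get("gateway")
    return None


if __name__ == "__main__":
    print(active_default_gateway(SAMPLE))
```

Filtering on the "A" flag rather than on `vrf-interface` avoids picking the backup lte1 route, at least for the output shown here.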

tomaae commented 2 years ago

Hmm, this is tricky, since pppoe does not show an interface. I will have to research this further, since this is obviously the correct way to approach gateway detection without causing high CPU load in cases such as yours. I also have close to no experience with pppoe, which is a problem as well.

tomaae commented 2 years ago

Hmm, but does it detect any devices behind the pppoe interface? If not, this will be a non-issue.

bdn76 commented 2 years ago

This one:

[admin@router] > /interface print where name=pppoe-out1
Flags: R - RUNNING
Columns: NAME, TYPE, ACTUAL-MTU
 #   NAME        TYPE       ACTUAL-MTU
13 R pppoe-out1  pppoe-out        1492
[admin@router] >

tomaae commented 2 years ago

What I mean is the route output. But I just found this in different documentation:

immediate-gw (string): Shows actual (resolved) gateway and interface that will be used for packet forwarding. Displayed in format [ip%interface].

While not strictly true, since ip% is not present in your case, it could be useful. I just wish I had more complex setups to get this output from, so I could be sure.
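A rough sketch of how the interface might be pulled out of `immediate-gw`, covering both forms seen in this thread (`192.168.8.1%lte1` and plain `pppoe-out1`). `interface_from_immediate_gw` is a hypothetical helper, and it assumes a value without `%` is an interface name, which has only been observed for the pppoe case above.

```python
def interface_from_immediate_gw(value):
    """Return the interface part of an immediate-gw value, or None if empty.

    Assumption: "ip%interface" yields the part after "%"; a value without
    "%" is taken to be the interface name itself.
    """
    if not value:
        return None
    # rpartition returns ("", "", value) when "%" is absent,
    # so the last element is the whole value in that case.
    return value.rpartition("%")[2]


if __name__ == "__main__":
    for raw in ("192.168.8.1%lte1", "pppoe-out1"):
        print(raw, "->", interface_from_immediate_gw(raw))
```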

tomaae commented 2 years ago

It may actually not be possible. The API library still accesses the whole table even when limiting the selection. I'm not sure what to do here; I can't find any other way to find the default gateway.