It seems the constraint may actually be in the frontend CLI:
  PID USER      PR  NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
29118 vagrant   20   0  507156 422360  5176 R  97.7 20.7  0:07.57 clixon_cli
26391 root      20   0  136224  47276  5108 S   1.7  2.3  0:03.43 clixon_backend
Some comments after analyzing. It is the CLI that allocates and frees large amounts of memory for each CLI command. This is due to the large number of YANG files, or more precisely, the size of the auto-cli tree generated from the YANG syntax. In one openconfig example there are ca 100 YANG files, each generating auto-cli trees. The CLI does dynamic expansion of trees using the "@tree" syntax, and in this case that means making a copy of the complete model (the auto-cli tree generated from YANG) for every expansion. This is a problem that needs to be addressed. Workarounds include loading the configuration as XML or JSON instead of as individual CLI commands.
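For context, this is what such a tree reference looks like in a CLI spec. The fragment below follows the naming used in the clixon examples (@datamodel, cli_auto_set, etc.); treat the exact names as assumptions, the point is only that each command references the generated tree:

```
# Every reference to the generated tree "@datamodel" below triggers a
# dynamic expansion, i.e. a copy of the complete auto-cli tree:
set @datamodel, cli_auto_set();
merge @datamodel, cli_auto_merge();
delete @datamodel, cli_auto_del();
```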
Made a rather large set of changes to address performance problems with the auto-cli for large YANG configs; see the commit messages above for a detailed list. In a test case using openconfig-network-instances.yang, a 10x performance increase was measured (in time). This was achieved partly by reducing memory usage (ca 50%) but mainly by different algorithmic optimizations. The positive side is that this was a new area for optimization with several low-hanging fruit; the negative side is that several of the algorithmic changes were deep in core cligen code. All primary tests have passed, but since a lot of changes have been made, more testing is needed, and verification input is welcome.
Ok, with the 5.4.0 changes, I've done some benchmarking.
This load merge function is essentially turning /tmp/openconfig.conf from a junos-object style notation into a bunch of set statements; the same set each time. Prior to 5.4.0, the first execution took over 12 minutes, so this is a vast improvement.
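To make the transformation concrete, here is a hypothetical fragment (the real contents of /tmp/openconfig.conf are not shown in this issue):

```
# junos-object style input (hypothetical):
interfaces {
    interface eth0 {
        config {
            mtu 1500;
        }
    }
}
# ...is flattened into equivalent set statements such as:
set interfaces interface eth0 config mtu 1500
```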
However, repeating the same configuration a few times reveals a linear growth in time.
During these loads, the clixon_cli utilization is ~ 0%, while the backend holds steady near 100%.
% time clixon_cli -F load-object-script.txt
clixon> configure
clixon / # load merge /tmp/openconfig.conf
load complete
clixon / # commit
clixon / # exit
clixon> quit
real 1m30.103s
user 0m4.771s
sys 0m0.633s
% time clixon_cli -F load-object-script.txt
clixon> configure
clixon / # load merge /tmp/openconfig.conf
load complete
clixon / # commit
clixon / # exit
clixon> quit
real 5m59.499s
user 0m6.623s
sys 0m0.496s
% time clixon_cli -F load-object-script.txt
clixon> configure
clixon / # load merge /tmp/openconfig.conf
load complete
clixon / # commit
clixon / # exit
clixon> quit
real 7m50.876s
user 0m7.162s
sys 0m0.597s
% time clixon_cli -F load-object-script.txt
clixon> configure
clixon / # load merge /tmp/openconfig.conf
load complete
clixon / # commit
clixon / # exit
clixon> quit
real 8m24.685s
user 0m7.264s
sys 0m0.527s
%
OK, thanks. Strange, there could be a case where it steps up once to a higher level if you load an identical file, going from an empty db to a populated db. But then it should not continue to increase.
Ok, the frontend processing performance is improved as described in the related commits.
I have clixon running with a number of openconfig models, namely openconfig-network-instances.
We have implemented a cli plugin method, load_set_file, that iterates over a given file, invoking cliread_parse for each set or delete statement. I believe each cliread_parse call results in an internal edit-config RPC and a backend semantic validation, which can be time-consuming depending on the existing configuration and YANG schema.

Looking for input on an optimization here; maybe a mechanism to pass multiple configuration statements to cligen and on to clixon in a single RPC call with a single semantic validation pass.
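To illustrate the batching idea, here is a rough C sketch. It is not working code: parse_set_line_to_xml() and send_edit_config_merge() are hypothetical helpers standing in for the per-line cligen parse step and for a single clixon edit-config RPC (in real code the handle and tree arguments would use clixon types such as clicon_handle and cxobj, and the actual cligen/clixon signatures vary by version):

```c
#include <stdio.h>

/* Hypothetical helpers (assumptions, not real clixon/cligen API):
 * parse one "set" line into a locally accumulated XML tree, and send
 * the accumulated tree as a single edit-config merge RPC. */
extern int parse_set_line_to_xml(void *h, const char *line, void **xtop);
extern int send_edit_config_merge(void *h, void *xtop);

/* Batched variant of load_set_file: parse every statement locally,
 * then issue one RPC so the backend validates only once. */
static int
load_set_file_batched(void *h, const char *filename)
{
    FILE *fp;
    char  line[1024];
    void *xtop = NULL;   /* accumulated configuration tree */
    int   retval = -1;

    if ((fp = fopen(filename, "r")) == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        /* No RPC and no backend validation inside the loop. */
        if (parse_set_line_to_xml(h, line, &xtop) < 0)
            goto done;
    }
    /* One RPC, one semantic validation pass on the backend. */
    if (send_edit_config_merge(h, xtop) < 0)
        goto done;
    retval = 0;
 done:
    fclose(fp);
    return retval;
}
```

The point is purely structural: the RPC and validation move out of the per-statement loop, which is what the requested mechanism would enable.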
As an example: