Closed mkalcok closed 11 months ago
Thanks for the reviews @fnordahl @rjarry.
I don't see anything about the metadata agent here nor any southbound access to simulate what the ML2 driver does. Is there any plan to do so?
That's a good point. I focused on the NB because I assumed that it would be the primary database Neutron communicates with. I'll take a look at the SB as well. Do you have any pointers to relevant code/docs about this?
There are two parts to consider:
1) the neutron server access to the southbound db: https://opendev.org/openstack/neutron/src/tag/21.1.2/neutron/scheduler/l3_ovn_scheduler.py
2) the metadata agent that runs on all compute nodes: https://opendev.org/openstack/neutron/src/tag/21.1.2/neutron/agent/ovn/metadata/agent.py
I am far from being an expert in this code but I know it has significant impact on scale because it multiplies the number of southbound db clients by two (1 ovn-controller + 1 neutron metadata agent per compute node).
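A quick back-of-the-envelope sketch of the client-count point above (the function name and the single-neutron-server default are my own illustration, not anything from ovn-heater):

```python
# Rough estimate of southbound ovsdb-server clients, based on the
# observation above: each compute node runs one ovn-controller plus
# one neutron metadata agent, so clients scale at 2x the node count.
def sb_client_count(n_compute_nodes: int, n_neutron_servers: int = 1) -> int:
    ovn_controllers = n_compute_nodes   # one per compute node
    metadata_agents = n_compute_nodes   # one per compute node
    return ovn_controllers + metadata_agents + n_neutron_servers

# e.g. a 500-node deployment with a single neutron server:
print(sb_client_count(500))  # -> 1001
```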
For performance reasons, OpenStack deployments usually recommend enabling ovn-controller's monitor_all=true to reduce CPU pressure on the southbound ovsdb-server. The same logic also applies to the neutron metadata agent. To do scale testing properly, it would be good for ovn-heater to simulate these connections and subscriptions.
It seems to me we're only interested in how these extra read-only clients increase load on the Southbound. If that's the case, we can simulate DB clients by running ovsdb-server relay instances. Today we already start real SB relays if configured (e.g., cluster.n_relays > 0). We can programmatically start more of these (without connecting ovn-controllers to them) and achieve the same type of load on the SB as with neutron agents. IIRC @felixhuettner had also suggested this during one of the first ovn-heater community meetings.
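For illustration, extra relays could be started along these lines (a hypothetical sketch, not ovn-heater code; the SB address 192.0.2.10:6642, the port range, and the file names are all placeholders):

```shell
# Start 10 standalone SB relays purely to add read-only client load
# on the real southbound ovsdb-server, with no ovn-controllers
# connected to them.
SB_REMOTE="tcp:192.0.2.10:6642"
for i in $(seq 1 10); do
    ovsdb-server \
        --detach --no-chdir \
        --pidfile="sb-relay-$i.pid" \
        --log-file="sb-relay-$i.log" \
        --remote="ptcp:$((16642 + i))" \
        "relay:OVN_Southbound:$SB_REMOTE"
done
```

Each relay opens its own connection to the SB and maintains a full replica, which is roughly the kind of subscription load an extra monitor_all client would generate.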
I think that should give us a fairly accurate measurement of the neutron-ovn-metadata-agent. It currently does not set monitor conditions, so we get a full DB dump anyway. The only thing we are losing is the performance difference between a relay and the neutron agent when receiving updates from the ovsdb, but I don't think that should matter too much.
@dceara: I think the neutron metadata agent is not read-only, it also writes to the southbound db:
https://opendev.org/openstack/neutron/src/tag/21.1.2/neutron/agent/ovn/metadata/agent.py#L193-L196
https://opendev.org/openstack/neutron/src/tag/21.1.2/neutron/agent/ovn/metadata/agent.py#L292-L293
Yep, there is also some regular heartbeat, but most people configure it
Although it seems like there are now some monitor conditions in play, but only for chassis: https://opendev.org/openstack/neutron/src/tag/21.1.2/neutron/agent/ovn/metadata/ovsdb.py#L50-L53
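To make the difference concrete, this is roughly what a conditional subscription looks like on the wire: a sketch of the OVSDB JSON-RPC "monitor_cond" request a client could send to watch only its own Chassis row instead of taking a full database dump. The chassis name, monitor id, column selection, and request id here are made-up placeholders, not what Neutron actually sends.

```python
import json

def chassis_monitor_cond(chassis_name: str, monitor_id: str = "meta-agent") -> str:
    """Build a monitor_cond request limited to one Chassis row."""
    request = {
        "method": "monitor_cond",
        "params": [
            "OVN_Southbound",   # database to monitor
            monitor_id,         # client-chosen monitor identifier
            {
                "Chassis": [
                    {
                        "columns": ["name", "external_ids"],
                        # only rows whose name matches this chassis
                        "where": [["name", "==", chassis_name]],
                    }
                ]
            },
        ],
        "id": 1,
    }
    return json.dumps(request)

print(chassis_monitor_cond("compute-0"))
```

With a condition like this the server only pushes updates for matching rows, which is exactly the CPU-pressure relief that monitor conditions (or monitor_all tuning) are meant to provide.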
@felixhuettner I decoupled external network creation from project creation and now the BaseOpenstack test creates only one ext_net and connects all projects to it.

I'm creating these suggested changes as separate commits for easier review. Please let me know if that's OK or if you'd like me to squash them in the end to 53339f9.
Thanks. Looks good
I'd say that this change is now almost ready for full review. Two things that are missing and I'd like your opinion on are:
* Floating IPs and external connectivity: I'd be happy to do this in a separate PR to not blow up the scope of this one even more.
* ovn-metadata-agents: Is the consensus that we can simulate metadata agents with ovn-relay nodes to create "read" load on the SB DB? As far as I can tell, the only write operations of metadata agents are keepalive messages and updates to chassis external IDs on registration.
From my perspective I think it would be valuable to finalize this PR as-is now and mark it as ready for review once you feel comfortable with it.
The items discussed above can be worked on independently, and the work you have done so far would be a dependency.
I'm not sure what has happened, but we seem to create multiple external networks again (even though they now seem to have no external connectivity)?
:facepalm: thank you for noticing @felixhuettner. I lost a commit that decoupled project/ext net creation when I was rebasing/force-pushing a local copy on a different machine.
Lost change was recovered (thanks @fnordahl) and I think this PR is ready for review.
I pushed two small additions:
* .cirrus.yml to run the low-scale Openstack test as part of the gate

I'm working on a rebase after #180 merged.
Rebased and ready for review now @dceara @felixhuettner @rjarry
@booxter @dceara for when you have a spare moment, I believe all review feedback has been addressed.
@mkalcok @fnordahl There are a few conflicts due to merging the initial ovn-ic work; I'll try to help with the rebase.
I tried a rebase here (~~completely untested yet~~ barely tested): https://github.com/dceara/ovn-heater-1/commits/openstack-cloud-setup-rebase-tmp
Thanks, I'll have a look
@dceara thanks a lot for the help with the rebase. I put your proposals for adaption into the correct commits and attributed accordingly.
I redid the parts of the commits that mainly move code around to ensure no changes were lost.
I also arrived at a slightly different solution to calling the prepare_test method, hope you agree.
Looking forward to your review!
I just noticed that the rebase is one commit shorter than the previous iteration, and the reason is that I have unintentionally squashed "CMS Openstack: Add ability to simulate VMs" and "OS Test: Spawn VMs and run ping in base scenario".
However, the result does look fine to me as both commits kind of addressed the same thing in iterations, so perhaps we should just keep it this way? Thoughts @mkalcok ?
Yeah, I'm fine with them being squashed. I separated them for easier review, as one added logic to the CMS Openstack implementation and the other used that implementation in a test scenario, but they are closely related.
Please review commit by commit. This branch is based on #175 so the first 7 commits by @fnordahl can be skipped in review.
This change builds on the separation of CMS implementations in #175 and adds an Openstack CMS. More info is in the commit messages, but at the current stage it can simulate provisioning of: