Open lungdear opened 7 months ago
@armenelruth Can you please update this with more details?
Hello!
I do not consider this a 'bug'. As we can see, it is a 'feature request'.
Here we go....
As of now, the vRouter maintains a single flow table shared by all VMs on the hypervisor (that is, shared by all VRFs on the hypervisor). By default, this table holds 512K records as a soft quota, plus 8K records in an 'overflow' table as a hard quota. That is enough for regular processing. Experience shows that enterprise IT cloud applications, if configured correctly, produce around 30k records per vRouter. I cannot recall the corresponding figure for NFV application clouds, but IMHO it should be within similar limits.
On the other hand, an application that is misconfigured in its network stack, or (even worse) runs without any regard for the network underneath it, may produce far more flow records, enough to overflow even the 'overflow' table. One example: an ML database that was segmented across several database instances, with data exchanged over UDP on random source and destination ports. Each DB query was a single datagram, and a single DB instance was producing ~800k events per second (~25Gp/s UDP random port traffic).
I do agree that such a DB configuration should not happen. The application should run a VxLAN/GENEVE overlay between VMs, and at the cloud network layer we can pack such noisy traffic into a FatFlow. But it happens. And when it happens, it impacts all VMs on the hypervisor: there is no room for new records in the common flow table, so no new connections can be established to or from any VM on the hypervisor.
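To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes that "K" means 1024, that every datagram with a new random port pair creates a new flow record, and that no records age out while the table fills. The function name `seconds_to_fill` is made up for illustration and is not part of vRouter.

```python
# Rough estimate of how fast one noisy instance can exhaust the shared flow table.
# Assumptions (not from vRouter itself): K = 1024, every event creates a new
# flow record, and no records are aged out during the burst.

SOFT_QUOTA = 512 * 1024      # main flow table records (default, per the text)
OVERFLOW_QUOTA = 8 * 1024    # 'overflow' table records (default, per the text)
NEW_FLOWS_PER_SECOND = 800_000  # one DB instance in the example


def seconds_to_fill(capacity: int, new_flows_per_second: int) -> float:
    """Return the time needed to consume `capacity` records at a constant rate."""
    if new_flows_per_second <= 0:
        raise ValueError("rate must be positive")
    return capacity / new_flows_per_second


total_capacity = SOFT_QUOTA + OVERFLOW_QUOTA
fill_time = seconds_to_fill(total_capacity, NEW_FLOWS_PER_SECOND)
print(f"capacity={total_capacity} records, filled in about {fill_time:.2f} s")
```

Under these assumptions, a single such instance consumes both the main and the overflow table in under a second, far faster than a correctly configured workload with around 30k records would ever need.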
Having this story in mind, the proposal is the following:
The vRouter should maintain an individual flow table per VRF. Each table still has limits, soft and hard. In this case, when the VMs of one VRF start to produce noisy traffic, they overflow that VRF's own flow table rather than the common one, so the VMs of other VRFs on the same hypervisor are not affected and can still create new connections.
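The isolation idea can be illustrated with a minimal Python model. This is only a sketch of the proposal, not vRouter code: the class names `FlowTable` and `PerVrfFlowTables`, the use of plain sets, and the tiny limits are all assumptions chosen for illustration, and flow aging and eviction are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class FlowTable:
    """Toy flow table with a soft quota (main) and a hard quota (overflow)."""
    soft_limit: int
    overflow_limit: int
    main: set = field(default_factory=set)
    overflow: set = field(default_factory=set)

    def add(self, flow_key) -> bool:
        """Insert a flow record; return False when both tables are full."""
        if flow_key in self.main or flow_key in self.overflow:
            return True  # flow already known, nothing to allocate
        if len(self.main) < self.soft_limit:
            self.main.add(flow_key)
            return True
        if len(self.overflow) < self.overflow_limit:
            self.overflow.add(flow_key)
            return True
        return False  # no room: the new connection cannot be set up


class PerVrfFlowTables:
    """One independent FlowTable per VRF, created on first use."""

    def __init__(self, soft_limit: int, overflow_limit: int):
        self.soft_limit = soft_limit
        self.overflow_limit = overflow_limit
        self.tables = {}

    def add(self, vrf: str, flow_key) -> bool:
        if vrf not in self.tables:
            self.tables[vrf] = FlowTable(self.soft_limit, self.overflow_limit)
        return self.tables[vrf].add(flow_key)


# Tiny limits so the effect is visible: 4 main + 2 overflow records per VRF.
tables = PerVrfFlowTables(soft_limit=4, overflow_limit=2)

# A noisy VRF sends 10 flows with random-looking port pairs.
noisy_results = [tables.add("vrf-noisy", ("udp", 40000 + i, 50000 + i)) for i in range(10)]

# A quiet VRF on the same hypervisor can still open a new connection.
quiet_ok = tables.add("vrf-quiet", ("tcp", 12345, 443))
```

In this model the noisy VRF is refused once its own six records are used up, while the quiet VRF is unaffected. With a single shared table, the same burst would have left no room for the quiet VRF's connection.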
What is our plan here? Do we need a design doc or blueprint or similar? We haven't decided how we organize and handle larger features in the roadmap. @mkraposhin your thoughts?
A per VRF flow table? (Alexey Abashkin - T1)