np-guard / netpol-analyzer

A Golang library for analyzing k8s connectivity-configuration resources (a.k.a. network policies)
Apache License 2.0

handling selectors with matchexpressions (fixed) #377

Closed — shireenf-ibm closed 1 week ago

shireenf-ibm commented 1 month ago


task:

  • [ ] support selectors with matchExpression (all operators)

in this PR:


more things were done in this PR:

adisos commented 1 month ago

general comment: the expression selectors could be represented as `LabelSelectorRequirement` from `go\pkg\mod\k8s.io\apimachinery@v0.29.2\pkg\apis\meta\v1\types.go`, instead of converting to `map[string]string` and then re-converting to a string? why do we need our own conversion? could we just add a separate field of this type for such selectors instead?

shireenf-ibm commented 1 month ago

> general comment: the expression selectors could be represented as `LabelSelectorRequirement` from `go\pkg\mod\k8s.io\apimachinery@v0.29.2\pkg\apis\meta\v1\types.go`, instead of converting to `map[string]string` and then re-converting to a string? why do we need our own conversion? could we just add a separate field of this type for such selectors instead?

I thought about this too, but found that it would require multiple changes and would not differ much for the "special" cases:

1. we could add `LabelSelectorRequirement` fields to the `k8s.Pod` and `eval.RepresentativePeer` interfaces and consider them wherever needed; but the `ExposedPeer` would still contain the representative labels as `map[string]string` (so it is preferable to be consistent from the beginning).
2. in the `In` operator case, I preferred to convert the expression into a key:value label for each value (so a new representative peer is created for each). For the other operators, the conversion only produces a single key:value pair whose value is "special" and needs to be compared specifically (the same comparisons would be done if we used `LabelSelectorRequirement`).
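To make the `In`-operator handling concrete, here is a minimal, self-contained sketch of the conversion described above. `LabelSelectorRequirement` is re-declared locally (simplified from the apimachinery type), and `expandInRequirement` is a hypothetical helper, not the PR's actual function name: it turns one `In` expression into one `map[string]string` selector per value, so that a separate representative peer can be created for each.

```go
package main

import "fmt"

// LabelSelectorRequirement mirrors the k8s.io/apimachinery type referenced
// above (simplified here so the sketch is self-contained).
type LabelSelectorRequirement struct {
	Key      string
	Operator string // "In", "NotIn", "Exists", "DoesNotExist"
	Values   []string
}

// expandInRequirement is a hypothetical helper illustrating the approach:
// for an "In" expression, emit one map[string]string selector per value.
func expandInRequirement(req LabelSelectorRequirement) []map[string]string {
	selectors := make([]map[string]string, 0, len(req.Values))
	for _, v := range req.Values {
		selectors = append(selectors, map[string]string{req.Key: v})
	}
	return selectors
}

func main() {
	req := LabelSelectorRequirement{Key: "env", Operator: "In", Values: []string{"dev", "prod"}}
	for _, sel := range expandInRequirement(req) {
		fmt.Println(sel) // one selector (and thus one representative peer) per value
	}
}
```

The other operators (`NotIn`, `Exists`, `DoesNotExist`) cannot be expressed as plain equality labels, which is why the comment above describes them as producing a single "special" value that needs dedicated comparison logic.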

shireenf-ibm commented 1 month ago

some comments regarding the last commit: current code changes:

a suggestion: (not implemented here)

shireenf-ibm commented 1 month ago

attaching a list of the tests that were added in this PR; I will later update the file with all exposure-analysis tests: exposure_analysis_tests.csv

shireenf-ibm commented 1 month ago

some review tasks:

  1. eval pkg:

     `ScanPolicyRulesAndUpdateGeneralConns`

     `getSelectorsAndUpdateGeneralConns(FromRules)`

  2. connlist.go:

     exposure analysis related:

shireenf-ibm commented 4 weeks ago

hi @adisos, regarding splitting namespaces with policies in `AddObjectsForExposureAnalysis` and having `resolveSingleMissingNamespace` called twice (when exposure-analysis is on/off):

I found that we really should split namespaces at the beginning (with policies)

I have committed an example that will give us a misleading result if the namespaces were not split at the beginning.

in the example (found in tests/test_exposure_with_real_pod_and_namespace) we have:

  1. a real namespace in the manifests with the label name:ns2, and a real pod in that namespace.
  2. the netpol's rule exactly matches the labels of the namespace and the pod.

here are the results with and without splitting the namespaces:

  1. with splitting the namespaces, we get the correct result (no exposure inferred from the rule, since we have an exact match):

(screenshot: results_with_namespaces_split)

  2. without splitting the namespaces, we get an exposure result with the unnecessary line: hello-world/workload-a[Deployment] <= [namespace with {name=ns2}]/[pod with {app=b-app}] : All Connections

(screenshot: results_without_namespaces_split)

/////////// an explanation:

* [ ]   func (pe *PolicyEngine) AddObjects(objects []parser.K8sObject):
  why should we add namespaces at the beginning, and not only policy objects?
  • if we don't split the namespaces at the beginning, and the policy engine gets the pod object first (the parsed yaml object), then the policy engine will resolve the missing namespace of the pod with only the default k8s name label (which is not the `name` label). The resolved namespace will not contain the actual labels from the real namespace's yaml, and so the redundant matching representative peer won't be removed.
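A small sketch of why the fabricated namespace misses the real labels. Kubernetes automatically attaches the `kubernetes.io/metadata.name` label to every namespace, so that is the only label a policy engine can safely assume when it fabricates a namespace it has not seen. `resolveMissingNamespace` below is a hypothetical stand-in for `resolveSingleMissingNamespace`, not the library's actual code:

```go
package main

import "fmt"

// k8sNsNameLabelKey is the label every namespace automatically carries.
const k8sNsNameLabelKey = "kubernetes.io/metadata.name"

// resolveMissingNamespace fabricates a namespace whose only label is the
// default name label; labels from the (not-yet-seen) namespace YAML,
// such as "name: ns2", cannot be recovered.
func resolveMissingNamespace(nsName string) map[string]string {
	return map[string]string{k8sNsNameLabelKey: nsName}
}

func main() {
	labels := resolveMissingNamespace("ns2")
	// The real ns2 YAML also carries "name: ns2", but the fabricated
	// namespace does not, so a rule selecting {name: ns2} is no longer
	// recognized as an exact match.
	fmt.Println(labels["name"] == "ns2") // prints "false"
}
```

This is exactly the mismatch that produces the misleading extra exposure line in the example above.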

/////////////////////// regarding calling the resolveSingleMissingNamespace twice:

* [ ]  in function:

  func (pe *PolicyEngine) addObjectsByKind(objects []parser.K8sObject) error,
  why:

  ```go
  if !pe.exposureAnalysisFlag { // for exposure analysis, this was already done
      return pe.resolveMissingNamespaces()
  }
  ```
* resolveSingleMissingNamespace() : change so that it is called once instead of twice in different places for exposure analysis on/off

-- so when exposure-analysis is on: the namespaces are inserted into the policy engine first, so we can use resolveSingleMissingNamespace when inserting a pod/workload; if its namespace is not found in the policy engine, it is for sure not in the resources either

-- but when exposure-analysis is off: when adding a pod/workload to the policy engine, its namespace might be in the resources but not added yet (the objects list is not sorted, and we cannot predict which resource comes first); so we can't call resolveSingleMissingNamespace until all k8s objects from the resources have been added to the policy engine (otherwise, a real namespace with specified labels may be missed)
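The two cases above can be sketched as a small toy model (all names here — `engine`, `addPod`, `finish` — are hypothetical, not the library's API): when exposure analysis is on, a missing namespace is fabricated immediately, because namespaces were all inserted first; when it is off, resolution is deferred until every object has been added, so a real namespace arriving later keeps its labels.

```go
package main

import "fmt"

// engine is a toy model of the ordering rule described above.
type engine struct {
	exposureAnalysis bool
	namespaces       map[string]map[string]string // ns name -> labels
	pendingPods      []string                     // namespaces still unresolved
}

func (e *engine) addPod(podNs string) {
	if _, ok := e.namespaces[podNs]; ok {
		return // namespace already known
	}
	if e.exposureAnalysis {
		// namespaces were all added first, so a miss is definitive:
		// fabricate the namespace right away.
		e.namespaces[podNs] = map[string]string{"kubernetes.io/metadata.name": podNs}
		return
	}
	// exposure analysis off: the real namespace YAML may still be coming,
	// so defer resolution.
	e.pendingPods = append(e.pendingPods, podNs)
}

// finish is called once all objects were added; only now is it safe to
// fabricate namespaces that never appeared in the resources.
func (e *engine) finish() {
	for _, ns := range e.pendingPods {
		if _, ok := e.namespaces[ns]; !ok {
			e.namespaces[ns] = map[string]string{"kubernetes.io/metadata.name": ns}
		}
	}
	e.pendingPods = nil
}

func main() {
	e := &engine{namespaces: map[string]map[string]string{}}
	e.addPod("ns2")                                        // deferred: flag is off
	e.namespaces["ns2"] = map[string]string{"name": "ns2"} // real YAML arrives later
	e.finish()
	fmt.Println(e.namespaces["ns2"]["name"]) // prints "ns2": real labels preserved
}
```

If `addPod` had fabricated the namespace eagerly in the off case, the later real YAML would have been shadowed and the `name: ns2` label lost — the exact failure mode described above.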

adisos commented 3 weeks ago

is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

shireenf-ibm commented 3 weeks ago

> is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

I tried to start all test dirs with test_exposure. You may also see the tests in connlist_test.go that run with the exposureFlag. All output files of tests that run with exposure-analysis on start with exposure_ (this is determined in the code).

adisos commented 3 weeks ago

> is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

> I tried to start all dirs of tests with test_exposure , you may also see the tests in connlist_test.go which run with the exposureFlag. all output files of tests that run with exposure-analysis (on) will start with exposure_ (determined this in code )

can you also be consistent with the test dirs, so that they all have a common prefix? some start with "test_new_namespace_conn_and_entire_cluster_with_matching_pod", some with "testexposure", some with "test_egress_exposure", and maybe others...

adisos commented 3 weeks ago

Please add a short summary of the implementation flow in the issue description.

shireenf-ibm commented 2 weeks ago

> Please add a short summary of the implementation flow in the issue description.

done

shireenf-ibm commented 2 weeks ago

> is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

> I tried to start all dirs of tests with test_exposure , you may also see the tests in connlist_test.go which run with the exposureFlag. all output files of tests that run with exposure-analysis (on) will start with exposure_ (determined this in code )

> can you be consistent also with test dirs, so that they all have a common prefix? some start with "test_new_namespace_conn_and_entire_cluster_with_matching_pod" , some with "testexposure", some with "test_egress_exposure", and maybe others...

done; I also described the changes in the PR's description above.