np-guard / netpol-analyzer

A Golang library for analyzing k8s connectivity-configuration resources (a.k.a. network policies)
Apache License 2.0

handling selectors with matchexpressions (fixed) #377

Closed — shireenf-ibm closed 1 week ago

shireenf-ibm commented 1 month ago


task:

  • [ ] support selectors with matchExpression (all operators)

in this PR:


more things were done in this PR:

adisos commented 1 month ago

general comment: the expression selectors could be represented as `LabelSelectorRequirement` from `go\pkg\mod\k8s.io\apimachinery@v0.29.2\pkg\apis\meta\v1\types.go`, instead of converting to `map[string]string` and then re-converting to a string? why do we need our own conversion? could we just add a separate field of this type for such selectors instead?

shireenf-ibm commented 1 month ago

> general comment: the expression selectors could be represented as `LabelSelectorRequirement` from `go\pkg\mod\k8s.io\apimachinery@v0.29.2\pkg\apis\meta\v1\types.go`, instead of converting to `map[string]string` and then re-converting to a string? why do we need our own conversion? could we just add a separate field of this type for such selectors instead?

I thought about this too, but found that it would require multiple changes and would not differ much for the "special" cases:

1. we could add `LabelSelectorRequirement` fields to the `k8s.Pod` and `eval.RepresentativePeer` interfaces and consider them wherever needed; but the `ExposedPeer` would still contain the representative labels as `map[string]string` (so it is preferable to be consistent from the beginning).
2. in the `In` operator case, I preferred to convert the expression into a key:value label for each value (so a new representative peer is created for each). For the other operators, the conversion only produces a single key:value pair whose value is "special" and needs to be compared specifically (the same comparisons would be done if we used `LabelSelectorRequirement`).
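To make the `In`-operator handling concrete, here is a minimal, self-contained sketch of the conversion described above. `LabelSelectorRequirement` is re-declared locally (simplified from the apimachinery type), and `expandInRequirement` is a hypothetical helper, not the PR's actual function name: it turns one `In` expression into one `map[string]string` selector per value, so that a separate representative peer can be created for each.

```go
package main

import "fmt"

// LabelSelectorRequirement mirrors the k8s.io/apimachinery type referenced
// above (simplified here so the sketch is self-contained).
type LabelSelectorRequirement struct {
	Key      string
	Operator string // "In", "NotIn", "Exists", "DoesNotExist"
	Values   []string
}

// expandInRequirement is a hypothetical helper illustrating the approach:
// for an "In" expression, emit one map[string]string selector per value.
func expandInRequirement(req LabelSelectorRequirement) []map[string]string {
	selectors := make([]map[string]string, 0, len(req.Values))
	for _, v := range req.Values {
		selectors = append(selectors, map[string]string{req.Key: v})
	}
	return selectors
}

func main() {
	req := LabelSelectorRequirement{Key: "env", Operator: "In", Values: []string{"dev", "prod"}}
	for _, sel := range expandInRequirement(req) {
		fmt.Println(sel) // one selector (and thus one representative peer) per value
	}
}
```

The other operators (`NotIn`, `Exists`, `DoesNotExist`) cannot be expressed as plain equality labels, which is why the comment above describes them as producing a single "special" value that needs dedicated comparison logic.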

shireenf-ibm commented 1 month ago

some comments regarding the last commit: current code changes:

a suggestion: (not implemented here)

shireenf-ibm commented 1 month ago

attaching a list of the tests that were added in this PR; I will later update the file with all exposure-analysis tests: exposure_analysis_tests.csv

shireenf-ibm commented 1 month ago

some review tasks:

  1. eval pkg:

     `ScanPolicyRulesAndUpdateGeneralConns`

     `getSelectorsAndUpdateGeneralConns(FromRules)`

  2. connlist.go:

     exposure analysis related:

shireenf-ibm commented 4 weeks ago

hi @adisos, regarding splitting namespaces with policies in `AddObjectsForExposureAnalysis` and having `resolveSingleMissingNamespace` called twice (when exposure-analysis is on/off):

I found that we really should split namespaces at the beginning (with policies)

I have committed an example that will give us a misleading result if the namespaces were not split at the beginning.

in the example (found in tests/test_exposure_with_real_pod_and_namespace) we have:

  1. a real namespace in the manifests with the label name:ns2, and a real pod in that namespace.
  2. the netpol's rule exactly matches the labels of the namespace and the pod.

here are the results with and without splitting the namespaces:

  1. with splitting the namespaces, we get the correct result (no exposure inferred from the rule, since we have an exact match):

(screenshot: results_with_namespaces_split)

  2. without splitting the namespaces, we get an exposure result with the unnecessary line: hello-world/workload-a[Deployment] <= [namespace with {name=ns2}]/[pod with {app=b-app}] : All Connections

(screenshot: results_without_namespaces_split)

/////////// an explanation:

* [ ]   func (pe *PolicyEngine) AddObjects(objects []parser.K8sObject):
  why should we add namespaces at the beginning, and not only policy objects?
  • if we don't split the namespaces at the beginning, and the policy engine gets the pod object first (the parsed yaml object), then the policy engine will resolve the missing namespace of the pod with only the default k8s name label (which is not the `name` label). The resolved namespace will not contain the actual labels from the real namespace's yaml, and so the redundant matching representative peer won't be removed.
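A small sketch of why the fabricated namespace misses the real labels. Kubernetes automatically attaches the `kubernetes.io/metadata.name` label to every namespace, so that is the only label a policy engine can safely assume when it fabricates a namespace it has not seen. `resolveMissingNamespace` below is a hypothetical stand-in for `resolveSingleMissingNamespace`, not the library's actual code:

```go
package main

import "fmt"

// k8sNsNameLabelKey is the label every namespace automatically carries.
const k8sNsNameLabelKey = "kubernetes.io/metadata.name"

// resolveMissingNamespace fabricates a namespace whose only label is the
// default name label; labels from the (not-yet-seen) namespace YAML,
// such as "name: ns2", cannot be recovered.
func resolveMissingNamespace(nsName string) map[string]string {
	return map[string]string{k8sNsNameLabelKey: nsName}
}

func main() {
	labels := resolveMissingNamespace("ns2")
	// The real ns2 YAML also carries "name: ns2", but the fabricated
	// namespace does not, so a rule selecting {name: ns2} is no longer
	// recognized as an exact match.
	fmt.Println(labels["name"] == "ns2") // prints "false"
}
```

This is exactly the mismatch that produces the misleading extra exposure line in the example above.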

/////////////////////// regarding calling the resolveSingleMissingNamespace twice:

* [ ]  in function:

  func (pe *PolicyEngine) addObjectsByKind(objects []parser.K8sObject) error,
  why:

  ```go
  if !pe.exposureAnalysisFlag { // for exposure analysis, this was already done
      return pe.resolveMissingNamespaces()
  }
  ```
* resolveSingleMissingNamespace() : change so that it is called once instead of twice in different places for exposure analysis on/off

-- so when exposure-analysis is on: the namespaces are inserted into the policy engine first, so we can use resolveSingleMissingNamespace when inserting a pod/workload; if its namespace is not found in the policy engine, it is for sure not in the resources either

-- but when exposure-analysis is off: when adding a pod/workload to the policy engine, its namespace might be in the resources but not added yet (the objects list is not sorted, and we cannot predict which resource comes first); so we can't call resolveSingleMissingNamespace until all k8s objects from the resources have been added to the policy engine (otherwise, a real namespace with specified labels may be missed)
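The two cases above can be sketched as a small toy model (all names here — `engine`, `addPod`, `finish` — are hypothetical, not the library's API): when exposure analysis is on, a missing namespace is fabricated immediately, because namespaces were all inserted first; when it is off, resolution is deferred until every object has been added, so a real namespace arriving later keeps its labels.

```go
package main

import "fmt"

// engine is a toy model of the ordering rule described above.
type engine struct {
	exposureAnalysis bool
	namespaces       map[string]map[string]string // ns name -> labels
	pendingPods      []string                     // namespaces still unresolved
}

func (e *engine) addPod(podNs string) {
	if _, ok := e.namespaces[podNs]; ok {
		return // namespace already known
	}
	if e.exposureAnalysis {
		// namespaces were all added first, so a miss is definitive:
		// fabricate the namespace right away.
		e.namespaces[podNs] = map[string]string{"kubernetes.io/metadata.name": podNs}
		return
	}
	// exposure analysis off: the real namespace YAML may still be coming,
	// so defer resolution.
	e.pendingPods = append(e.pendingPods, podNs)
}

// finish is called once all objects were added; only now is it safe to
// fabricate namespaces that never appeared in the resources.
func (e *engine) finish() {
	for _, ns := range e.pendingPods {
		if _, ok := e.namespaces[ns]; !ok {
			e.namespaces[ns] = map[string]string{"kubernetes.io/metadata.name": ns}
		}
	}
	e.pendingPods = nil
}

func main() {
	e := &engine{namespaces: map[string]map[string]string{}}
	e.addPod("ns2")                                        // deferred: flag is off
	e.namespaces["ns2"] = map[string]string{"name": "ns2"} // real YAML arrives later
	e.finish()
	fmt.Println(e.namespaces["ns2"]["name"]) // prints "ns2": real labels preserved
}
```

If `addPod` had fabricated the namespace eagerly in the off case, the later real YAML would have been shadowed and the `name: ns2` label lost — the exact failure mode described above.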

adisos commented 3 weeks ago

is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

shireenf-ibm commented 3 weeks ago

> is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

I tried to start all test dirs with test_exposure. You may also see the tests in connlist_test.go that run with the exposureFlag. All output files of tests that run with exposure-analysis on start with exposure_ (this is determined in the code).

adisos commented 3 weeks ago

> is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

> I tried to start all dirs of tests with test_exposure , you may also see the tests in connlist_test.go which run with the exposureFlag. all output files of tests that run with exposure-analysis (on) will start with exposure_ (determined this in code )

can you also be consistent with the test dirs, so that they all have a common prefix? some start with "test_new_namespace_conn_and_entire_cluster_with_matching_pod", some with "testexposure", some with "test_egress_exposure", and maybe others...

adisos commented 3 weeks ago

Please add a short summary of the implementation flow in the issue description.

shireenf-ibm commented 2 weeks ago

> Please add a short summary of the implementation flow in the issue description.

done

shireenf-ibm commented 2 weeks ago

> is there a convention for connlist tests with exposure analysis? how can one identify which tests are relevant to exposure analysis?

> I tried to start all dirs of tests with test_exposure , you may also see the tests in connlist_test.go which run with the exposureFlag. all output files of tests that run with exposure-analysis (on) will start with exposure_ (determined this in code )

> can you be consistent also with test dirs, so that they all have a common prefix? some start with "test_new_namespace_conn_and_entire_cluster_with_matching_pod" , some with "testexposure", some with "test_egress_exposure", and maybe others...

done; I also described the changes in the PR's description above.