Closed tylergu closed 1 year ago
Thanks for the write-ups @tylergu .
My intuition tells me that dynamic analysis is needed for Challenges 1 and 2.
My main question again is about the tradeoff -- if we don't want to do precise dynamic analysis as we discussed on the road, how big is the problem?
Or, putting in a different way, if we only do static analysis and some practices (e.g., enable all features), what is the limit we can push towards?
Let me elaborate more. I understand that there are hard cases that the return value is dependent on the inputs. But my reading is that it's not the common case. From your example, the most common cases are still very basic values like boolean
(e.g., Enabled
) and nil
.
The number I think is very important to know is that if we use the good practice (trying to enable all the features) and solve the simple boolean/nil values, do we solve 90% of the problem? Or 10%?
I understand the tendency of killing all the FPs. But, again, I think it should be done at least step by step. I'm too eager to start to build another dynamic analysis tool for multiple reasons.
For the version dependency, I would suggest not to attack it now by annotating the version field out. It's a trivial point and not too interesting TBH.
Yes, I think the majority of the problems can be solved if we can address the challenge 1 as you can see most of the false alarms are only dependent on challenge 1. We can address the challenge 1 by storing some information when we propagate through binary operations. Since most of the comparison are done against constants, we should be able to figure out most of the condition requirements statically.
In terms of challenge 2, I agree that it is very hard for static analysis to address it. I think we can first focus on addressing the challenge 1 using static analysis.
Program analysis for solving the field-dependency false alarms
We are trying to use static analysis to retrieve the dependency information among the CR fields. We do this by doing a taint analysis pass on all the variables which correspond to a CR field, and check if any if-statement condition has data-flow dependency on any CR field variable.
Then we perform dominator and post-dominator analysis on the if-statements to find the CR fields that are dominated by the if-statement.
However, there are two challenges we need to address:
1. Need condition information for each dependency relation
Only figuring out the dependency information is not enough for Acto to prune out the false alarms, Acto also needs the accurate conditions to enter each branch. For example, in the rabbitmq-operator example shown below, Acto needs to know that the dominees are only effective when the field
spec.secretBackend.vault.tls.PKIIssuerPath
is not""
. This requires us to keep track of the data-flow dependency when doing the taint analysis pass and figure out how to satisfy the branch conditions.The end result of this dependency analysis should be a map from the CR fields to the conditions needed to make the field effective. For example:
This means for the field
spec.tlsName
to be effective,spec.tlsEnabled
needs to betrue
andspec.vault
needs to benil
.Right now, our analysis only tells us:
2. Need to propogate through control-flow dependency when trying to figure out which if-condition depends on CR fields
Another problem is about the static analysis' propogation rule. In the analysis, we only care about data-flow dependency. We check if the if-statement's condition has data-flow dependency on some CR fields. But the problem arises when the data is a boolean, the control-flow dependency intermingles with data-flow dependency. For example as shown below, the return value of the function
MutualTLSEnabled
depends on(cluster.SecretTLSEnabled() && cluster.Spec.TLS.CaSecretName != "") || cluster.VaultTLSEnabled()
. In the block 2, it shows the return value has data dependency on the valuet1
which represents the fieldcluster.VaultTLSEnabled()
, but has no data dependency on other fields. So I think data-flow dependency is not enough here, we may also need control-flow dependency:rabbitmq-operator
rabbitmq-operator has 7 out of 8 false alarms caused by field dependency (challenge 1)
This is the code snippet that shows the dependency. It's clear that block 3 and its dominees are dominated by block 2, and block 4 post-dominates block 2. The fields ['spec', 'secretBackend', 'vault', 'tls', 'altNames'], ['spec', 'secretBackend', 'vault', 'tls', 'ipSans'] are only used in block 3 and its dominees.
We can see that block 2 is condition branch on the return value of the (*github.com/rabbitmq/cluster-operator/api/v1beta1.VaultSpec).TLSEnabled(t3) function call. The function body is shown below, and we can see that this function returns
true
if the fieldspec.TLS.PKIIssuerPath
is not empty string.percona-server-mongodb-operator
The exact number of false alarms caused by field dependency for mongodb-operator is out-of-date, I need to rerun it to get accurate number.
The last run was in May and 122 out of 364 false alarms are caused by field dependency.
52 out of 122 can be fixed by setting enabled to true
47 out of 122 are version dependencies
This operator is a mess. The biggest challenge to address field dependency in this operator is that: This operator encodes the CR version backward compatibility inside the operator code, which makes it super hard to track which functionality is supported in which version.
They compare version by calling a library function, which makes it very hard for us figure out the version dependency.
23 out of 122 are caused by some fields depend on the
replset.Storage.Engine
to be some exact engine type. (challenge 1)The source code looks like this:
You can see that the Storage.Engine has two options:
WiredTiger
andInMemory
. The switch statement at IR level is converted to nested if statements, so it is not a concern. While theInMemory
condition is trivial to manage, theWiredTiger
branch contains a lot of nested if conditions and makes the dependency much more complex.The IR code is below: You can see blocks 21 and 26 are where the switch conditions are. Although they are not compared to some string constants, they are compared against some constant variables and I think we can manage to extract that out.
The IR code
``` 21: if.then P:1 S:2 t114 = &replset.Storage [#11] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t115 = *t114 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t116 = &t115.Engine [#0] *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.StorageEngine t117 = *t116 github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.StorageEngine t118 = *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.StorageEngineWiredTiger github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.StorageEngine t119 = t117 == t118 bool if t119 goto 24 else 26 22: if.done P:3 S:2 t120 = phi [19: t110, 45: t315, 46: t330] #args []string t121 = (*github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.PerconaServerMongoDB).CompareVersion(m, "1.12.0":string) int t122 = t121 < 0:int bool if t122 goto 47 else 48 23: switch.done P:5 S:2 t123 = phi [38: t263, 25: t149, 26: t110, 43: t263, 42: t302] #args []string t124 = &replset.Storage [#11] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t125 = *t124 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t126 = &t125.DirectoryPerDB [#1] *bool t127 = *t126 bool if t127 goto 44 else 45 24: switch.body P:1 S:2 t128 = (*github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.PerconaServerMongoDB).CompareVersion(m, "1.12.0":string) int t129 = t128 < 0:int bool if t129 goto 29 else 28 25: switch.body P:1 S:1 t130 = &t0.Limits [#0] *k8s.io/api/core/v1.ResourceList t131 = *t130 k8s.io/api/core/v1.ResourceList t132 = &replset.Storage [#11] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t133 = *t132 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t134 = &t133.InMemory [#3] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecInMemory t135 = *t134 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecInMemory t136 = &t135.EngineConfig [#0] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecInMemoryEngineConfig t137 = *t136 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecInMemoryEngineConfig t138 = &t137.InMemorySizeRatio [#0] *github.com/percona/percona-server-mongodb-operator/pkg/util/numstr.NumberString t139 = (*github.com/percona/percona-server-mongodb-operator/pkg/util/numstr.NumberString).Float64(t138) float64 t140 = getWiredTigerCacheSizeGB(t131, t139, false:bool) float64 t141 = new [1]any (varargs) *[1]any t142 = &t141[0:int] *any t143 = make any <- float64 (t140) any *t142 = t143 t144 = slice t141[:] []any t145 = fmt.Sprintf("--inMemorySizeGB=...":string, t144...) string t146 = new [1]string (varargs) *[1]string t147 = &t146[0:int] *string *t147 = t145 t148 = slice t146[:] []string t149 = append(t110, t148...) []string jump 23 26: switch.next P:1 S:2 t150 = *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.StorageEngineInMemory github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.StorageEngine t151 = t117 == t150 bool if t151 goto 25 else 23 27: if.then P:1 S:2 t152 = new [2]string (varargs) *[2]string t153 = &t152[0:int] *string *t153 = "--enableEncryption":string t154 = &t152[1:int] *string *t154 = "--encryptionKeyFi...":string t155 = slice t152[:] []string t156 = append(t110, t155...) []string t157 = &m.Spec [#2] *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.PerconaServerMongoDBSpec t158 = &t157.Mongod [#8] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpec t159 = *t158 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpec t160 = &t159.Security [#4] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecSecurity t161 = *t160 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecSecurity t162 = &t161.EncryptionCipherMode [#3] *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodChiperMode t163 = *t162 github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodChiperMode t164 = t163 != "":github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodChiperMode bool if t164 goto 30 else 28 28: if.done P:4 S:2 t165 = phi [24: t110, 29: t110, 27: t156, 30: t192] #args []string t166 = new k8s.io/apimachinery/pkg/api/resource.Quantity (limit) *k8s.io/apimachinery/pkg/api/resource.Quantity t167 = &t0.Limits [#0] *k8s.io/api/core/v1.ResourceList t168 = *t167 k8s.io/api/core/v1.ResourceList t169 = t168["memory":k8s.io/api/core/v1.ResourceName],ok (k8s.io/apimachinery/pkg/api/resource.Quantity, bool) t170 = extract t169 #0 k8s.io/apimachinery/pkg/api/resource.Quantity *t166 = t170 t171 = extract t169 #1 bool if t171 goto 33 else 32 29: cond.true P:1 S:2 t172 = &m.Spec [#2] *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.PerconaServerMongoDBSpec t173 = &t172.Mongod [#8] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpec t174 = *t173 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpec t175 = &t174.Security [#4] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecSecurity t176 = *t175 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecSecurity t177 = &t176.EnableEncryption [#1] **bool t178 = *t177 *bool t179 = *t178 bool if t179 goto 27 else 28 30: if.then P:1 S:1 t180 = &m.Spec [#2] *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.PerconaServerMongoDBSpec t181 = &t180.Mongod [#8] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpec t182 = *t181 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpec t183 = &t182.Security [#4] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecSecurity t184 = *t183 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecSecurity t185 = &t184.EncryptionCipherMode [#3] *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodChiperMode t186 = *t185 github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodChiperMode t187 = changetype string <- github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodChiperMode (t186) string t188 = "--encryptionCiphe...":string + t187 string t189 = new [1]string (varargs) *[1]string t190 = &t189[0:int] *string *t190 = t188 t191 = slice t189[:] []string t192 = append(t156, t191...) []string jump 28 31: if.then P:1 S:1 t193 = &t0.Limits [#0] *k8s.io/api/core/v1.ResourceList t194 = *t193 k8s.io/api/core/v1.ResourceList t195 = &replset.Storage [#11] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t196 = *t195 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t197 = &t196.WiredTiger [#5] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTiger t198 = *t197 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTiger t199 = &t198.EngineConfig [#1] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTigerEngineConfig t200 = *t199 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTigerEngineConfig t201 = &t200.CacheSizeRatio [#0] *github.com/percona/percona-server-mongodb-operator/pkg/util/numstr.NumberString t202 = (*github.com/percona/percona-server-mongodb-operator/pkg/util/numstr.NumberString).Float64(t201) float64 t203 = getWiredTigerCacheSizeGB(t194, t202, true:bool) float64 t204 = new [1]any (varargs) *[1]any t205 = &t204[0:int] *any t206 = make any <- float64 (t203) any *t205 = t206 t207 = slice t204[:] []any t208 = fmt.Sprintf("--wiredTigerCache...":string, t207...) string t209 = new [1]string (varargs) *[1]string t210 = &t209[0:int] *string *t210 = t208 t211 = slice t209[:] []string t212 = append(t165, t211...) []string jump 32 32: if.done P:3 S:2 t213 = phi [28: t165, 33: t165, 31: t212] #args []string t214 = &replset.Storage [#11] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t215 = *t214 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecStorage t216 = &t215.WiredTiger [#5] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTiger t217 = *t216 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTiger t218 = &t217.CollectionConfig [#0] **github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTigerCollectionConfig t219 = *t218 *github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTigerCollectionConfig t220 = t219 != nil:*github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1.MongodSpecWiredTigerCollectionConfig bool if t220 goto 34 else 35 ```xtradb-operator
xtradb-operator has 175 out of 308 false alarms caused by field dependency
60 of 175 can be fixed by setting enabled to
true
115 of 175 are caused by two mutual exclusive functionalities (challenge 1)
xtradb-operator supports two kinds of proxies: HAProxy and ProxySQL. These two proxies are mutual exlusive, only one can be enabled at a time. So our
turn all boolean variables to true
strategy won't work for this operator.We will have to use static analysis to figure out that all
spec.HAProxy.*
fields depend on thespec.HAProxy.Enabled == true
.tidb-operator
tidb-operator has 10 false alarms caused by field dependency.
2 alarms are caused because some fields depend on the return value of the
IsTLSClusterEnabled
function. (challenege 1 & 2)As shown below, the return value of the function
IsTLSClusterEnabled
depends on values of two fieldstc.Spec.TLSCluster != nil && tc.Spec.TLSCluster.Enabled
However, the return value in the block 2 only has data dependency on valuet8
which representstc.Spec.TLSCluster.Enabled
.t9
is set to false if the control flow comes from block 0.2 alarms caused because
spec.slowLogTailer.image
depends onspec.helper.image == nil
which can directly retrieved (challenege 1)2 alarms caused because some fields depend on the return value of
ShouldSeparateRocksDBLog
function (challenege 1 & 2)As shown below, this
ShouldSeparateRocksDBLog
returnstrue
ifspec.tidb.SeparateRocksDBLog
isnil
, otherwise, it returns the value ofspec.tidb.SeparateRocksDBLog
.We can not accurately handle this when
spec.tidb.SeparateRocksDBLog
isnil
because the return value only has control-flow dependency onspec.tidb.SeparateRocksDBLog
4 alarms are caused because
spec.tikv.failover.recoverByUID
depends on some runtime conditionsThis field only takes effect when tidb-operator performs a recover. If recover never happens, this field does not have effect. This false alarm can not be handled by our static analysis.