xlab-uiuc / IDoCT

Illinois Dataset of Configuration Tests
3 stars 23 forks source link

[Add] add several tests for parameters of two test mehods in hadoop, fix a bug in ctest running script #4

Closed qsdrqs closed 2 years ago

qsdrqs commented 2 years ago

Hi, I noticed the recruitment for the summer project yesterday and have just submitted my application, hope that there are still open positions :)

This PR has two commits. The first commit is a bug fixing. Because the run_single_ctest.py script will fail when the value of parameters contains "=". In hadoop, the hadoop.user.group.static.mapping.overrides parameter of org.apache.hadoop.security.authentication.server.TestProxyUserAuthenticationFilter#testFilter will take equal signs in parameter value and this will cause the original program to fail. So this commit fix that problem.

In the second commit, I tested all parameters in org.apache.hadoop.security.ssl.TestSSLFactory#testServerWeakCiphers and org.apache.hadoop.security.authentication.server.TestProxyUserAuthenticationFilter#testFilter, proved that the CTest is both correct and effective. I didn't find any tests that will cause CTest fails yet, so if you give me some tests that make CTest fail, I can try to figure out how to modify the tests.

One interesting thing I noticed when running the parameter tests is that there are two BAD, PASS tests in my tests. They are

hadoop.security.groups.cache.background.reload.threads,org.apache.hadoop.security.authentication.server.TestProxyUserAuthenticationFilter#testFilter, -1, BAD, PASS
hadoop.security.groups.cache.warn.after.ms,org.apache.hadoop.security.authentication.server.TestProxyUserAuthenticationFilter#testFilter, -1, BAD, PASS

Time million second and thread numbers can be negative when setting the parameter in tests and these will not cause the failure of tests. These are caused in hadoop-common-project/hadoop-commo/src/main/java/org/apache/hadoop/security/Groups.java, where Hadoop directly convert the configuration to int or long, regardless of whether they are in the correct range. The code pieces shows like this:

....
    warningDeltaMs =
      conf.getLong(CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_WARN_AFTER_MS,
        CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_WARN_AFTER_MS_DEFAULT);
....
    reloadGroupsThreadCount  =
      conf.getInt(
          CommonConfigurationKeys.
              HADOOP_SECURITY_GROUPS_CACHE_BACKGROUND_RELOAD_THREADS,
          CommonConfigurationKeys.
              HADOOP_SECURITY_GROUPS_CACHE_BACKGROUND_RELOAD_THREADS_DEFAULT);
....

I think it might be a bug or an assumption that users won't use incorrect range in config. But I still think that there should be an exception to handle it in tests.

shuaiwang516 commented 2 years ago

Thanks for the PR @qsdrqs

The first commit is a bug fixing.

Good catch. For the fix it would be better to split param and value by the first index of = since the value of parameter may contain multiple =. (e.g., hadoop.configured.node.mapping=n1=/r1,n2=/r2)

I didn't find any tests that will cause CTest fails yet, so if you give me some tests that make CTest fail, I can try to figure out how to modify the tests.

You can try to find tests: (1) uses conf.set() to set some configuration parameters; (2) uses assert to make it can only pass under specific configuration value. You can think of how to rewrite them to ctest while still keeping the testing logic.

One interesting thing I noticed when running the parameter tests is that there are two BAD, PASS tests in my tests.

This is interesting. To confirm it you can check the actual value used in the test -- whether it is still -1 or somehow be converted to a "correct" value. If it is still -1, why the test still passes? What is the testing logic here? Why -1 does not cause any failure?

Looking forward to your updates.

tianyin commented 2 years ago

Hooray!!! This PR is damn awesome! It has all the cool things: a bug fix in the ctest script, generated ctests, and a bug in the Hadoop code!

qsdrqs commented 2 years ago

Thanks for the reply!

Good catch. For the fix it would be better to split param and value by the first index of = since the value of parameter may contain multiple =. (e.g., hadoop.configured.node.mapping=n1=/r1,n2=/r2)

Yes. So this PR really did that, it only parses the first index of "=" and leaves all other parts of the argument as parameters and values.

This is interesting. To confirm it you can check the actual value used in the test -- whether it is still -1 or somehow be converted to a "correct" value. If it is still -1, why the test still passes? What is the testing logic here? Why -1 does not cause any failure?

I just took a deep look into Hadoop sources and figured out what happened. In the second test, the source code logic about hadoop.security.groups.cache.warn.after.ms shows as

...
      long deltaMs = endMs - startMs ;
      UserGroupInformation.metrics.addGetGroups(deltaMs);
      if (deltaMs > warningDeltaMs) {
        LOG.warn("Potential performance problem: getGroups(user=" + user +") " +
          "took " + deltaMs + " milliseconds.");
      }
...

It looks like a log filter so that when deltaMs is more than warningDeltaMs then it will emit a warning, and warningDeltaMs is set by the parameter above. When warningDeltaMs <= 0, these codes will emit the warning all the time. Although setting this parameter less than 0 will not break the logic of the program, it might be better to stipulate that this value should not be minus because it's a little confusing when setting it to negative.

For the first test, it's a little complex. The related code that uses this parameter is this:

      if (reloadGroupsInBackground) {
...
        ThreadPoolExecutor parentExecutor =  new ThreadPoolExecutor(
            reloadGroupsThreadCount,
            reloadGroupsThreadCount,
            60,
            TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(),
            threadFactory);
...
      }

In this code piece, the reloadGroupsInBackground is set by hadoop.security.groups.cache.background.reload and hadoop.security.groups.cache.background.reload.threads parameter will be valid only if hadoop.security.groups.cache.background.reload is set to true, but this is false as default.

So this BAD, PASS is caused by the failure of "config dependency". Although CTest can use multiple parameters in one test, I don't know whether the parameters are dependent before looking into the source code so I tested it individually. So that causes this false negative. Perhaps we need some way to figure out the dependency of parameters before setting them, but I don't come up with any good idea yet...

shuaiwang516 commented 2 years ago

This looks good to me and ready to be merged.

@qsdrqs could you please resolve the conflicts? I will merge your PR once you've done it.

qsdrqs commented 2 years ago

resolved!

You can check whether it's ready to be merged.