uber-common / jvm-profiler

JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
Other
1.78k stars 342 forks source link

Improve yaml property access #21

Closed baghelamit closed 6 years ago

baghelamit commented 6 years ago

It will be better if yaml properties can be accessed using dot (".") notation. Current implementation stores Map and List from yaml in String format which is not very convenient while using these properties. Consider below yaml properties.

key1: value1
key3:
    key3a: value3a
    key3b:
    - value3b
    - value3c
override:
  override1: 
    key1: value11
    key3:
      key3a: value33a

Currently YamlConfigProvider stores them in Map<String, Map<String, List<String>>> which is like below.

{
         = {
        key1 = [value1],
        key3 = [{
                key3a = value3a,
                key3b = [value3b, value3c]
            }
        ]
    },
    override1 = {
        key1 = [value11],
        key3 = [{
                key3a = value33a
            }
        ]
    }

The proposed change will store them in same Map<String, Map<String, List<String>>> which will be like below.

{
         = {
        key1 = [value1],
        key3.key3a = [value3a],
        key3.key3b[0] = [value3b],
        key3.key3b[1] = [value3c]
    },
    override1 = {
        override.override1.key1 = [value11],
        override.override1.key3.key3a = [value33a]
    }
}

This is similar format how Spring loads yaml file. This change will still store the values in List<String> so it won't break existing functionalities. I have updated YamlConfigProviderTest class for this change.

CLAassistant commented 6 years ago

CLA assistant check
All committers have signed the CLA.

hiboyang commented 6 years ago

Hi Baghelamit, thanks for the change here! We tried to avoid complex configuration structure in the config file, because we want to support config through both command argument and config file.

For example, the command argument could be: -javaagent:target/jvm-profiler-0.0.9.jar=tag=mytag,metricInterval=5000. If we make the config file too complex, it will be hard to support that kind of format in command argument.

Feel free to suggest if you come out some format which is easy to support in both command argument and config file.

baghelamit commented 6 years ago

Thank you @boy-uber for the response. The proposed change will not affect the config passed from command argument. Let me explain the issue with current YamlConfigProvider further so we can come to some conclusion. Consider this simple yaml file.

server1:
  host: 127.0.0.1
  port: 8080
  username: user1
  password: password1

When I use it with existing YamlConfigProvider class, "Arguments.runConfigProvider()" method stores yaml values in "rawArgValues" (Map<String, List>) along with other command arguments like below.

{
    "tag" = ["mytag"],
    "sampleInterval" = ["100"],
    "metricInterval" = ["5000"],
    "configFile" = ["/opt/test.yaml"],
    "configProvider" = ["com.uber.profiling.YamlConfigProvider"],
    "reporter" = ["com.uber.profiling.reporters.ConsoleOutputReporter"],
    "server1" = ["{
            host = 127.0.0.1,
            port = 8080,
            username = user1,
            password = password1
        }"
    ],
    "durationProfiling" = ["com.uber.profiling.examples.HelloWorldApplication.publicSleepMethod"],
    "argumentProfiling" = ["com.uber.profiling.examples.HelloWorldApplication.publicSleepMethod.1"]
}

Here "{host=127.0.0.1, port=8080, username=user1, password=password1}" is single string value in the List. It is not easy to extract values for host, port, username and password properties for server1. I have to write some regex to extract values from this string representation of yaml Map.

With the proposed change for YamlConfigProvider class, "Arguments.runConfigProvider()" method will store yaml values in "rawArgValues" (Map<String, List>) along with other command arguments like below.

{
    "tag" = ["mytag"],
    "sampleInterval" = ["100"],
    "metricInterval" = ["5000"],
    "configFile" = ["/opt/test.yaml"],
    "configProvider" = ["com.uber.profiling.YamlConfigProvider"],
    "reporter" = ["com.uber.profiling.reporters.ConsoleOutputReporter"],
    "server1.host" = ["127.0.0.1"],
    "server1.port" = ["8080"],
    "server1.username" = ["user1"],
    "server1.password" = ["password1"],
    "durationProfiling" = ["com.uber.profiling.examples.HelloWorldApplication.publicSleepMethod"],
    "argumentProfiling" = ["com.uber.profiling.examples.HelloWorldApplication.publicSleepMethod.1"]
}

Here you can see that yaml properties are stored in a similar format what it is used for command arguments. Now as each values from yaml file are stored as single value in a List, "Arguments.getArgumentSingleValue" method can be used for them as well.

hiboyang commented 6 years ago

Hi @baghelamit, thanks for your explanation and detailed example! And thanks for trying to keep backward compatible :)

Originally we wanted to keep the config simple (as flat as possible). Now I kind of feel it is also good to support multiple level config like what you did in this code change. I added some comments there since the way you handle array values will break old code. Would you change that part?

baghelamit commented 6 years ago

Thanks @boy-uber. Yes, we can keep the processing of array same as before. I incorporated the review comments and committed the latest files. Please check.

hiboyang commented 6 years ago

@baghelamit thanks for the change! I added some other comments to make some small improvements.

baghelamit commented 6 years ago

Refactored the code and moved the logic to check for Map instance type inside addConfig method. Removed addMapConfig method.

hiboyang commented 6 years ago

Thanks @baghelamit for the refactoring! It looks pretty good now. I just added a minor comment so we could keep empty list as config value.

baghelamit commented 6 years ago

Hi @boy-uber, I don't think there will be any issue because of that line of code. Consider below yaml

key3:
    key3a: value3a
    key3b:
    - value3b
    - value3c

This yaml is stored as "key3.key3a"=["value3a"], "key3.key3b"=["value3b", "value3c"], "key3"=[]. Here "key3"=[] is invalid key-value and it is getting stored because "configValueList" has been created at the beginning.

List<String> configValueList = configMap.computeIfAbsent(key, k -> new ArrayList<>());

To remove invalid key-value "key3"=[] from configMap, I used "configMap.values().removeIf(List::isEmpty)"

Now as far as empty value in yaml file is concerned, "snakeyaml" ignores key with no value. So below key will be ignored.

key4:

But below key-value pairs will be stored as "key4"=[""] and "key5"=[""] respectively. And as [""] is not an empty list, it will not be removed from configMap by that line of code.

key4: ''

key5: ""
hiboyang commented 6 years ago

Thanks for the explanation. I think a better way is to put following

Map<String, List<String>> configMap = config.computeIfAbsent(override, k -> new HashMap<>());
         List<String> configValueList = configMap.computeIfAbsent(key, k -> new ArrayList<>());          List<String> configValueList = configMap.computeIfAbsent(key, k -> new ArrayList<>());

inside

if (value instanceof List) {
 ...
} else if (value instanceof Object[]) {
 ...
}

because we only need to add the key into configMap if the value is not a map. In that case we do not have the issue to remove empty list.

baghelamit commented 6 years ago

We only need to move "configValueList" inside if block. I committed the modified file.

hiboyang commented 6 years ago

Yes! Thanks so much for going through all these rounds of updating code! It looks pretty good now. Let me merge your pull request!