smartcat-labs / ranger

Ranger is contextual data generator used to make sensible data for integration tests or to play with it in the database
https://www.smartcat.io/products/ranger-contextual-data-and-load-generator
Apache License 2.0
59 stars 11 forks source link

Support for UserDefinedValue can be fetched from configured json DataSource #183

Open hemanthkilari opened 6 years ago

hemanthkilari commented 6 years ago

Created support for UsedDefinedValue to be fetched from a Json or http endpoint along with option to apply filter on the Datasource. This requirement is needed for me, when I tried creating berserker configurations in the form of a workflow for a Rest service having dependent rest Endpoints. And I believe that this is a requirement that most berserker users like.

Method Usage: userDefinedValue("DataSourceName","JsonPathQuery","refreshIntervalInSeconds","regExpressionToFetchExpectedStringFromFetchedField")

DataSourceName: Can be a name of JSON file available in classpath/ A Rest endpoint that return a JSON response

JsonPathQuery: JSON path query to extract a Filed from specified Data Source. For semantics, please refer to https://github.com/json-path/JsonPath

refreshIntervalInSeconds: To refresh the data at configured intervals if the source support dynamic data retrieval.

regExpressionToFetchExpectedStringFromFetchedField: To support the case where the requirement is not complete field and only specific part.

example definitions: Sample JSON: { "firstName": "John", "lastName" : "doe", "age" : 26, "address" : { "streetAddress": "naist street", "city" : "Nara", "postalCode" : "630-0192" }, "phoneNumbers": [ { "type" : "iPhone", "number": "0123-4567-8888" }, { "type" : "home", "number": "0123-4567-8910" } ] }

userDefinedValue("resources","$.phoneNumbers[].type","100"," ") userDefinedValue("http:/url/resources","$.phoneNumbers[].type","100"," ")

above Value Definition will evaluate to Values "iPhone", "home"

vajda commented 6 years ago

Hi Hemanth, Thanks for your interest in Ranger project. I'm not sure I understand your need. You need HTTP (JSON) DataSource for Berserker the same way as CSV DataSource exists within Ranger project?

hemanthkilari commented 6 years ago

@vajda Yes I need HTTP (JSON) DataSource for Berserker the same way as CSV DataSource exists within Ranger project

vajda commented 6 years ago

@hemanthkilari The feature request you made is reasonable and something we already had in our mid-term plan, but unfortunately haven't had enough time to implement it. I hope your proposed implementation solves your problem and that you can build Berserker version with this ranger feature included and use it to solve your problem. However, proposed implementation does not reflect the approach to solution we want to have and we cannot accept your pull request as such. We will be able to implement it in next 2-3 months due to current lack of time. Or if you are interested, maybe we can guide your proposed solution to a point where we both are satisfied with the outcome. What do you think about it?

hemanthkilari commented 6 years ago

@vajda Let us discuss your approach. We will take decision based on outcome.

vajda commented 6 years ago

@hemanthkilari Ok, first of all name userDefinedValue is misleading, since this will handle only documents returning JSON. I'd suggest to call it fetchJson or getJson or something in that fashion. Let's discuss parameters:

  1. DataSourceName, I'd call it just source, classpath/http endpoint semantics is pretty good idea
  2. JsonPathQuery, it would return subtree of JSON response
  3. refreshIntervalInSeconds, although I understand necessity of this performance wise, this is something that I would not introduce on this level. I'd rather have some additional wrapper DSL type to handle caching of values in case value retrieval time is large enough. Something like: cached(fetchJson('http://x.y.z', '$.a.b'))
  4. regExpressionToFetchExpectedStringFromFetchedField This is also something that should be solved through additional transformer value type, similar to third param as it can benefit other value types as well (csv for example) As this will return subtree of JSON which effectively will be a map, get can be used for further extraction.

I'd like that we come up with common agreement on this conceptual level first, then we can discuss implementation approach and details.

hemanthkilari commented 6 years ago

@vajda I have no issues with your suggestions. The presented code was written for our need. I do not mind in changing it to satisfy Ranger specification. Provide me other details to proceed with implementation.

vajda commented 6 years ago

@hemanthkilari Well, change the name of function to fetchJson and value class to FetchJsonValue, remove the third and fourth parameter. Semantics for remaining two parameters (the first and the second) can be as you already proposed. There should not be e.printStackTrace(); in the code. In all the cases where checked exception needs to be caught, rethrow RuntimeException like:

catch (Exception e) {
            throw e instanceof RuntimeException ? (RuntimeException) e : new RuntimeException(e);
        }

Besides ValueExpressionParser which represents YAML API for Ranger, there is also BuilderMethods class which represents Java API. This class needs to provide method for creation of FetchJsonValue. Also tests should be added and documentation files updated (yaml-configuration.md and java-api.md)

I think that's all. When you make these changes, I'll do a code review on the pull request.

Once again, thanks for your interest in Ranger, your help is very much appreciated and welcome!