tdunning / log-synth

Generates more or less realistic log data for testing simple aggregation queries.
Apache License 2.0

Add new class for predefined arbitrary string distribution #3

Closed jbubier closed 11 years ago

jbubier commented 11 years ago

One suggested enhancement is a new class that picks values at random from a set of predefined strings. For example, UHC has a column "active_flg" in their tables that marks whether a particular record/patient is active. This field is always either "Y" or "N". With the current string generation classes (address and name), there is no way to produce data that matches the filters in their queries without modifying the output data.

tdunning commented 11 years ago

Added a string generator:

[
  {"name":"id", "class":"id"},
  {"name":"name", "class":"name", "type":"first_last"},
  {"name":"gender", "class":"string", "dist":{"MALE":0.5, "FEMALE":0.5, "OTHER":0.02}},
  {"name":"address", "class":"address"},
  {"name":"first_visit", "class":"date", "format":"MM/dd/yyyy"}
]
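To make the "dist" idea concrete, here is a rough Python sketch of weighted string sampling. It is not log-synth's code. It assumes the numbers in "dist" are relative weights that get normalized (the example weights above sum to 1.02), which I have not confirmed against the implementation. The helper names `normalize` and `sample_strings` are made up for illustration.

```python
import random


def normalize(dist):
    """Turn relative weights into probabilities that sum to 1 (assumed behavior)."""
    total = sum(dist.values())
    return {value: weight / total for value, weight in dist.items()}


def sample_strings(dist, n, seed=None):
    """Draw n strings from dist, a mapping of string -> relative weight."""
    rng = random.Random(seed)  # seeded so a test run can be repeated
    values = list(dist)
    weights = [dist[v] for v in values]
    # random.choices accepts unnormalized relative weights
    return rng.choices(values, weights=weights, k=n)


# Hypothetical weights for the "active_flg" column from the original request
active_flg = {"Y": 0.5, "N": 0.5}
rows = sample_strings(active_flg, 10, seed=1)
```

Under the same assumption, the column from the original request could be described in the schema as {"name":"active_flg", "class":"string", "dist":{"Y":0.5, "N":0.5}}, where the 0.5/0.5 split is only a placeholder.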