Overview

I came across a template data generator called Dummy JSON which has a very similar syntax to this project. I was initially going to request some tweaks to more closely mimic features of Dummy JSON (synchronized helpers, overrides, arrays), but it seemed cleaner to simply suggest that this package have a way of specifying different templating dialects. The default dialect is the one already supported by the package; if a user takes no special action to specify a different dialect, the package behaves as if other dialects don't exist and simply uses the default.

How it would work

I propose that the top of the template file have limited support for bash/python/yaml "#" style comments, since the JSON format doesn't normally support comments. We can borrow from a convention called File Variables, used by Emacs and other text editors, normally to denote file encoding and other preferred behaviors at the top of a file, like so, where key/value pairs are delimited by semicolons...

# -*- mode: python-mode; coding: utf-8; python-indent-offset: 4 -*-

...instead of the previous example, the first line of the template file would look more like one of these if specified:

# -*- template-dialect: pvienneau/atom-template-json -*-
# -*- template-dialect: webroo/dummy-json -*-
# -*- template-dialect: acme/some-other-dialect -*-

...or even more simply: # template-dialect: webroo/dummy-json

The header comments could be stripped after initial parsing to pass the data through to the respective dialect parser. This concept could be leveraged to include fields that specify includes/overrides for sample data via a file or even inline.
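As a rough illustration of the header handling described above, here is a minimal sketch in JavaScript. The function name `splitDialectHeader` and the constant `DEFAULT_DIALECT` are hypothetical and not part of the package; the sketch assumes the header consists only of leading "#" lines and that the remainder is handed unchanged to whichever dialect parser is selected.

```javascript
// Hypothetical helper, not part of atom-template-json.
// Assumption: the default dialect is identified by the package's own name.
const DEFAULT_DIALECT = "pvienneau/atom-template-json";

// Split a template into { dialect, body }.
// Only leading lines starting with "#" are treated as header comments,
// so "{{#repeat 2}}" lines inside the JSON body are never touched.
function splitDialectHeader(source) {
  const lines = source.split(/\r?\n/);
  let dialect = DEFAULT_DIALECT;
  let i = 0;
  while (i < lines.length && /^\s*#/.test(lines[i])) {
    // Matches both "# template-dialect: x/y" and the
    // Emacs-style "# -*- template-dialect: x/y -*-" form.
    const match = lines[i].match(/template-dialect:\s*([^\s;]+)/);
    if (match) dialect = match[1];
    i++;
  }
  return { dialect, body: lines.slice(i).join("\n") };
}

const template = [
  "#!/usr/bin/env atom-template-json",
  "# template-dialect: webroo/dummy-json",
  '{ "id": {{@index}} }',
].join("\n");

const { dialect, body } = splitDialectHeader(template);
console.log(dialect); // the dialect named in the header
console.log(body);    // the template with header comments stripped
```

A template without any header falls through to the default dialect, which matches the "behaves as if different dialects don't exist" requirement.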

If the package included a command-line tool, template files could be made executable in Linux shells in conjunction with shebang notation, so that scripts with a heading like so...

#!/usr/bin/env atom-template-json 
# template-dialect: webroo/dummy-json

...could run from the command line like so:

$ ./user.template.json --count 2000 > small-sample-user-set.json

Examples

Example 'out-of-the-box' template format as it is today:

{
    "users": [
        {{repeat(5)}}
        {
            "id": {{id()}},
            "guid": {{guid()}},
            "description": {{string(50)}},
            "birth_year": {{random(1975, 2005)}},
            "date_created": {{timestamp()}}
        }
    ]
}

Example of support for a different dialect

# template-dialect: webroo/dummy-json
{
  "users": [
    {{#repeat 2}}
    {
      "id": {{@index}},
      "name": "{{firstName}} {{lastName}}",
      "work": "{{company}}",
      "email": "{{email}}",
      "dob": "{{date '1900' '2000' 'YYYY'}}",
      "address": "{{int 1 100}} {{street}}",
      "city": "{{city}}",
      "optedin": {{boolean}}
    }
    {{/repeat}}
  ],
  "images": [
    {{#repeat 3}}
    "img{{@index}}.png"
    {{/repeat}}
  ],
  "coordinates": {
    "x": {{float -50 50 '0.00'}},
    "y": {{float -25 25 '0.00'}}
  },
  "price": "${{int 0 99999 '0,0'}}"
}

Closing

I have some complex JSON data that I need to model and generate schemas and data for, and this package comes very close to being exactly what I need, but some of the features of the other data generator are attractive as well. Instead of borrowing/replacing features, perhaps without too much effort the package could support one or more other rendering formats as well (Dummy, Faker, Chance, Casual, RandExpJs). In any case, thank you for this already useful Atom package!

If this package were to offer alternate JSON generator functions, they would need to come pre-installed. My concern is that allowing a notation that defines which dialect to use would be very cumbersome, since every package may have its own implementation details when passing the raw input through, executing the transformation and receiving the output. Including a dropdown in the settings page to select which dialect to use would be a more stable choice in my opinion.

I'll mull over the idea of offering alternate dialects over the next week; I think it could be very beneficial.

The current dialect is very limiting, but I'm hoping to expand on it soon and improve its efficiency. Could you provide a list of some functionality that you would like to see included?

I apologize it took so long to respond. I am modeling some complex object schemas in Elasticsearch, and Elasticsearch allows you to index JSON objects that include nested JSON data.

Let's say I wanted to generate mock data to model an email object/document in an index for searching emails. Emails have things like message subject, body, participants (from/to/cc/bcc). I wanna generate data for bulk import to Elasticsearch, so some generated values like document id and first/last names might be used multiple times in the same document to fake an email address, or whatnot. I almost need some sort of context for each generation loop so that if I call the generator/helper method again, I get a cached value, or I have a way to call any generator and tell it to reuse the last generated value (unless it's the first call, in which case it generates a value).

Here is an example of one bulk entry needed to add a document to Elasticsearch (there should only be two line-delimited JSON objects, but I have pretty printed the second just so you can see it in non-compact form):

{ "index":{"_index":"cp", "_type":"products", "_id": "55638835"}}
{
  "Id": "55638835",
  "Subject": "Product testing for refab status?",
  "Teaser": "Hey Carl, I writing because I haven't received an updat...",
  "DateTime": "2017-03-07T18:44:53+00:00",
  "Participants": [
    {
      "UserId": "66524",
      "Email": "bob.smith@acme.com",
      "FirstName": "Bob",
      "LastName": "Smith",
      "FullName": "Bob Smith",
      "roles": [ "from" ] 
    },
    {
      "UserId": "77364",
      "Email": "carl.mcman@acme.com",
      "FirstName": "Carl",
      "LastName": "McMan",
      "FullName": "Carl McMan",
      "roles": [ "to" ] 
    },
    {
      "UserId": "12536",
      "Email": "judy.ferrel@acme.com",
      "FirstName": "Judy",
      "LastName": "Ferrel",
      "FullName": "Judy Ferrel",
      "roles": [ "cc" ] 
    }
  ]
}

Notice that in the above example the document Id is referenced in two places, and that first/last name is referenced at least 4 times each (for fields, for generation of the email address, for generation of the full name). The Dummy JSON template has limited support for this concept, but if you call a generator/helper twice it updates the value, so it wouldn't work for me unless there was a param I could use to indicate that the cached value should be used instead of generating a new one. Some things are related: in Dummy JSON, for instance, the email address helper uses information cached from the first/last name helpers.
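The kind of per-loop context being asked for could look something like the sketch below. Everything in it is hypothetical (`createContext`, the `fresh` option and the stand-in name generators are invented for illustration and belong to neither Dummy JSON nor this package); it only shows the "generate on first call, reuse afterwards unless told otherwise" behavior.

```javascript
// Hypothetical per-iteration context: each named value is generated on the
// first call and reused on later calls, unless { fresh: true } is passed.
function createContext() {
  const cache = new Map();
  return {
    value(name, generate, { fresh = false } = {}) {
      if (fresh || !cache.has(name)) cache.set(name, generate());
      return cache.get(name);
    },
  };
}

// Stand-in generators (assumptions for the example only).
const pick = (items) => items[Math.floor(Math.random() * items.length)];
const firstName = () => pick(["Bob", "Carl", "Judy"]);
const lastName = () => pick(["Smith", "McMan", "Ferrel"]);

// One context per generated participant, so every field of that
// participant sees the same first and last name.
function participant(role) {
  const ctx = createContext();
  const first = ctx.value("firstName", firstName);
  const last = ctx.value("lastName", lastName);
  return {
    FirstName: first,
    LastName: last,
    // Calling the helpers again returns the cached values, not new ones.
    FullName: `${ctx.value("firstName", firstName)} ${ctx.value("lastName", lastName)}`,
    Email: `${first}.${last}@acme.com`.toLowerCase(),
    roles: [role],
  };
}

console.log(JSON.stringify(participant("from")));
```

Creating a new context at the start of each repeat cycle resets the cache, so different participants still get different names.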

It's somewhat beyond the scope of this discussion, but I am also trying to model relationships, to demonstrate that a particular user that an email refers to can be searched. If I had a separate template that generated a list of random users, it would be fantastic if there was a way to tell the template to load a users.json file and randomly select one of those users to populate the participants list, or whatever list/loop.
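Outside of any template syntax, the "pick users from a previously generated list" idea can be sketched in a few lines of JavaScript. The helper name `sampleUsers` is hypothetical, and the inline `users` array stands in for whatever a users.json file would contain.

```javascript
// Hypothetical helper: choose `count` distinct entries from a list.
// `random` is injectable so the selection can be made repeatable.
function sampleUsers(users, count, random = Math.random) {
  const pool = users.slice(); // copy so the caller's list is not reordered
  const n = Math.min(count, pool.length);
  // Partial Fisher-Yates shuffle: only the first n slots are randomized.
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// In practice the list could be loaded from a file produced by another
// template, for example:
//   const users = JSON.parse(require("fs").readFileSync("users.json", "utf8"));
const users = [
  { UserId: "66524", FirstName: "Bob", LastName: "Smith" },
  { UserId: "77364", FirstName: "Carl", LastName: "McMan" },
  { UserId: "12536", FirstName: "Judy", LastName: "Ferrel" },
];

// Assign one role per selected user to build a Participants list.
const roles = ["from", "to"];
const participants = sampleUsers(users, roles.length).map((user, i) => ({
  ...user,
  roles: [roles[i]],
}));

console.log(JSON.stringify(participants, null, 2));
```

Because the same UserId values appear in both the user list and the generated emails, the relationship between the two data sets stays searchable.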

If templates themselves allowed for some code or for saving things to variables, that could be a workaround, but the template probably becomes less elegant. Like maybe at the beginning of a loop a data/context object could be populated and then used by reference in the rest of the template, I dunno.

My case is very specific; other people may not need anything as complex. For the moment I am using Dummy JSON and hand fixing some stuff, but it's hard due to the complexity of our business logic / use cases. To generate the mock data we want, we'd probably need to write a custom NodeJS script for our needs. I wanted to outline a specific use case anyway, though it sounds like a fair amount of work. This might be something that is punted until there are more requests.

For the most part, this is where I'd like to advance this package to: being able to generate certain related values between keys (first name and last name to email) between iteration loops. This is easy functionality to reconcile with the direction of this package, since it changes little to nothing about how these functions are used.

On the other hand, as you have mentioned, some of your needs are very specific and require very fine control over the values. Beyond possibly offering a moderate level of control over the actual values, this package isn't aimed at being as declarative about values and relationships as you'd need.

Maybe at that point a full JS script that builds an object and saves it to a file as JSON is a possible solution? You'd have better control over the values, sets and relationships that can be created. At the moment, this package isn't aimed at going that far; it's more aimed at letting you control the data structure, with its values being secondary. The current release achieves this, so I will be looking to add/review the value generation functions to implement some of the functional requirements you mention above. That would improve the overall experience of using the generated JSON structure, but it will unfortunately fall short of offering 100% control over the generated values.

Yeah, for the specific cases we want to cover I think we'll probably need to roll a domain-specific tool and maintain that. I still, of course, like the idea of a more general tool vs maintaining our own, but it's more likely we'll use something outside of Atom as part of the build cycle anyway vs having Atom be part of production builds. I have been in the hospital for the last week, all better now, so I'll be revisiting this issue soon. Primarily we have data in existing JSON formats that I need to transform to a different format, and that's normally one stage in a larger content transformation pipeline. Thank you for entertaining my request! I'm sure I will be thinking about JSON templating/mocking over the coming weeks, so I might pop back up, we'll see. :) Thank you for the time you spent mulling this! :) @sartian

I'm glad to hear that you are doing better. I have been starting to work on broadening the collection of generator functions that this plugin supports. I will give you a shout when I'm nearing a steady set of new functions as I'd be interested in getting your feedback. Most of them have come from our conversation above and from your code examples.

@sartian have a look at the colleciton-function-schema branch if you're interested, I've added a few new functions while trying to be more flexible as to how these functions can be used:

added firstName, lastName functions, alongside the fullName function
caching results for each ID so they can be reused within the same repeat cycle
new oneOf function, allowing you to push a collection of values (ints or strings) and it will randomly choose one
working on allowing return values of functions to be the inputs to other functions, work in progress.
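To make the list above a little more concrete, here is a small JavaScript sketch of what a `oneOf`-style function and the "return values as inputs" idea might look like. These are stand-ins written for illustration, not the code on the branch; the signatures, the injectable `random` parameter and the `email` helper are all assumptions.

```javascript
// Hypothetical stand-in for a oneOf function: pick one value at random
// from a collection of ints or strings.
function oneOf(values, random = Math.random) {
  if (!Array.isArray(values) || values.length === 0) {
    throw new Error("oneOf expects a non-empty array");
  }
  return values[Math.floor(random() * values.length)];
}

// Stand-in name generators built on top of oneOf.
const firstName = (random) => oneOf(["Bob", "Carl", "Judy"], random);
const lastName = (random) => oneOf(["Smith", "McMan", "Ferrel"], random);

// A function whose inputs are the return values of other functions.
const email = (first, last, domain) =>
  `${first}.${last}@${domain}`.toLowerCase();

// Nested calls: the outputs of firstName, lastName and oneOf feed email.
const address = email(firstName(), lastName(), oneOf(["acme.com"]));
console.log(address);
```

Passing a fixed `random` function (for example `() => 0`) makes the output repeatable, which can be handy when generated data is used in tests.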

I haven't had as much time as I would have liked over the last week, but I'm hoping I can move forward with enough new functions to warrant a 0.1 or 1.0 release soon. Hopefully this will address more of your requirements.

pvienneau / atom-template-json