andykais / json-querying-performance-testing

test performance of npm packages that can query json with a string
15 stars 6 forks source link

`sf-city-lots-json` #2

Open brettz9 opened 4 years ago

brettz9 commented 4 years ago

Hi,

A few things I came across when attempting to install (so I could run comparisons against different versions/PRs of our jsonpath-plus).

  1. ~Though you reference https://github.com/zemirco/sf-city-lots-json in the README, if you are not including it as dependency because of the file size, I might suggest giving instructions, or better yet, a Node build routine, so others can create the specific files created that your script expects, small-citylots.json, medium-citylots.json, and citylots.json, I only see one file in that repo.~ I see now this gets included when I load in Sourcetree, but on the CLI, do I need special arguments to get the LFS download to work?
  2. Were you planning to publish on npm? I would think it could be useful there. If not, maybe you want to set "private": true on package.json to communicate your intent. But I think it'd be useful to have the trust inherent in versioning apply to a testing framework.
  3. I'd suggest allowing testLibraries/performanceClasses, testMethods, and datasets, to be supplied dynamically through CLI argument. (Actually, it be particularly cool if you could define a bin so one could install your project globally (or as a devDep.) and run an executable which picked up the working directory and could use that dynamically as a querier (performance class), so there'd be no need to change in source the require path or the installed version of a given querier).

Anyways, let me know how I might be able to help, though I think I'd need guidance on #1 to be able to prepare a PR for #3.

andykais commented 4 years ago

hey @brettz9, thanks for the feedback. Let me address each in order:

  1. I believe git clone git@github.com:andykais/json-querying-performance-testing.git will also download the LFS files, that is how Ive always used this, but maybe youre using an older version of git?
  2. This was largely a repo used to show the benchmarks to a few of the json querying libraries for comparison (and hopefully encourage them to improve their speeds). I dont think its particularly useful on npm as it stands. Its a little too specific
  3. To address your third point, creating a generic command line utility is a tad tricky because each json query library varies slightly in its DSL. At best, I could create a CLI that takes in a config like so:
    {
    "expectedResult": ["berries", "apples"],
    "queriers": [
    {
      "package": "jsonata",
      "query": "food.fruits"
    },
    {
      "package": "jsonpath-plus",
      "query": "$.food.fruits"
    }
    ]
    }

    and is used this way:

    json-query-perf config.json dataset.json

do you think that would be useful?

brettz9 commented 4 years ago

# 1. I am on a slow connection, and maybe I had aborted or timed out, and was still left with the rest of the files. So, I think it is ok now.

# 3. Yes, I think that would be useful. However, FWIW, I am mostly interested in using your existing tests and being able to add json-querying-performance-testing to our devDependencies so I can just point our contributors to an npm script to allow them to see how their changes are impacting performance.

And sorry I am only offering ideas here rather than PRs, as I'm very tied up at the moment with a variety of other projects, but hoping we can add your helpful work to our pipeline.

andykais commented 4 years ago

Im glad the files are downloading now. In regards to its usage as an npm package, could you give a bit more context? I still am shaping how this repo is used, so I am interested in the use cases. Are the contributors in question working on a JSON querying library? That seems to be the only reason someone would want to continuously run this library.

I am starting to think about this library as a sort of 'test runner' but perhaps because of how specific the use cases for these libraries are, it is worth having a list of curated tests that is maintained in the repo. Maybe we could reach a middle ground. I could keep the cli usage described above, but by default it will run a builtin config against the existing datasets (which would also be included in the package)

brettz9 commented 4 years ago

As far as the middle road, maybe there could be a separate repo for the test runner and let this be refactored to use that, thereby not making the test runner larger while still avoiding duplication?

As far as context for how we're using or would like to use, I'd just like to be able to add something like this in our package.json:

{
    "scripts": {
        "check-json-performance": "json-querying-performance-testing",
        "test": "mocha && npm run check-json-performance"
    }
}

Then whenever we ran npm test, we'd see how well our library was currently doing against all the tests, e.g., after trying a new branch of our jsonpath-plus.

And if your CLI accepted arguments, we could pass them on to the npm script like this while within our jsonpath-plus folder:

npm run check-json-performance -- --datasets=small-citylots.json --testMethods=shallow,conditional --testLibraries=copy-of-JsonPathPlusCity-querier-but-requiring-our-own-current-local-library.js

...or we could create our own npm scripts with the arguments built-in

{
    "scripts": {
        "check-json-performance": "json-querying-performance-testing",
        "perf-small-set": "npm run check-json-performance -- --datasets=small-citylots.json --testMethods=shallow,conditional --testLibraries=copy-of-JsonPathPlusCity-querier-but-requiring-our-own-current-local-library.js"
    }
}

...then just run npm run perf-small-set.