lucagez / slow-json-stringify

The slowest stringifier in the known universe. Just kidding, it's the fastest (:
https://github.com/lucagez/slow-json-stringify
MIT License
468 stars 17 forks

Support for top level arrays #18

Open markmcdowell opened 3 years ago

markmcdowell commented 3 years ago

Hi, great library, nice performance.

Does it support stringifying a top level array?

I can't see a way to define it in the schema; everything seems to be an object. The equivalent in JSON Schema would be something like:

{
    "type": "array",
    "items": {
        "properties": {
            "name": { "type": "string" },
            "lastname": { "type": "string" }
        },
        "required": ["name", "lastname"]
    }
}

Thanks!

lucagez commented 3 years ago

Hi @markmcdowell 👋 Arrays are not supported at the top level in the core functionality of sjs. This is a design decision, as all the computations performed by sjs happen in memory. Serializing arrays usually happens when dealing with lots of data, so this somewhat prevents misusing the library and consuming way too much memory.

Anyway, serializing arrays implies working with the same structure many times. We can therefore decompose the problem into smaller pieces.

I threw together a small Node server with an example of serializing a huge (top-level) array using sjs and a custom readable stream, comparing that approach with the in-memory variant.

const http = require('http');
const { sjs, attr } = require('slow-json-stringify');
const { Readable } = require('stream');

const serializer = sjs({
  hello: attr('string'),
});

class ReadableSjs extends Readable {
  constructor(num) {
    super();
    this.counter = 0;
    this.num = num;
  }

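  _read() {
    // Open the array before the first item
    if (this.counter === 0) {
      this.push(Buffer.from('['));
    }

    // Stop after exactly `num` items: close the array and end the stream
    if (this.counter >= this.num) {
      this.push(Buffer.from(']'));
      this.push(null);
      return;
    }

    /**
     * Push a serialized chunk
     */
    this.push(Buffer.from(serializer({ hello: 'world' })));

    // Separate items with a comma, except after the last one
    if (this.counter < this.num - 1) {
      this.push(Buffer.from(','));
    }

    this.counter++;
  }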
}

const server = http.createServer((req, res) => {

  /**
   * This is going to crash the default allocated heap
   */
  if (req.url === '/in-memory') {
    const memoryHog = Array(1e9)
      .fill(0)
      .map(() => serializer({ hello: 'world' }))
      .join(',');

    res.end('[' + memoryHog + ']');
    return;
  }

  if (req.url === '/stream') {
    const readable = new ReadableSjs(1e9);

    readable.pipe(res);
    return;
  }

  res.statusCode = 404;
  res.end('Not Found');
});

server.listen(4000);

Could this example already fulfill your use case? Do you think the library could benefit from such functionality?

Btw, it could be added as a separate utility.
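
As a rough idea of what such a separate utility might look like, here is a minimal sketch. The names `jsonArrayChunks` and `toJsonArrayStream` are hypothetical and not part of sjs. `serializeItem` stands for any per-item serializer (for example one built with `sjs`); `JSON.stringify` is used below only as a stand-in so the snippet runs on its own.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of sjs): lazily yields the pieces of a
// JSON array, serializing one item at a time.
function* jsonArrayChunks(items, serializeItem) {
  yield '[';
  let first = true;
  for (const item of items) {
    // Separate items with commas, but never emit a trailing comma.
    if (!first) yield ',';
    first = false;
    yield serializeItem(item);
  }
  yield ']';
}

// Hypothetical wrapper: exposes the chunks as a readable stream that can be
// piped to an HTTP response, e.g. toJsonArrayStream(rows, serializer).pipe(res).
function toJsonArrayStream(items, serializeItem) {
  return Readable.from(jsonArrayChunks(items, serializeItem));
}

// Stand-in serializer; a serializer created with sjs() could be passed instead.
const serializeItem = (item) => JSON.stringify(item);

const chunks = [...jsonArrayChunks([{ hello: 'world' }, { hello: 'sjs' }], serializeItem)];
console.log(chunks.join('')); // [{"hello":"world"},{"hello":"sjs"}]
```

Because the generator accepts any iterable, the same helper would work for an array that is already in memory and for items produced lazily.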

markmcdowell commented 3 years ago

Thanks! That's really interesting, let me do some tests with real-world data.

lucagez commented 3 years ago

> Thanks! That's really interesting, let me do some tests with real-world data.

@markmcdowell any luck in testing?

markmcdowell commented 3 years ago

Hi, sorry, yes, I did do some testing, but a large part of our codebase uses a database connector that returns whole arrays and doesn't support streaming. So unfortunately our case is closer to the memory-hog example above, and the performance is poor compared with JSON.stringify.
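
For the non-streaming case described here, a small harness along these lines could be used to compare the two approaches on real data. It is only a sketch: `stringifyRows` and `timeMs` are hypothetical names, `rows` stands in for whatever the database connector returns, and the per-row serializer is a plain function so the snippet runs without sjs installed. No timings are implied; results will depend on the data and the Node version.

```javascript
// Hypothetical helper: serialize an in-memory array row by row and join once.
function stringifyRows(rows, serializeRow) {
  const parts = new Array(rows.length);
  for (let i = 0; i < rows.length; i++) {
    parts[i] = serializeRow(rows[i]);
  }
  return '[' + parts.join(',') + ']';
}

// Hypothetical timing helper: runs fn once and returns elapsed milliseconds.
function timeMs(fn) {
  const start = process.hrtime.bigint();
  fn();
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Stand-in data; in practice this would be the array returned by the connector.
const rows = Array.from({ length: 100000 }, (_, i) => ({
  name: 'n' + i,
  lastname: 'l' + i,
}));

// Stand-in per-row serializer; swap in a serializer built with sjs() to compare.
const serializeRow = (row) =>
  '{"name":' + JSON.stringify(row.name) +
  ',"lastname":' + JSON.stringify(row.lastname) + '}';

// Sanity check first: both approaches should produce identical output.
console.log('outputs match:', stringifyRows(rows, serializeRow) === JSON.stringify(rows));

console.log('per-row + join (ms):', timeMs(() => stringifyRows(rows, serializeRow)));
console.log('JSON.stringify (ms):', timeMs(() => JSON.stringify(rows)));
```

Running each variant several times and discarding the first runs would give steadier numbers, since a single measurement includes JIT warm-up.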