schmittjoh / serializer

Library for (de-)serializing data of any complexity (supports JSON, and XML)
http://jmsyst.com/libs/serializer
MIT License

PoC - optimise memory usage for json serialisation. #1412

Closed scyzoryck closed 2 years ago

scyzoryck commented 2 years ago
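+---------------+------+
| Q             | A    |
+---------------+------+
| Bug fix?      | no   |
| New feature?  | no   |
| Doc updated   | no   |
| BC breaks?    | no   |
| Deprecations? | no   |
| Tests pass?   | no   |
| Fixed tickets | #... |
| License       | MIT  |
+---------------+------+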

Current issue:

JSON serialisation uses roughly three times as much peak memory as XML serialisation:

+--------------------------------+--------------------+-----+------+-----+----------+--------+--------+
| benchmark                      | subject            | set | revs | its | mem_peak | mode   | rstdev |
+--------------------------------+--------------------+-----+------+-----+----------+--------+--------+
| JsonSerializationBench         | benchSerialization |     | 1    | 3   | 45.634mb | 4.692s | ±0.87% |
| XmlSerializationBench          | benchSerialization |     | 1    | 3   | 15.898mb | 7.0s   | ±1.23% |
| JsonMaxDepthSerializationBench | benchSerialization |     | 1    | 3   | 45.644mb | 5.578s | ±0.52% |
+--------------------------------+--------------------+-----+------+-----+----------+--------+--------+

Currently the data is first transformed into an array structure, which is then passed to json_encode, so the same data has to exist in three places at once (the original object, the array structure, and the JSON string).
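
To illustrate the flow described above, here is a minimal standalone sketch. It is not the library's code: the `Item` class and the conversion closure are hypothetical stand-ins, and only `json_encode`, `array_map` and `memory_get_peak_usage` are real PHP functions.

```php
<?php
declare(strict_types=1);

// Hypothetical object type standing in for the data being serialised.
final class Item
{
    public function __construct(public int $id, public string $name) {}
}

// Copy 1: the original objects.
$objects = [];
for ($i = 0; $i < 1000; $i++) {
    $objects[] = new Item($i, "item-$i");
}

// Copy 2: the whole graph converted to a plain array structure up front.
$asArray = array_map(
    static fn (Item $o): array => ['id' => $o->id, 'name' => $o->name],
    $objects
);

// Copy 3: the JSON string, built while $objects and $asArray are still alive.
$json = json_encode($asArray, JSON_THROW_ON_ERROR);

printf("peak memory: %.2f MB\n", memory_get_peak_usage() / 1048576);
```

All three representations are live at the point where the peak is measured, which is the pattern the description refers to.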

Possible fix:

We can try to limit the amount of data held in the array by using JsonSerializable, which provides a convenient way to prepare data in small batches. Possible cut points are objects or arrays; I've used arrays as that was simpler for the PoC.
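
A rough sketch of the general idea, not the PoC's actual code: `Item`, `LazyItem` and `LazyList` are hypothetical names. It relies only on the documented behaviour that `json_encode` calls `jsonSerialize()` on a `JsonSerializable` object when it reaches that object, so each node's array can be built just before it is encoded and released afterwards.

```php
<?php
declare(strict_types=1);

// Hypothetical domain object (not part of jms/serializer).
final class Item
{
    public function __construct(public int $id, public string $name) {}
}

// Hypothetical wrapper: builds the array for one object only on demand.
final class LazyItem implements JsonSerializable
{
    public function __construct(private Item $item) {}

    public function jsonSerialize(): mixed
    {
        return ['id' => $this->item->id, 'name' => $this->item->name];
    }
}

// Hypothetical wrapper for a list: the cut point is the array, so each
// element is wrapped instead of being converted to an array up front.
final class LazyList implements JsonSerializable
{
    /** @param Item[] $items */
    public function __construct(private array $items) {}

    public function jsonSerialize(): mixed
    {
        return array_map(
            static fn (Item $i): LazyItem => new LazyItem($i),
            $this->items
        );
    }
}

$json = json_encode(
    new LazyList([new Item(1, 'a'), new Item(2, 'b')]),
    JSON_THROW_ON_ERROR
);
echo $json, PHP_EOL; // [{"id":1,"name":"a"},{"id":2,"name":"b"}]
```

Whether this actually lowers peak memory in a given setup would need to be measured; the sketch only shows where the deferred conversion happens.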

Results: peak memory went down from 45 MB to 13 MB, and execution time was not negatively affected.

+--------------------------------+--------------------+-----+------+-----+----------+--------+--------+
| benchmark                      | subject            | set | revs | its | mem_peak | mode   | rstdev |
+--------------------------------+--------------------+-----+------+-----+----------+--------+--------+
| JsonSerializationBench         | benchSerialization |     | 1    | 3   | 13.002mb | 3.735s | ±0.95% |
| XmlSerializationBench          | benchSerialization |     | 1    | 3   | 15.898mb | 5.555s | ±0.48% |
| JsonMaxDepthSerializationBench | benchSerialization |     | 1    | 3   | 13.012mb | 4.336s | ±0.61% |
+--------------------------------+--------------------+-----+------+-----+----------+--------+--------+

Issues that need to be solved:

But, somewhat unexpectedly, the rest seems to be working nicely. Can you see any potential pain points with such a solution? Do you think it is worth investing more time to improve it?

Best, scyzoryck

scyzoryck commented 2 years ago

After further investigation, it looks like the issue is more related to the event dispatcher, which is leaking memory. Closing this PR.