schmittjoh / serializer

Library for (de-)serializing data of any complexity (supports JSON, and XML)
http://jmsyst.com/libs/serializer
MIT License
2.32k stars 588 forks source link

Add tests that will show memory leaks. #1431

Closed scyzoryck closed 1 year ago

scyzoryck commented 2 years ago
Q A
Bug fix? no
New feature? no
Doc updated no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets Related to: https://github.com/schmittjoh/JMSSerializerBundle/issues/895
License MIT

Background:

First step to fix the issues is to make it visible - so let's add some tests that will visualize the issue.

As we can see in results - after 10000 iterations memory usage is rising from 1.524mb to :

Source of the issue:

PHP is not handling well circular dependencies between classes - it requires Garbage Collector the cleanup them. In result some objects are not cleaned up after serialization what might be source of the issue for long running processes that are using this library (for example in Consumers).

classDiagram
class Context {
    +VisitorInterface $visitor
    +GraphNavigatorInterface $navigator
}

class GraphNavigator {
    +VisitorInterface $visitor
    +Context $context
}

class AbstractVisitor {
    +GraphNavigatorInterface $navigator
}

GraphNavigator ..> Context
AbstractVisitor ..> Context

AbstractVisitor ..> GraphNavigator
Context ..> GraphNavigator

GraphNavigator ..> AbstractVisitor

Expected result

Memory usage should not rise when running serialization multiple time. I was able to achieve it with some PoC - however it needs some work to avoid BC breaks.

goetas commented 2 years ago

what happens if you call gc_collect_cycles? does the memory go down to the expected value?

scyzoryck commented 2 years ago

Run GC in each cycle

Code

for ($i = 0; $i <= $this->iterations; $i++) {
    $this->serializer->serialize($this->collection, $this->format, $this->createContext());
    gc_collect_cycles();
}

Result

+----------------------+----------+
| benchmark            | mem_peak |
+----------------------+----------+
| JsonMultipleRunBench | 1.524mb  |
| JsonSingleRunBench   | 1.524mb  |
+----------------------+----------+

Run GC at the end

Code

for ($i = 0; $i <= $this->iterations; $i++) {
    $this->serializer->serialize($this->collection, $this->format, $this->createContext());
}
gc_collect_cycles();

Result

+----------------------+----------+
| benchmark            | mem_peak |
+----------------------+----------+
| JsonMultipleRunBench | 21.114mb |
| JsonSingleRunBench   | 1.524mb  |
+----------------------+----------+
scyzoryck commented 1 year ago

Thanks @simPod for feedback. Fixed.

goetas commented 1 year ago

I understand the issue here... thanks for working on it... the issue is that the context has the graph navigator instance and vice versa, creating a circular reference... and it does not get collected until the PHP's garbage collector kick's in... but the PHP garbage collector is layz and it waits a lot before starting to do its job.

Would it be an option to unset the context from the graph navigator at the end of the serialization process? (or vice versa, unsetting it from the context?)

scyzoryck commented 1 year ago

Thanks for feedback. If I recall - I tried it, but didn't work. I will check it in upcoming days.

scyzoryck commented 1 year ago

@goetas you were right. :) I pushed branch with unsetting properties. It looks like we can improve it without BC! :)

+----------------------+----------+
| benchmark            | mem_peak |
+----------------------+----------+
| JsonMultipleRunBench | 3.525mb  |
| JsonSingleRunBench   | 3.523mb  |
+----------------------+----------+