schmittjoh / serializer

Library for (de-)serializing data of any complexity (supports JSON, and XML)
http://jmsyst.com/libs/serializer
MIT License

Adding tideways spans inside JMS serializer #1524

Closed Aliance closed 7 months ago

Aliance commented 8 months ago
Q A
Bug report? no
Feature request? no
BC Break report? no
RFC? no

The problem

Hi @goetas and everyone! I am using JMS serializer (via the Symfony bundle) and have a problem with serializing very large data. I profiled it with Tideways, and it shows that the problem in my case is inside JMS: [Screenshot 2023-11-02 at 5 12 14 PM]

But when I look at the profile itself, it does not automatically create spans for all the internal usages of JMS libraries and so on, so the timeline looks like this:

[Screenshot 2023-11-03 at 11 13 04 AM]

So without the callgraph I cannot quickly tell that the problem is on the serialization side.

One suitable option for that is \Tideways\Profiler::watch: https://support.tideways.com/documentation/reference/php-extension/custom-timespans.html#watching-function-calls

I am trying to use this code:

if (class_exists('Tideways\Profiler')) {
    \Tideways\Profiler::watch('\JMS\Serializer\Serializer::serialize');
}

But it does not work for me.


Question

Is there any way to create a wrapper that explicitly calls Tideways profiling? Do you have any advice for me?


PS: I am also taking a look at No. 5 in https://tideways.com/profiler/blog/5-ways-to-optimize-symfony-baseline-performance

mbabker commented 8 months ago

Not a Tideways user, but...

Does Tideways profile any other vendor code, or is it just this library you're not seeing in your time graph?

Have you tried decorating the serializer service to add your Tideways profiler calls in that way?

Are you using the metadata cache? That'll keep the serializer from reading class metadata (e.g. parsing annotations or reading XML files) at runtime.

Are you using any strategies to limit the data that's being returned? I'm going to assume your getStatisticsAction() is returning a collection of objects. Is it filtered in any way (e.g. pagination)? Do the objects being serialized have nested properties that might lead to recursive serialization in different paths (e.g. two statistics have a relation to the same entity, and that related entity is being serialized as part of both statistics)? Without getting into profiling the serializer itself, it may be helpful to look at your own application configuration and try to limit how much data is being serialized as a potential improvement.

Aliance commented 8 months ago

> Does Tideways profile any other vendor code, or is it just this library you're not seeing in your time graph?

It is a PHP extension/module that profiles the whole application.

> Have you tried decorating the serializer service to add your Tideways profiler calls in that way?

I wanted to try it but have not done so yet. Do you have any suggestions on how it should look without a lot of changes in the legacy code?

> Are you using any strategies to limit the data that's being returned? I'm going to assume your getStatisticsAction() is returning a collection of objects. Is it filtered in any way (e.g. pagination)? Do the objects being serialized have nested properties that might lead to recursive serialization in different paths (e.g. two statistics have a relation to the same entity, and that related entity is being serialized as part of both statistics)? Without getting into profiling the serializer itself, it may be helpful to look at your own application configuration and try to limit how much data is being serialized as a potential improvement.

I am trying to serialize a big JSON document with only one level of nesting. Yes, it's big. No, it does not have any pagination, for various reasons. For now, it is what it is.

mbabker commented 8 months ago

> > Does Tideways profile any other vendor code, or is it just this library you're not seeing in your time graph?

> It is a PHP extension/module that profiles the whole application.

I meant: is Tideways showing you call graphs, time spans, or memory use from calls to other vendor code in your application, or is it just the serializer calls that are excluded?

> > Have you tried decorating the serializer service to add your Tideways profiler calls in that way?

> I wanted to try it but have not done so yet. Do you have any suggestions on how it should look without a lot of changes in the legacy code?

You could do it with a compiler pass:

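<?php
use App\Serializer\TidewaysSerializer;
use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Reference;

final class DecorateJmsSerializerPass implements CompilerPassInterface
{
    public function process(ContainerBuilder $container): void
    {
        if ($container->hasDefinition('jms_serializer.serializer')) {
            // Wrap the bundle's serializer service with the Tideways decorator.
            $container->register('jms_serializer.serializer.tideways_decorator', TidewaysSerializer::class)
                ->setDecoratedService('jms_serializer.serializer')
                // Pass the original service ("<decorator id>.inner") to the constructor.
                ->addArgument(new Reference('jms_serializer.serializer.tideways_decorator.inner'));
        }
    }
}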

(Might need to double check this against your Symfony version, but the gist of it should be valid)

Then the decorator looks something like this:

<?php
namespace App\Serializer;
use JMS\Serializer\ArrayTransformerInterface;
use JMS\Serializer\DeserializationContext;
use JMS\Serializer\SerializationContext;
use JMS\Serializer\SerializerInterface;

final class TidewaysSerializer implements SerializerInterface, ArrayTransformerInterface
{
    /** @var SerializerInterface&ArrayTransformerInterface */
    private $decoratedSerializer;

    public function __construct($decoratedSerializer)
    {
        $this->decoratedSerializer = $decoratedSerializer;
    }

    public function serialize($data, string $format, ?SerializationContext $context = null, ?string $type = null): string
    {
        // Tideways stuff first
        return $this->decoratedSerializer->serialize($data, $format, $context, $type);
    }

    public function deserialize(string $data, string $type, string $format, ?DeserializationContext $context = null)
    {
        // Tideways stuff first
        return $this->decoratedSerializer->deserialize($data, $type, $format, $context);
    }

    public function toArray($data, ?SerializationContext $context = null, ?string $type = null): array
    {
        // Tideways stuff first
        return $this->decoratedSerializer->toArray($data, $context, $type);
    }

    public function fromArray(array $data, string $type, ?DeserializationContext $context = null)
    {
        // Tideways stuff first, then delegate to the matching fromArray() call
        return $this->decoratedSerializer->fromArray($data, $type, $context);
    }
}

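
The "Tideways stuff" placeholders in the decorator could be filled in with a small helper along the following lines. This is only a sketch under assumptions: `withTidewaysSpan()` is a hypothetical name, and the `createSpan()` / `annotate()` / `finish()` calls are taken from the Tideways custom timespans documentation linked in the first comment, so they should be checked against the installed extension version. When the extension is not loaded, the helper simply calls through.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of JMS or Tideways): runs $work inside a
 * Tideways span when the extension is loaded, otherwise just runs $work.
 *
 * @return mixed whatever $work returns
 */
function withTidewaysSpan(string $category, string $title, callable $work)
{
    // Without the extension there is nothing to record, so call through.
    if (!class_exists('Tideways\Profiler')) {
        return $work();
    }

    // Assumed API from the Tideways custom timespans docs:
    // createSpan() returns a span object offering annotate() and finish().
    $span = \Tideways\Profiler::createSpan($category);
    $span->annotate(['title' => $title]);

    try {
        return $work();
    } finally {
        // Close the span even when the wrapped call throws.
        $span->finish();
    }
}

// Standalone demonstration; any callable can be wrapped.
echo withTidewaysSpan('demo', 'json_encode', function () {
    return json_encode(['a' => 1]);
}), "\n";
```

Inside the decorator, `serialize()` could then return `withTidewaysSpan('jms', 'serialize', function () use ($data, $format, $context, $type) { return $this->decoratedSerializer->serialize($data, $format, $context, $type); })`, and the other three methods would follow the same pattern.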
scyzoryck commented 8 months ago

You can also use the event system to add something for each object that is serialized. See: https://github.com/schmittjoh/serializer/blob/master/doc/event_system.rst
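
As a rough illustration of the event-based idea, a listener could count how many objects of each class pass through the serializer, which helps show where the volume comes from. The `SerializedObjectCounter` class below is hypothetical and independent of JMS. The wiring shown in the comment assumes the `serializer.pre_serialize` event and the event object's `getObject()` method as described in the event system document, so treat it as a sketch to verify rather than a drop-in listener.

```php
<?php
declare(strict_types=1);

/** Hypothetical collector: counts serialized objects per class. */
final class SerializedObjectCounter
{
    /** @var array<string, int> number of objects seen, keyed by class name */
    private $counts = [];

    public function record(object $object): void
    {
        $class = get_class($object);
        $this->counts[$class] = ($this->counts[$class] ?? 0) + 1;
    }

    /** @return array<string, int> counts sorted from most to least frequent */
    public function counts(): array
    {
        $sorted = $this->counts;
        arsort($sorted);

        return $sorted;
    }
}

$counter = new SerializedObjectCounter();

// Assumed wiring with the JMS event dispatcher (verify against the docs):
// $dispatcher->addListener('serializer.pre_serialize', function ($event) use ($counter) {
//     $counter->record($event->getObject());
// });

// Standalone demonstration with plain PHP objects.
$counter->record(new ArrayObject());
$counter->record(new ArrayObject());
$counter->record(new stdClass());
print_r($counter->counts());
```

Logging `counts()` at the end of a request, or attaching it to a profiler span as an annotation, would show which classes dominate a large payload.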

If you are talking about legacy systems, what version of JMS are you using? There is a huge performance improvement between v1 and v2, with a few breaking changes. v3 also included some fixes for memory leaks. As for other improvements, writing custom handlers for your objects might improve performance with minimal effort.

Best, scyzoryck.

Aliance commented 8 months ago

I am using jms/serializer version 3.18.2 with jms/serializer-bundle version 3.10.0

For now, I am satisfied with the serializer decorator from @mbabker's comment, thanks a lot. I will look into it further and gather some statistics.