protobuf-net originally supports only one-dimensional arrays:
http://stackoverflow.com/questions/16174558/how-do-you-serialize-deserialize-jagged-nested-arrays-with-protobuf-net
http://stackoverflow.com/questions/22023405/protobuf-with-multidimensional-array
http://stackoverflow.com/questions/4090173/using-protobuf-net-how-to-deserialize-a-multi-dimensional-array
I didn't change this in AqlaSerializer.
Your example code should even throw a NotSupportedException; I found this test in the project:
    [Test, ExpectedException(typeof(NotSupportedException))]
    public void TestMultidimArray()
    {
        MultiDim md = new MultiDim { Values = new int[1, 2] { { 3, 4 } } };
        Serializer.DeepClone(md);
    }
First of all, thanks a lot for your very fast reply again, Vladyslav!
OK, I would be fine with the option of using short[][], but when I tried to serialize that, I got this exception: "System.NotSupportedException: Nested or jagged lists and arrays are not supported".
It would be really great if AqlaSerializer could support that! Or can I easily add support for that myself with a "surrogate" object?
Thanks for your great support again,
Andreas
I checked the code. Array handling is done not for types but for members - a strange decision. It seems quite difficult to add support for nested lists/arrays, and it is especially hard with the amount of "legacy" code I would have to dig through. I don't think I have time for such a big piece of work right now. But you can always try to do it yourself and then submit a PR.
Workarounds:
You can make your nested/jagged array a member of a wrapper class and then add a surrogate for it (I don't think you can add a surrogate for a nested/jagged array directly because it's not registered as a type). Then, in the surrogate, you can convert your array to a dictionary, flatten it somehow, or serialize it to byte[] manually - see the sketch below.
Please look at the questions on SO which I linked above. There are some other workarounds there.
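For illustration, a minimal sketch of the wrapper + surrogate idea (all names are hypothetical, and the registration call assumes the protobuf-net-style RuntimeTypeModel API that AqlaSerializer was forked from):

    [SerializableType]
    public class GridWrapper
    {
        // surrogates are registered per type, so the 2D array is wrapped in a class
        public short[,] Values;
    }

    [SerializableType]
    public class GridSurrogate
    {
        [SerializableMember(1)] public int Rows;
        [SerializableMember(2)] public int Cols;
        [SerializableMember(3)] public byte[] Data; // row-major, 2 bytes per element

        public static implicit operator GridSurrogate(GridWrapper w)
        {
            if (w == null) return null;
            var s = new GridSurrogate
            {
                Rows = w.Values.GetLength(0),
                Cols = w.Values.GetLength(1),
                Data = new byte[w.Values.Length * sizeof(short)]
            };
            Buffer.BlockCopy(w.Values, 0, s.Data, 0, s.Data.Length); // flatten
            return s;
        }

        public static implicit operator GridWrapper(GridSurrogate s)
        {
            if (s == null) return null;
            var w = new GridWrapper { Values = new short[s.Rows, s.Cols] };
            Buffer.BlockCopy(s.Data, 0, w.Values, 0, s.Data.Length); // restore
            return w;
        }
    }

    // register once at startup:
    // RuntimeTypeModel.Default[typeof(GridWrapper)].SetSurrogate(typeof(GridSurrogate));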
OK, thank you! The workaround with a surrogate sounds like a good approach, as I can just serialize/deserialize the array "manually" (using a binary reader/writer) to a byte array very easily.
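For illustration, that manual round-trip could look roughly like this (a sketch, not the actual code; Buffer.BlockCopy would be faster than per-element writes):

    static byte[] ToBytes(short[,] values)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write(values.GetLength(0)); // rows
            bw.Write(values.GetLength(1)); // columns
            foreach (short v in values) bw.Write(v); // row-major order
            return ms.ToArray();
        }
    }

    static short[,] FromBytes(byte[] data)
    {
        using (var br = new BinaryReader(new MemoryStream(data)))
        {
            var result = new short[br.ReadInt32(), br.ReadInt32()];
            // the two Int32 dimensions occupy the first 8 bytes
            Buffer.BlockCopy(data, 8, result, 0, result.Length * sizeof(short));
            return result;
        }
    }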
I tried that now, but got an "OutOfMemoryException" while serializing the resulting byte array (in my case about 300 MB). The cause is the method "ResizeAndFlushLeft" in the "BufferPool" class, specifically these lines:

    int newLength = buffer.Length * 2;
    if (newLength < toFitAtLeastBytes) newLength = toFitAtLeastBytes;
    byte[] newBuffer = new byte[newLength];
First of all, I don't exactly understand why the byte array needs to be copied into another byte array at all; but even if that's needed, why is the new buffer allocated at double the required size?
That's just too much memory usage and results in an "OutOfMemoryException" on the last of the lines above in my test.
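For illustration, a capped-growth strategy like the following sketch (not AqlaSerializer's actual fix) would keep appends cheap while avoiding the worst of the overshoot - doubling gives amortized O(1) appends, but near the 32-bit address-space limit the 2x overshoot is fatal:

    static int GrowLength(int currentLength, int toFitAtLeastBytes)
    {
        const int capThreshold = 8 * 1024 * 1024; // assumption: stop doubling above 8 MB
        long newLength = currentLength < capThreshold
            ? (long)currentLength * 2            // classic doubling for small buffers
            : currentLength + currentLength / 4; // grow by only 25% once buffers are big
        if (newLength < toFitAtLeastBytes) newLength = toFitAtLeastBytes;
        return (int)Math.Min(newLength, int.MaxValue);
    }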
Best regards,
Andreas
Are you trying to use ProtoWriter manually from your code? It's a really low-level class; there could be some protobuf-specific hacks in it. I would just use a MemoryStream with BinaryWriter/BinaryReader for that.
Or did I understand you wrong? You pass a big byte array and AqlaSerializer can't deal with it? What's the size of your array?
No, I would never try that - I'm just using a normal MemoryStream with BinaryWriter/BinaryReader, as you wrote.
But when AqlaSerializer tries to serialize the quite big byte array of about 300 MB (precisely 311,097,602 bytes), it crashes at the location I described above, where the buffer is initialized with double the size (622,195,204 bytes) of the byte array being serialized.
I just tried it: it works when I compile for 64 bit, but it needs to be a 32-bit process for some other compatibility reasons. Everything should work fine as long as AqlaSerializer does not build buffers that are bigger (double the size) than needed. ;-)
So I fixed it (I mean the byte[] growing), but there is still quite big memory consumption on reading. I had to use a FileStream instead of a MemoryStream to make reading work in my test. Do you need the release ASAP, or can it wait to be combined with some other changes?
Thanks a lot for this very fast fix!
No problem, no quick release needed - I can easily compile it myself, so you can "officially" include it with some other changes later.
Thanks again a lot for your great support!
@ab-tools, I've just committed an improvement which may be helpful especially in your case. It delegates all the buffering to the stream side (so no buffer of its own), which means it can now write directly to a FileStream without consuming memory! What's important is that the writing stream has to support seeking and reading too.
I'll see later if there is something in reading that I can optimize the same way.
https://github.com/AqlaSolutions/AqlaSerializer/commit/89d1cad84cbf7e2b8532b4c533f4659f4a458268
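For illustration, usage could look roughly like this (a sketch; the destination stream is opened as ReadWrite because the new mode needs to seek back over and read what it has written, presumably to patch length prefixes in place):

    using (var fs = new FileStream("data.bin", FileMode.Create, FileAccess.ReadWrite))
    {
        Serializer.Serialize(fs, obj); // writes directly to the file, no big in-memory buffer
    }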
That's great, thanks a lot!
Optimizing the reading side in the same way would be cool too, because reading is needed more often than writing, so any performance improvement there is very welcome. ;-)
Your latest source code version now works without problems with the big byte array, but unfortunately it has an extreme performance impact on "normal" data serialization: for another, much smaller (in terms of resulting file size) object where I use AqlaSerializer in the same project, I saw a really extreme increase in serialization time with the latest version.
The object I'm serializing is "only" around 65 MB in resulting file size, but includes probably around 1 million object instances. With release version 1.0.0.818 of AqlaSerializer it took about 30 sec. to serialize, which is totally fine, but with the latest source code version it takes almost forever (I stopped it after around 5 min.)!
So your stream changes obviously have a very strong negative impact on serializing a large number of object instances. Deserialization is still as fast as before, but serialization is almost unusable now.
... just made a few more tests, and personally I think the constant flushing of the resulting serialized stream in the new version is the reason for the extreme performance impact.
This makes total sense when serializing just a few very big object instances, but is really bad for a high number of small object instances.
So maybe it would make sense to make this behavior optional via a settings flag? What do you think?
@ab-tools I think your performance issues are caused not only by the flushing. When the serializer uses your disk stream directly, in some cases it has to read and rewrite data multiple times, which is very slow. The point here is that you want either best speed or minimal memory usage. When you have to write a lot of data in one structure or nested object (like that 1300 MB blob) in a 32-bit app, it may be impossible to use memory buffering at all.
Check the last version. I added a new option, TypeModel.ForceInMemoryBuffer (default true, the legacy behavior), which forces the serializer to use a memory buffer even when it could seek and read data from the stream.
I also removed unnecessary Flush calls.
Please check both settings, ForceInMemoryBuffer = true and false, and see if there are any improvements with "false" (even if you end up staying with "true").
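For illustration, the comparison could look like this (a sketch; only the property name comes from the comment above, the model creation and Serialize call are assumed to follow the protobuf-net-style API):

    var model = TypeModel.Create();

    model.ForceInMemoryBuffer = true;  // legacy behavior: buffer everything in memory
    using (var fs = new FileStream("a.bin", FileMode.Create, FileAccess.ReadWrite))
        model.Serialize(fs, data);

    model.ForceInMemoryBuffer = false; // buffer on the stream side, saves memory
    using (var fs = new FileStream("b.bin", FileMode.Create, FileAccess.ReadWrite))
        model.Serialize(fs, data);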
Great, thanks a lot for the quick fix - I will give it a try and let you know then!
Just gave it a try, and with "false" there is not much performance gain - maybe a bit, but it's still so bad that it's not usable (I canceled it again as it would have taken too long to wait).
Then I tested with "true", and that is much better again, but I still see slightly worse performance than with the latest release version: with my own build (in "release" mode, by the way) it takes about 43 sec. to save my database file (about 66.5 MB with a lot of object instances), but with the release version it's about 10 sec. less than that.
I tested it 3 times, one after the other, and it just does not get faster - always a bit more than 40 sec.
Do you have any idea why my local build of the latest source code is still about 30% slower than your latest release build?
Maybe you could provide a release version built from the latest source code - just to be sure it's not related to my local build for whatever reason.
Could you compile it as 64-bit and compare the release vs. the current version serialization speed (only the "true" option)? It would give me an idea of whether it's connected to the 32-bit-optimized buffer growing strategy. I expect the 64-bit version to have nearly the same performance as the last release.
Your app should run as a 64-bit process for this test.
I will also profile any tests if you can send me some.
Just did a test with a 64-bit build and did not see any performance change.
As I said, it would be great if you could make a release build again - I'm just not sure whether my build is somehow different. E.g., it seems you're using VS 2015 while I'm still using VS 2013. Therefore it would be good to test with a build you made, just to be sure.
"did not see any performance change"
Did you mean that 1) 32-bit works worse than the 32-bit release but 64-bit works the same as the 64-bit release, or 2) 32-bit works the same as 64-bit - both worse than the release?
https://github.com/AqlaSolutions/AqlaSerializer/releases/tag/v.1.0.0.931-beta
I meant that 32 bit is the same as 64 bit - both compiled by me, and both were about 10 sec. slower than the last release version.
But now I tested it again (several rounds) with your beta build - with the same bad result: it's still about 10 sec. slower than version 1.0.0.818. So it obviously was not a problem with my build, but really due to the code change.
So you can't reproduce the performance impact?
Maybe it has something to do with the bigger graph I'm serializing here. You remember issue #1 - that's the database where I see this 30% performance reduction from version 1.0.0.818 to 1.0.0.931.
OK, thank you for reporting. I'll try to find, through the tests, a case where the performance impact is reproducible, but you could speed things up by sending me an example from your code.
Thank you for your great support!
Unfortunately I can't send you the code, as I would almost need to send you the whole program - but did you test it with a graph database like the one we had in issue #1?
There I also sent you an example, if I remember correctly, and I'm just using the "workaround" you suggested - starting a new thread with a bigger stack size, like this:

    new Thread(s => Serialize(), 1024 * 1024 * 50).Start(); // 50 MB stack

That works really well - just that with version 1.0.0.931 it got 30% slower.
If you have any update I should give a try, no problem - I'm happy to help finding the problem.
I fixed it - you may check the last commit; I had to roll back some changes because the legacy code was very low-level optimized. You may set TypeModel.AllowStreamRewriting = false if you have speed issues, but I believe the performance difference now won't be noticeable except in some corner cases (very big monolithic objects like that blob - but in that case you'd care more about saving memory anyway).
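For illustration (only the property name comes from the comment above; the model creation is assumed):

    var model = TypeModel.Create();
    model.AllowStreamRewriting = false; // prefer raw speed over memory savings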
Sorry for the delay, just had too much work today.
Just gave it a try now, and it really works very well with the latest version: I see absolutely no speed difference anymore between the latest source code version and version 1.0.0.818 (I ran the old and the new version two times each, and it was always within +/- 0.5 sec.).
Also, the "big single byte[] object" still works without problems.
So this version is perfect now - thanks a lot for your great support again! :-) I guess it makes sense to release this as a new version now, as it's definitely a big improvement.
I've tried to use AqlaSerializer to serialize a bigger two-dimensional array of shorts, simply like this:
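For illustration, the code boils down to something like this (type and member names here are hypothetical; the attribute names assume AqlaSerializer's own attribute set):

    [SerializableType]
    public class TerrainGrid
    {
        [SerializableMember(1)]
        public short[,] Heights; // 7,201 x 21,601 elements
    }

    // hypothetical usage
    using (var file = File.Create("terrain.bin"))
        Serializer.Serialize(file, new TerrainGrid { Heights = heights });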
With an array size of 7,201 * 21,601 = 155,548,801 short elements, each element using 2 bytes, I would have expected a file size of 311,097,602 bytes plus a little bit of overhead for the serialization information. But I was shocked that the resulting file is 1,369,649,462 bytes, which is more than 4 times as much as the stored data itself needs!
What's the reason for this extremely high AqlaSerializer footprint? Is there a way to improve this?
Of course, I could serialize/deserialize such a structure very easily myself without AqlaSerializer, but as this array is part of a bigger structure where I already use AqlaSerializer, it would be great if I could serialize/deserialize everything with AqlaSerializer.
Best regards and thanks in advance,
Andreas