Hello @AqlaSolutions again!
We are still using your great serialization library; just as a reminder, we are the ones with the bigger graph database as discussed some time ago here: https://github.com/AqlaSolutions/AqlaSerializer/issues/1
Everything really works well with a database file size of about 100 MB and a deserialization time of about 3-5 seconds on most modern computers, which is fine.
But since we changed our product to use 64 bit if available (compiled with "Any CPU"), we have a few customers - all running Windows 7 64 bit - who complain about extremely long database loading times: instead of 3-5 seconds it takes about 5-10 minutes for them!
For testing we then forced our product to start in 32 bit mode (via "CorFlags.exe") and the database loads "normally" fast again in just 3-5 seconds, but when switching back to 64 bit mode it immediately gets extremely slow again.
We cannot reproduce this problem ourselves, not even on a Windows 7 64 bit system, so we are running a bit out of ideas as to what could really cause this issue and how to track it down.
Do you have any idea what could cause such an extreme performance drop when deserializing in a 64 bit process instead of a 32 bit process?
Thanks in advance for any ideas on that Andreas
Hi, the only thing I can suspect is BufferPool.cs related to byte[] allocation:
// try growing, else match
int newLength = (Is32Bit && toFitAtLeastBytes > 1024 * 1024 * 50)
// in 32bit when > 50mb grow only linearly
? (toFitAtLeastBytes + 1024 * 1024 * 10)
// in 64bit grow twice and add 10 kb extra if not enough
: Math.Max(buffer.Length * 2, toFitAtLeastBytes + 1024 * 10);
Please use a profiler.
First thanks a lot for your extremely quick reply! :-)
Good point, for testing I could just change that line to always use the "32 bit" way and see if that helps, right?
That's easier as a first test than using a profiler on the customer machine. ;-)
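I.e. for the test the growth line would become something like this (just forcing the linear "32 bit" branch, nothing else changed):
// forced test variant: always grow linearly, as in the 32-bit branch
int newLength = toFitAtLeastBytes + 1024 * 1024 * 10;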
I just sent a changed version (Is32Bit forced to true) to one of the customers who sees this issue.
I will let you know once I get feedback on whether this change made any difference.
It's really annoying when you cannot reproduce a problem yourself and have to wait for customer feedback...
Thanks again so far! Andreas
Just got feedback from the customer: Unfortunately the small code change made no difference. Still 10-15 min. database loading time while in 64 bit mode.
Do you have an idea how to profile that easily on the customer computer? Any tool you could recommend here?
Thanks again a lot for your support Andreas
I prefer to use JetBrains dotTrace; they have a trial version so you should be able to run it there. Or maybe even connect remotely. Good luck!
Thanks for the suggestion!
I will try to get onto the client computer via TeamViewer or similar to collect some data with dotTrace...
I just had the chance to access the customer PC via TeamViewer and recorded some dotTrace samples here: https://www.fs-flightcontrol.com/download/temp/FS-FlightControl-dotTrace.zip
The "Sampling" and "Tracing" dumps contain only the first 10 seconds or so of the database loading process, for the "Tracing" dump I really waited the full time (I think is was 7-8 min. or so) till the database loading has been completed in 64 bit mode.
Please note that the AqlaSerializer version used is not your very latest code, but the version after your commits from May 6, 2016. If I remember correctly there were some compatibility issues with code changes after this date which prevented us from updating. But please correct me if I'm wrong and we should be able to update while keeping read compatibility with existing databases.
Thanks a lot in advance for any ideas on this strange issue! Andreas
I checked the samples; it looks like GC takes a very long time: 9.6 of 14.8 seconds. There may be some unnecessary allocations. https://www.screencast.com/t/cYXoG8wbe4
There is a feature in dotTrace called "Compare Snapshots" which could be useful here if you profile again in 32bit mode (sampling, timeline, tracing). Also, to track down the memory allocation issue, I suggest you capture dotMemory samples too (both for 64bit and 32bit).
I doubt that the BasicList.Add method called by NetObjectCache.SetKeyedObject behaves differently on 64bit though - https://www.screencast.com/t/rouPIwU4IyYS
Note that you may improve performance by precompiling serialization dlls (RuntimeTypeModel.Compile) instead of using AutoCompile on the fly, which takes 5.5 seconds: https://www.screencast.com/t/4Wd1Boj9
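A rough sketch of that approach (the type and file names here are just placeholders):
// at development time: compile the model once into a dll
var model = TypeModel.Create();
// ...register surrogates and add types...
model.Compile("FlightControlModel", "FlightControlModel.dll");

// at run time: load the pre-compiled serializer instead of compiling on the fly
var serializer = (TypeModel)Activator.CreateInstance(
    Assembly.LoadFrom("FlightControlModel.dll").GetType("FlightControlModel"));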
First, thanks a lot for your analysis!
OK, I will then re-record the dumps on the client computer in 32 bit so that we can compare them. Additionally I will capture some memory dumps as well.
So, now I captured all 3 dotTrace profiling variants, both for 64 bit and 32 bit: https://www.fs-flightcontrol.com/download/temp/FS-FlightControl-Dumps.zip
Additionally, one memory snapshot taken while running in 64 bit is included.
I hope this sheds some light on this strange issue. :-)
Thanks again for your support! Andreas
Please also add the memory profiling results for 32bit, as I asked previously; I need to compare which objects are allocated more in 64bit.
The snapshots look somewhat different (due to optimizations) but I can't blame any exact thing. 64bit just needs more time for GC. Maybe GC has some different settings, or maybe the app swaps to the pagefile.
Can you perform profiling with Model.AutoCompile = false? It will work slower but should disable some optimizations and exclude compilation stuff from the results so they provide more useful data. Also I want to check whether the execution time is still so different when you disable compilation.
Also I suspect that the ResourceManager called from Control.OnPaint produces a lot of garbage. Maybe it works better in 32bit. Consider caching it. https://www.screencast.com/t/pCVbQZse793
Interesting that the ResourceManager wastes so many resources here.
I don't think that's the problem itself, though: the splash screen that calls this ResourceManager frequently is also shown while other things are loaded on program start, yet only the database load is delayed so much in 64 bit. Still, I have now cached the image that is loaded here, so at least this garbage should be avoided now. :-)
OK, I'll try to collect some more data according to what you wrote:
Is it enough to collect the data with Model.AutoCompile = false for just one of the dotTrace variants? I mean either "Sampling", "Tracing" or "Timeline" - or do you really need all of them again?
Just asking because it takes so long to collect even one of them in 64 bit due to the long loading time, and I want to keep it as short as possible so the customer stays willing to help find this issue. ;-)
Thanks again! Andreas
You can skip only one - "Sampling" - because "Tracing" replaces it, but I need "Timeline" anyway to find the memory allocations.
I think we really found the problem - it's the auto compilation!
As you requested I've now tested with Model.AutoCompile = false, and then it starts in more or less the same time in 64 Bit as in 32 Bit. I verified that several times, switching back and forth between the EXE files (with and without AutoCompile), and I could reproduce it every time.
Still, I recorded all the snapshots you requested, as it might be interesting to find out why AutoCompile causes this problem.
I'm currently waiting for the snapshots to be transferred: depending on how long that takes, it might be tomorrow morning before I post the link here.
But I wanted to let you know already that it obviously has something to do with the AutoCompile option. :-)
@ab-tools, for AutoCompile = false I don't think those snapshots will show anything useful, since the issue was not reproduced with it. At least now we know that it's somehow related to the compilation. But I still need the 32bit memory snapshot without the AutoCompile = false change. Btw, how long did it take to perform deserialization with AutoCompile = false?
The fast solution for you would be to pre-generate serialization dlls with Model.Compile(filename) on your developer machine and Reflection-load the generated model (to avoid a circular dependency) - this ensures the same compiled serialization code on any architecture - but I still would like to fix this in the serializer project. So could you please run these tests on the customer machine (a sketch follows below):
1. Model.Add and then manually call Model.Compile() without a file name (before using the model) - will that model still be slower on 64bit? It should work the same as auto compilation, but if it doesn't, that may point to the cause.
2. Model.Compile with a file name and, if the 32/64bit models work at different speeds, send me the generated 64bit and 32bit dlls for comparison along with their dotTrace profiling snapshots.
In both cases you need to actually use the model returned by the Compile method, not your original one.
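In code, the two test variants would look roughly like this (CreateConfiguredModel, Database and the streams are placeholders for your own setup):
// common setup: surrogate registration + type relations import (your existing code)
var model = CreateConfiguredModel();

// test 1: in-memory compile - use the RETURNED model for deserialization
TypeModel compiled = model.Compile(); // no file name
var db1 = compiled.Deserialize<Database>(stream1);

// test 2: compile to a dll on disk; if 32/64bit speeds differ, send me both dlls
TypeModel compiledToFile = model.Compile("FlightControlModel", "FlightControlModel.dll");
var db2 = compiledToFile.Deserialize<Database>(stream2);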
Thanks for your quick reply again!
btw how long did it take to perform deserialization with AutoCompile = false ?
The customer computer is quite slow (only an old Core2Duo), so it took about 10 sec. or so - but still A LOT quicker than the roughly 10 min. with AutoCompile enabled. :-)
I just uploaded the new dumps here:
https://www.fs-flightcontrol.com/download/temp/FS-FlightControl-Dumps2.zip
This also includes the requested 32 Bit memory dump with AutoCompile enabled (i.e. without the AutoCompile = false change).
I will prepare the two tests you mentioned and ask the customer to run them. As soon as I have a result I will let you know!
I'm just trying to prepare the test versions for the customer.
Maybe a stupid question, but I have never used these pre-compiled assemblies before: do I really have to add ALL possible types (including inherited sub-types) to the model manually? That would be quite a lot of classes in this case...
Until now I have only registered the required surrogates on the type model, nothing else.
And I guess the parameter applyDefaultBehaviourIfNew needs to be true to get the same result as currently with the AutoCompile option, right?
Right, the default behavior. But you don't need to add them manually.
Just export the state of your model once, after deserializing your data (which auto-adds types) - model.Serialize(model.ExportTypeRelations()) - and embed it with your app as a byte[] resource or a file.
Then, instead of adding types, modify your code to use model.ImportTypeRelations(model.Deserialize(...), true) - all types should be added the same way as they were registered in the original model.
The pseudo code:
var model = TypeModel.Create();
// ...register surrogates...
model.ImportTypeRelations(...); // the previously exported relations, with true as the second argument
var compiled = model.Compile();
var result = compiled.Deserialize(...); // note: use the returned compiled model, not the original
Depending on your situation you may need to reorder the surrogates registration after ImportTypeRelations.
More: https://github.com/AqlaSolutions/AqlaSerializer/wiki/Batch-types-registration
Can you also check it with CompileInPlace? The same way, using ImportTypeRelations.
OK, I see that's still a bit more to test, so it's probably better I do that via TeamViewer on the customer computer again myself. Unfortunately the customer only has time in the evenings, and I don't have time today or tomorrow evening, so I'll try to arrange a date with the customer sometime this weekend.
Just wanted to let you know, so you don't think I forgot about it if I don't reply quickly: I expect to have feedback sometime this weekend at the latest!
I just wanted to prepare the tests for the customer, but when trying to do something like
using (var memoryStream = new MemoryStream())
{
typeModel.Serialize(memoryStream, typeModel.ExportTypeRelations());
}
it seems that the TypeModel object does not have an ExportTypeRelations() function.
How can I access this?
Best regards Andreas
@ab-tools RuntimeTypeModel has ExportTypeRelations.
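So roughly, assuming your variable is declared as the TypeModel base type, a cast is enough:
// ExportTypeRelations is declared on RuntimeTypeModel, not on the TypeModel base class
var runtimeModel = (RuntimeTypeModel)typeModel;
using (var memoryStream = new MemoryStream())
{
    runtimeModel.Serialize(memoryStream, runtimeModel.ExportTypeRelations());
}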
Ah, I see, thanks!
Unfortunately I still have a lot of problems getting model.Compile() to work:
First, all our database classes are internal, which the compile method obviously does not like. So I changed all of the roughly 100 model classes and enum types to public.
Then the compile function worked fine, but it could no longer read the existing database.
So I thought I would completely rebuild the database with this "new" type model, but even the rebuilt database cannot be read afterwards, resulting in this exception upon deserialization:
AqlaSerializer.ProtoException: Group not read entirely or other group end problem
at AqlaSerializer.ProtoReader.EndSubItem(SubItemToken token, Boolean skipToEnd, ProtoReader reader) in C:\Austausch\#FS\libs\AqlaSerializer-v2\AqlaSerializer\protobuf-net\ProtoReader.cs:line 790
at AqlaSerializer.RootHelpers.ReadOwnFooter(Boolean seeking, Int32 formatVersion, ProtoReader source) in C:\Austausch\#FS\libs\AqlaSerializer-v2\AqlaSerializer\protobuf-net\RootHelpers.cs:line 104
at b49e09a7-92e8-4267-9d84-008d9252533e.ReadRoot(Database , ProtoReader )
at b49e09a7-92e8-4267-9d84-008d9252533e.Deserialize(Int32 , Object , ProtoReader , Boolean )
at AqlaSerializer.Meta.TypeModel.DeserializeCore(ProtoReader reader, Type type, Object value, Boolean noAutoCreate, Boolean isRoot) in C:\Austausch\#FS\libs\AqlaSerializer-v2\AqlaSerializer\protobuf-net\Meta\TypeModel.cs:line 805
at AqlaSerializer.Meta.TypeModel.Deserialize(Stream source, Object value, Type type, Int32 length, SerializationContext context) in C:\Austausch\#FS\libs\AqlaSerializer-v2\AqlaSerializer\protobuf-net\Meta\TypeModel.cs:line 764
at AqlaSerializer.Meta.TypeModel.Deserialize[T](Stream source) in C:\Austausch\#FS\libs\AqlaSerializer-v2\AqlaSerializer\protobuf-net\Meta\TypeModel.cs:line 660
at FlightControl.Helper.DatabaseHelper.loadDatabase(String databaseFileName, String& errorText, Boolean& navigationDataReloadRequired) in C:\Austausch\#FS\FS-Control\Helper\DatabaseHelper.cs:line 57
All types were added beforehand, as you suggested, with:
using (var memoryStream = new MemoryStream(typeModelData))
{
var modelData = typeModel.Deserialize<ModelTypeRelationsData>(memoryStream);
foreach (var typeData in modelData.Types)
{
typeModel.ImportTypeRelations(typeData);
}
}
This also seemed to work fine - at least there was no exception - but deserialization fails when I have called model.Compile() before. On the other hand, deserialization works just fine when I do not call this method and keep AutoCompile enabled.
Do you have any idea what I need to do to get the database deserialized again after model.Compile() is called?
I'm running out of ideas here...
Tomorrow afternoon I would have another chance to test something on the customer computer.
Thanks Andreas
It's hard to say anything from the code you provided.
1. You don't need the foreach for ImportTypeRelations; it accepts ModelTypeRelationsData directly as an argument, and you should also pass true as the second parameter.
2. Make sure you are not calling Compile() and then keeping the same instance for serialization (instead of the one returned by the Compile() call).
3. Try deserializing immediately after the Serialize call using the same model instance (of course you need to rewind the stream).
4. An alternative to Import/ExportTypeRelations is using the Add(Assembly) or Add(Type[]) method, but this has a high chance of requiring you to rebuild your database (because of adding subtypes in a different order). These methods add all supported types, e.g. those marked with attributes.
Thanks for your quick reaction again!
Your point 1 was the problem:
After I used ModelTypeRelationsData directly with the second argument being true, it worked!
Sorry, not sure why I missed this function overload before.
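For reference, the working import is simply:
using (var memoryStream = new MemoryStream(typeModelData))
{
    var modelData = typeModel.Deserialize<ModelTypeRelationsData>(memoryStream);
    // pass the whole ModelTypeRelationsData at once,
    // with the second parameter (applyDefaultBehaviourIfNew) set to true
    typeModel.ImportTypeRelations(modelData, true);
}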
OK, as I already mentioned, I should have the next chance to run tests on the customer computer tomorrow afternoon: then I will do all the tests you wrote above.
So, now I did all the tests you requested on the customer PC, with the following results:
// default, no type relations import
return typeModel;
=> very slow in 64 Bit (291 sec.), fast in 32 Bit (10 sec.).
// import type relations first
typeModel.CompileInPlace();
return typeModel;
=> very slow in 64 Bit (370 sec.), fast in 32 Bit (9 sec.).
// import type relations first
return typeModel.Compile();
=> very slow in 64 Bit (359 sec.), fast in 32 Bit (8 sec.).
// import type relations first
return typeModel.Compile("FlightControlModel", "FlightControlModel.dll");
=> very slow in 64 Bit (297 sec.), fast in 32 Bit (8 sec.).
// default, no type relations import
typeModel.AutoCompile = false;
return typeModel;
=> much faster in 64 Bit than the others (39 sec.), but 32 Bit still seems a bit faster (16 sec.).
As I saw time differences of +/- 60 sec. between runs of the slow 64 Bit tests anyway, the slightly faster database load of typeModel.Compile when used with parameters to write the assembly should not be over-interpreted. All 64 Bit tests were basically "the same extremely slow" except the one using typeModel.AutoCompile = false;.
Please also find attached the two type model DLLs as requested.
I still have access to the customer computer right now (for the next hour or so), so if you have another idea what to test, I could do it directly now...
Thanks for testing it. I checked the dlls you sent me and they are 100% identical except for the GUID field. The same code just runs at different speeds. A few more things to try:
1. Try loading the model from the pre-generated dll - (TypeModel)Activator.CreateInstance(Assembly.Load("FlightControlModel.dll").GetType("FlightControlModel")) - is it still slow on 64bit?
2. Measure the deserialization time separately (as I understand you have been including Compile() calls into timings). If only the Compile() call is slow you can just distribute pre-compiled model dlls within your app.
3. Time the Compile() call only.
4. Also compare a "cold" (first) start against a "hot" (repeated) one.
@ab-tools if the reason is the JIT compiler this could be solved by updating the .NET Framework + latest patches. There is an issue with how the x64 JIT handles long methods - https://connect.microsoft.com/VisualStudio/feedback/details/508748/memory-consumption-alot-higher-on-x64-for-xslcompiledtransform-transform-then-on-x86 - but since the issue is not reproducible on your machine it could already be fixed in .NET:
First, the x64 JIT compiler has a few algorithms that are quadratically scaling. One of them is the debug info generator, unfortunately. So for very large methods, it really gets out of control.
OK, I will try to get access to the customer PC for another test round. ;-)
as I understand you have been including Compile() calls into timings
In fact not:
I just checked, and the times I wrote in the protocol above really cover only opening the database FileStream and calling the Deserialize method - nothing else. The compilation of the type model was done before that.
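In code terms, the measured region is roughly only this (compiledModel stands for the model prepared beforehand):
// using System.Diagnostics; - for Stopwatch
var stopwatch = Stopwatch.StartNew();
using (var fileStream = File.OpenRead(databaseFileName))
{
    database = compiledModel.Deserialize<Database>(fileStream);
}
stopwatch.Stop(); // model compilation happened before this block and is not included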
@ab-tools if the reason is the JIT compiler this could be solved by updating the .NET Framework + latest patches. There is an issue with how the x64 JIT handles long methods - https://connect.microsoft.com/VisualStudio/feedback/details/508748/memory-consumption-alot-higher-on-x64-for-xslcompiledtransform-transform-then-on-x86 - but since the issue is not reproducible on your machine it could already be fixed in .NET
What's the best (and most reliable) way to check the exact .NET 4.x version number? Then I could check which version the customer has on his PC and compare it to the version I have here.
@ab-tools maybe this: https://docs.microsoft.com/en-us/dotnet/framework/migration-guide/how-to-determine-which-net-framework-updates-are-installed - compare the outputs on your machine and your customer's. Or just ensure that Windows Update works properly.
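That page essentially boils down to reading the Release DWORD from the registry, something like this (the version thresholds are taken from that doc):
// using Microsoft.Win32; - for Registry
const string subkey = @"SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full";
using (var ndpKey = Registry.LocalMachine.OpenSubKey(subkey))
{
    // per the doc: 393295/393297 = 4.6, 394254/394271 = 4.6.1, 394802/394806 = 4.6.2
    var release = (int)(ndpKey?.GetValue("Release") ?? 0);
    Console.WriteLine(release >= 394802
        ? ".NET Framework 4.6.2 or later (Release " + release + ")"
        : ".NET Framework older than 4.6.2 (Release " + release + ")");
}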
OK, I will check that.
Just to be sure I got the "cold" and "hot" start right: you mean the difference between loading in 64 bit the first time and the following times, right?
Here I think I can directly say that this does not have a considerable influence: As I mentioned in the test protocol above already, I had different loading times with exactly the same assembly of +/- 60 sec. directly one after the other anyway. Also the very first time I tried it yesterday did not to have a bigger difference than this time in one or the other direction.
Hi guys, if the issue is not reproducible on another computer, it's definitely related to software configuration or hardware. I would recommend trying this wonderful tool: https://www.microsoft.com/en-us/download/details.aspx?id=30135
During the previous test it was not exactly the same assembly; those assemblies were produced by different compilation calls and thus JIT-ed each time.
Also the very first time I tried it yesterday did not to have a bigger difference than this time in one or the other direction.
Do you mean that writes are just as slow?
If the issue is not reproducible on another computer, it's definitely related to software configuration or hardware. I would recommend trying to use this wonderful tool https://www.microsoft.com/en-us/download/details.aspx?id=30135
Thanks, @aienabled, for helping. :-)
So far it is reproducible on about 3-4 customer computers that I know of. But good point - I will try the .NET repair tool you mentioned on the customer PC.
Do you mean that writes are just as slow?
@AqlaSolutions, why writing? On the customer PC I only tried loading the database (which is the much more important case, as it happens on every program start), not writing it.
@ab-tools, if the issue is reproducible on other computers then it's more likely to be a software issue than a hardware one. Other than updating and repairing the .NET Framework, I would also recommend looking into the system monitor to check whether any background applications could affect performance. It could be antivirus software.
@ab-tools, in the sentence
Also the very first time I tried it yesterday did not to have a bigger difference than this time in one or the other direction.
what do you mean by:
did not to have a bigger difference than this time in one or the other direction
@aienabled: Good point, but there was no anti-virus or firewall program running - that's always the first thing I check. :-) But I will create a SystemInfo dump next time I'm on this customer PC to check the running processes in more detail.
@AqlaSolutions: Sorry, that sentence was really a bit confusing. I just wanted to say that between the very first time I ran a test on the client PC yesterday and running the same test again directly afterwards, there was no difference bigger than +/- 60 sec. And I saw differences of that size later on as well.
But anyway, I can do such a test again just to be sure.
@ab-tools ok, ensure that you are not creating new models with Compile() on each read; ideally just use already pre-generated dlls.
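If you do compile at run time, compile once and cache the result, e.g.:
// compile once per process lifetime and reuse the result;
// never call Compile() inside the per-read code path
static readonly Lazy<TypeModel> CompiledModel = new Lazy<TypeModel>(() =>
{
    var model = TypeModel.Create();
    // ...register surrogates, ImportTypeRelations(...)...
    return model.Compile();
});

// usage on every read:
// var db = CompiledModel.Value.Deserialize<Database>(stream);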
I had the chance to test on the customer computer again, and I can't believe it, but we really found the problem: in fact, .NET itself was the problem!
On the customer computer all Windows Updates were disabled (the last update was from the end of 2014), which is normal as it's a flight simulation computer - everybody who has a system that "just needs to run as it is" will first of all disable all Windows Updates. :-)
But due to another application the customer uses, he had to install .NET 4.6.0 on this computer manually.
Of course he did not want to enable Windows Update, but I suggested installing the latest bug fix update, .NET 4.6.2, manually - and this did the trick!
Although our product only targets .NET 4.0, the 4.x runtime is an in-place upgrade, so the app was actually running on the installed 4.6.0 runtime; after upgrading from .NET 4.6.0 to 4.6.2 the massive database loading delay in 64 Bit was immediately gone.
I really didn't think that .NET could be the problem here, but obviously version 4.6.0 was a bit buggy. :-)
That was a hard way to go - thanks a lot for your support in finding this issue, which in the end was not related to either of our products at all!
@ab-tools, we learned the same lesson at least two times... Since then, the first thing we do is run the .NET Framework Repair Tool and install the latest patch version (i.e. 4.5 -> 4.5.2, 4.6 -> 4.6.2). With .NET Core 2.0 this problem could finally be resolved - just ship it together with the product. .NET Core 2.0 has almost all of the API from .NET Framework, so the transition should be smooth. But WPF applications cannot run under .NET Core, alas.
@ab-tools cool, finally we resolved this
@aienabled: We haven't looked into .NET Core yet, to be honest. Our product is still WinForms based. Is that supported by .NET Core 2.0?
@AqlaSolutions: Two short other questions:
Thanks again! Andreas
@ab-tools ,
Our product is still WinForms based. Is that supported by .NET Core 2.0?
Unfortunately, no, and it seems it will never be supported. AFAIK, only UWP (which is very similar to WPF) and Xamarin Forms have support now.
@ab-tools,
Thank you both!
I now also got the DLL loading to work - you just need to use Assembly.Load(@"FlightControlModel"), i.e. without the ".dll"!
But as the runtime compilation works fast enough, I probably won't use that; it only makes everything more complicated and adds another source of errors. ;-)
Thanks again for your great support! Andreas