DomCR / ACadSharp

C# library to read/write cad files like dxf/dwg.
MIT License
414 stars 115 forks source link

Slow DWG reader #63

Closed gktval closed 1 year ago

gktval commented 1 year ago

Reading a dwg that is roughly 1400kb took 25 seconds to complete. I tracked down where the bottleneck was occurring and it happens in the DwgObjectSecionReader.cs. It seams that in every read loop, the stream gets copied at least 3 times in the getEntityType method. I changed this by creating a buffer in the initializer for the class and then creating a new memory stream from that buffer (that way the buffer only gets copied once when the class initializes). This reduced the read time to 500 ms. Just fyi. Great project.

DomCR commented 1 year ago

Hi @gktval,

That part of the code is quite messy and very delicate at the same time, Autodesk divides each object in 2 or 3 sections depending on the version, ObjectData|Handles and ObjectData|Text|Handles to read in order that is documented the code creates this 2 or 3 streams and puts them all together to read it.

Also there is a CrcCheck that slows the reader a lot because it performs a check in every byte of the stream, make sure that this option in the reader is set to false, this is not needed for the reader to do the job but is a validation for Autocad.

About your improvement, if you have the code please create a PR pointing to this repo, I'll be happy to review it.

Thanks!

Hexer611 commented 1 year ago

Hello, I have been trying to use this importer for a project. It worked great for small dwg file but the reading speed was so slow. I have been following the issues but didn't find anything that works for me. I tried what @gktval said works like a charm. It greatly increases the speed. I can create a pull request about this too. But I am nut sure if i sould create it or wait for @gktval . Great project @DomCR :D

gktval commented 1 year ago

I read the other "slow" comment issues with regards to the CrcCheck. It looks like it is by default set to false, so I did not have it turned on. I upgraded the framework to Net7, removed the shared classes by making them Libraries. I am also using Maui, which is a little buggy and had problems with some of the build configuration in the ACadSharp.csproj. All that to say, I don't feel comfortable creating a PR because I have changes that may affect other people's projects. But I am happy to share the file if you want to test it out. Unfortunately, I cannot share the dwg, though as I got it from a friend in an engineering company and it is proprietary.

DwgObjectSectionReader.zip

gktval commented 1 year ago

@Hexer611 If you make a PR, there are a few other problems I found. Do as you wish with them. Or @DomCR, if you prefer I can create separate Issues for each, but there are quite a few.

In https://github.com/DomCR/ACadSharp/blob/master/ACadSharp/Entities/ViewPort.cs, FrozenLayers method is never initialized and will throw a NullReferenceException in the CadViewportTemplate. (so just add FrozenLayers = new List<Layer>(); to the constructor.

In https://github.com/DomCR/ACadSharp/blob/master/ACadSharp/Classes/DxfClassCollection.cs, the list 'Dictionary' will throw an exception if an item is added twice (which in my case, the dxf for some reason has two items the same). Not sure what to do here, my easy fix was to change line 19 to this: _list[item.DxfName]= item; which just replaces any duplicate classes.

In the DxfReaderBase class, I get an InvalidCastException because this.LastValue is trying to convert a string into ULong in the LastValueAsHandle property. This is a result of the Style property in the MText Entity: https://github.com/DomCR/ACadSharp/blob/master/ACadSharp/Entities/MText.cs. The DfxCodeValue Attribute is set to Handle. As a result, it is trying to convert the string "Bold" to a ULong handle. Changing the DfxCodeValue attribute to Name fixes this.

Those are the ones I remember at the moment. Thanks.

Hexer611 commented 1 year ago

I faced some of the errors you mentioned, but since I wasn't sure what to do about them. I will wait a response from @DomCR before creating a PR about the speed error and those bugs. Thanks for pointing them out :)

DomCR commented 1 year ago

To keep the things sorted I would prefer to have different issues and different branches for each one of it, that way is easier to stay on track and have the different updates isolated form other possible issues.

About the creation of the branches:

Thanks a lot!

Hexer611 commented 1 year ago

You can try to read this 4MB file. I waited more than 10 minutes for this one modern_house.zip

DomCR commented 1 year ago

I've tried the file, 10 min and throws an Exception which is a NotImplementedException that could be avoided.

I'll work on the failsafe to avoid this kinds of errors and the performance enchantment.

Hexer611 commented 1 year ago

Just to be clear i want to ask a question to @DomCR . @gktval said that he was not comfortable with creating a branch, but you mentioned him to create a PR for the speed bug. I solved the speed problem too. In conclusion do I create a PR or are you waiting a response from @gktval ?

DomCR commented 1 year ago

I've created a PR for this, if you want you can push your branch to that PR, I'll compare both solutions and merge it.

Hexer611 commented 1 year ago

Okay, I will push my changes to "issue-63_slow-dwg-reader".

Edit: I don't have permission to do that, have I misunderstood you here?

DomCR commented 1 year ago

@Hexer611 push the changes to master then, thanks

Hexer611 commented 1 year ago

I created the pull request to the master. I checked the PR you created for this issue and you probably solved it by your commits, my PR is probably useless at this point :D I should mention that this is my first time contributing to a public project. I didn't even know about forking. I hope I didn't make any mistake creating the PR :D

DomCR commented 1 year ago

@Hexer611 don't worry, happy to have people working on this repo.

I'll check you PR and allow you to merge, @gkval and you are the ones that spotted this issue so I'm happy to use your PR.

Thank you both!