DomCR / ACadSharp

C# library to read/write cad files like dxf/dwg.
MIT License

Help needed: How to read substructures (e. g. CONTEXT_DATA in MULTILEADERSTYLE) #144

Closed mme1950 closed 1 year ago

mme1950 commented 1 year ago

We are using this very fine library to write a DWG-to-SVG converter preserving DWG structures such as BlockRecords.

We need to read MULTILEADER entities. A reader for MULTILEADER entities is currently not available.

We started implementing readers by creating a reader for MULTILEADERSTYLE objects. We are getting more and more into it and found that it was quite easy to understand after all.

The MULTILEADER entity, however, contains substructures such as CONTEXT_DATA {} and a list of LEADER objects etc.

Can anyone give a hint on how to start reading a substructure?

DomCR commented 1 year ago

Hi @mme1950,

The current version of ACadSharp does not implement the entity MULTILEADER, I'll open a branch to take care of this issue.

mme1950 commented 1 year ago

Thank you for your reply. Do you have an idea how to read the substructures?

mme1950 commented 1 year ago

Classes to implement - suggested class names:

DomCR commented 1 year ago

Once I start working on the implementation I'll contact you with more information, right now I'm clueless.

mme1950 commented 1 year ago

Do you prefer to do the implementation yourself? I would be interested in doing it if you do not mind.

It seems that the reader infrastructure does not yet support these substructures. I did not find any examples of such cases in the code.

DomCR commented 1 year ago

Of course! Open a PR into this repo and I'll be happy to review it.

If you need any help with the code or have any doubts about the existing code, I'll be happy to help.

mme1950 commented 1 year ago

Hi @DomCR, I am currently trying to find out which enum types are needed for the implementation of MultiLeader and MultiLeaderStyle. It seems that the implemented types used for Leader cannot be reused, since they are defined differently:

Are we sure that the available documentation is correct? Perhaps it is wise to create leaders and multileaders in AutoCAD and check the values.

BTW: Should not all enum types be in namespace ACadSharp.Types?

DomCR commented 1 year ago

Hi @mme1950,

About the documentation, I use the dxf online documentation in this link and the files in the reference folder.

"Are we sure that the available documentation is correct?" Autodesk documentation is a mess... it does not always match reality, and all the projects that work with dwg/dxf have the same issues.

"create leaders and multileaders in AutoCad" is actually the best way to test it. I would suggest creating a sample and adding the path to the test project; I'll add the instances to the samples that the project uses for the generic testing. Keep in mind to save the file in different versions so you can make sure it works for all of them.

"Should not all enum types be in namespace ACadSharp.Types?" No. At some point I noticed that I had multiple enums representing the same thing and decided to put some of them in this namespace. It was a mistake; the folder and namespace should disappear and the enums should be redistributed in a more sensible way.

If you have to create enums, add them in the namespace where they are used; if they are used at multiple levels, add them in the top one. This is how it should be.

mme1950 commented 1 year ago

Hi @DomCR, I made some progress reading the data for MultiLeader entity.

Too many inconsistencies to proceed with trial and error.

mme1950 commented 1 year ago

Next step: Reading LEADER, LEADER_LINE substructures succeeds.

DomCR commented 1 year ago

Hi @mme1950

To read the fields you can also use the _mergedReaders, which manages the different stream offsets in the same way. There are some exceptions where the _mergedReaders should not be used, but it is easier if you want.

Note: if you need to check values by hand, you can use the static method DwgStreamReaderBase.Explore. It is quite tricky and very time-consuming, because you need to check each value against the real one in AutoCAD or somewhere else, but it is pretty useful when documentation is missing.

DomCR commented 1 year ago

Feel free to create the PR with your changes anytime; that way I will be able to check the code and help you if needed.

mme1950 commented 1 year ago

Hi @DomCR,

the Explore method seems to be very helpful. I should have noticed it before. Anyhow, I took a similar approach: I scanned the data stream in bit-steps to find the offset of fields with expected values found in the DXF export.
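The bit-step scan described above can be sketched roughly as follows. This is only an illustration, not code from ACadSharp: `BitScanSketch`, `ReadByteAt` and `FindDouble` are hypothetical names, and the sketch assumes the value is stored as a raw little-endian 8-byte double that may start at any bit offset, with bits consumed most-significant first. Bit-compressed field types would need their own decoding.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ACadSharp.
public static class BitScanSketch
{
    // Reads one byte starting at an arbitrary bit offset (MSB-first bit order).
    private static byte ReadByteAt(byte[] data, long bitOffset)
    {
        int index = (int)(bitOffset >> 3);
        int shift = (int)(bitOffset & 7);
        if (shift == 0)
            return data[index];
        return (byte)((data[index] << shift) | (data[index + 1] >> (8 - shift)));
    }

    // Returns every bit offset at which the next 64 bits equal the expected double.
    public static List<long> FindDouble(byte[] data, double expected)
    {
        byte[] pattern = BitConverter.GetBytes(expected);
        if (!BitConverter.IsLittleEndian)
            Array.Reverse(pattern); // assumption: the file stores doubles little-endian

        var hits = new List<long>();
        long lastBit = (long)data.Length * 8 - 64;
        for (long bit = 0; bit <= lastBit; bit++)
        {
            bool match = true;
            for (int i = 0; i < 8 && match; i++)
                match = ReadByteAt(data, bit + 8 * i) == pattern[i];
            if (match)
                hits.Add(bit);
        }
        return hits;
    }
}
```

A call such as `BitScanSketch.FindDouble(objectBytes, 2.5)` returns candidate offsets only; a value that is common in the file (0.0, 1.0) will produce many false hits, so distinctive values taken from the DXF export work best.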

I have now reached the end of the MultiLeader data, and it seems that I successfully read all properties for a MultiLeader with MTEXT content.

Now I am trying to read a MultiLeader with BlockContent, and I see a very strange effect: I have to read several handles with some other fields in between, which are read correctly. Instead of the right handle values, I get the value of the previous handle reference each time. I can see the right values in the DXF.

I do not fully understand the magic behind the different readers. Now I saw that in DwgObjectSectionReader nearly all handles are read with this.handleReference. I did it the same way, but it makes no difference.

DomCR commented 1 year ago

Using this.handleReference stores the handle in _handles so that the object it points to gets read. If you don't use this method, you risk losing information or getting stuck in an infinite loop reading the same object.
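As a rough illustration of that bookkeeping (not ACadSharp's actual code), a reader can keep a queue of handles still to be read plus a set of handles already seen. `HandleQueueSketch`, `Reference` and `TryNext` are hypothetical names, and treating handle 0 as "no reference" is an assumption of this sketch.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the reader's handle bookkeeping; not ACadSharp's real types.
public sealed class HandleQueueSketch
{
    private readonly Queue<ulong> _pending = new Queue<ulong>();
    private readonly HashSet<ulong> _seen = new HashSet<ulong>();

    // Records a referenced handle so the object it points to is read later.
    // A handle seen before is not queued again, so the same object is never read twice.
    public ulong Reference(ulong handle)
    {
        if (handle != 0 && _seen.Add(handle))
            _pending.Enqueue(handle);
        return handle;
    }

    // Hands out the next handle whose object still has to be read.
    public bool TryNext(out ulong handle)
    {
        if (_pending.Count > 0)
        {
            handle = _pending.Dequeue();
            return true;
        }
        handle = 0;
        return false;
    }
}
```

Reading a handle without registering it would leave the referenced object unread, and queueing without the "seen" check could make two objects that reference each other be read again and again.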

mme1950 commented 1 year ago

I found my problem: I failed to read one handle, so every following handle received the value that belonged to the one before it. Now it seems to be OK. I am glad that it was so simple. But I have to look into the code to understand this behaviour.

mme1950 commented 1 year ago

I think I cannot push anything into the branch you opened because I have no rights.

@DomCR Can you grant me the rights to push changes?

DomCR commented 1 year ago

Push directly to master then it will be easier.

mme1950 commented 1 year ago

@DomCR - I am back from holiday.

Hi Albert, I think I cannot do anything in this repository because I do not have the permissions needed. I think you have to invite me and give me the required role. Otherwise, I would not mind sending my code as ZIP ...

DomCR commented 1 year ago

You can create a PR directly to this repo pointing to master, once you have the PR ready I'll review it.

mme1950 commented 1 year ago

@DomCR should I not work with the branch you created? I tried to update it with the changes committed to master in the meantime. This does not seem to be possible for me. You suggest cloning master, modifying it, and creating a PR - really?

DomCR commented 1 year ago

Create a fork of this repo, make the changes in your fork, and then open a PR against master. This should do the trick.

mme1950 commented 1 year ago

Seems strange to me.

DomCR commented 1 year ago

This is the usual way to go when you want to collaborate into another repo.

mme1950 commented 1 year ago

OK, I did not know that. We forked the repository.

DomCR commented 1 year ago

Hi @mme1950,

I've seen that you opened 2 PRs pointing to this repo but you've closed them right away, are you still interested in collaborating with this project or will you be developing your own fork?

mme1950 commented 1 year ago

Sorry, this was a mistake. We are, of course, highly interested in getting your comments and feedback. What is the normal process for bringing the changes in the forked project into the main project?

Currently, we still have to investigate some problems. The data we read from DWG and see in DXF seems to be different from what we see in AutoCAD. Obviously the description of the group code in the PDF document is not correct. And there is a gap of two bytes in the data stream that we would like to understand. Another open task is to take care of the version-dependent fields.

mme1950 commented 1 year ago

Hi @DomCR, I am not familiar with the collaboration process. So when we want to develop an additional feature for your library the usual process is to fork it and create a PR in the forked project. Then you can review the PR in the forked project. Can you transfer the PR to your repository? What should I do now to enable the next steps?

DomCR commented 1 year ago

The usual process follows these steps:

  1. Create a fork of the repository that you want to contribute to.
  2. Create a branch in your fork so you can work on the issue that you want to fix.
  3. Once you have finished your task, create a PR to the original repository.
  4. Wait for the review and the merge into the original repository; then your changes will be added to the source.

I've found a couple of sites that explain this process in more detail, hope that helps:
Simple and quick to understand: https://github.com/firstcontributions/first-contributions
Blog with helpful links: https://www.makeuseof.com/how-to-contribute-to-open-source-projects/

mme1950 commented 1 year ago

@DomCR Thanks!

We still have to solve some problems. Then we will create a PR.

DomCR commented 1 year ago

You can open the PR whenever you feel ready; you have the option to set up the PR as a Draft for an incomplete feature.

mme1950 commented 1 year ago

Hi @DomCR, we opened a PR. Did you have time to review it?

mme1950 commented 1 year ago

Hi @DomCR, can you help me with the version infos in the document "Open Design Specification for .dwg files"? In the MLEADER and MLeaderAnnotContext tables some hints regarding the ACAD version are given. I do not know how to read them.

In the MLEADER table I find the Version "R2010+" before the first field to read. I doubt that it means that all following fields were not present before R2010. The next version info is "-R2007". It seems that the arrowhead collection following this line is missing in the current version (2023), but I must read all following fields up to the version info "R2010+". There is no Boolean indicating the "-R2007" condition.

In the MLeaderAnnotContext table the version infos seem to be more reasonable. A section labelled with "R2010" is ended by "Common". However, the "end repeat" over leader-root fields is marked with "R2010" while the "begin repeat" is not.

DomCR commented 1 year ago

Hi @mme1950

About the versions, you can check the class DwgSectionIO; in there you will find the conditionals and how they are treated.

"In the MLEADER table I find the Version "R2010+" before the first field to read. I doubt that it means that all following fields were not present before R2010."

That's what it means: the version marker indicates the fields that are only used for that version specifically, and sometimes they are in a different order.

I would recommend copying the lines from the PDF into the code so you don't make any mistakes when adding the conditionals for the versions. It is a slow process because you have to handle each version separately, but that's the only way to go.
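As a hedged sketch of that approach (not the real DwgSectionIO code), each version marker from the specification table can become a boolean that guards the block of fields below it. The enum `DwgVersionSketch`, its numeric values, and the block labels are placeholders invented for this illustration, and the sketch reads "-R2007" as "up to R2007", which is only one possible interpretation.

```csharp
using System.Collections.Generic;

// Hypothetical version codes; the values only need to be ordered by release.
public enum DwgVersionSketch
{
    R2000 = 1015, R2004 = 1018, R2007 = 1021, R2010 = 1024, R2013 = 1027, R2018 = 1032
}

public static class VersionGateSketch
{
    // Returns the blocks that would be read for a given file version,
    // in the order in which they appear in the specification table.
    public static List<string> BlocksToRead(DwgVersionSketch version)
    {
        bool r2010Plus = version >= DwgVersionSketch.R2010; // "R2010+" marker
        bool upToR2007 = version <= DwgVersionSketch.R2007; // assumed meaning of "-R2007"

        var blocks = new List<string>();
        blocks.Add("common");                  // fields under "Common"
        if (upToR2007)
            blocks.Add("up to R2007 only");    // fields under the "-R2007" marker
        if (r2010Plus)
            blocks.Add("R2010+ only");         // fields under the "R2010+" marker
        blocks.Add("common tail");             // fields after the next "Common"
        return blocks;
    }
}
```

Keeping one guarded block per marker, in the same order as the PDF, makes it easier to compare the code against the table line by line.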

mme1950 commented 1 year ago

Hi @DomCR, sorry, I missed the line Common version info at the beginning of the MLEADER table. So all fields until -2007 are common, i.e. should appear in the current version. Then we have the sequence beginning after -R2007 until R2010+. Assuming -R2007 means until version R2007 these fields should not appear in the current version, but some of them definitely do. In a DXF file of the current version I see the fields following the group code 293 before -R2007 and after it the group code 294 etc. In DWG I can read ONE zero BL value. All following fields can be read properly.

So the -R2007 seems to be wrong. I simply omit the Arrowhead list.

mme1950 commented 1 year ago

Hi @DomCR,

general remark: according to my understanding a style object provides a reusable set of standard properties for the Entity it is referenced by, e.g. a MultiLeaderStyle referenced by a MultiLeader. Thus it does not seem appropriate to clone the style object with the entity.

However, in the Clone method of other entities, e.g. MText the style is cloned.

DomCR commented 1 year ago

"general remark: according to my understanding a style object provides a reusable set of standard properties for the Entity it is referenced by, e.g. a MultiLeaderStyle referenced by a MultiLeader. Thus it does not seem appropriate to clone the style object with the entity.

However, in the Clone method of other entities, e.g. MText the style is cloned."

Yes, that's how they work, but they need to be cloned because the clones will be detached from the CadDocument that they are currently in. If that does not happen, the file may have wrong references, which can cause it to fail when loading in other frameworks.

I know that this is counterintuitive and less practical when programming, but for now it is the only way to ensure the file integrity.
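A minimal sketch of the idea, using hypothetical stand-in types (`DocumentSketch`, `StyleSketch`, `EntitySketch`) rather than ACadSharp's real classes: the clone gets its own copy of the style, and neither the clone nor its style keeps a reference to the original document.

```csharp
// Hypothetical minimal types; ACadSharp's real classes are richer.
public sealed class DocumentSketch { }

public sealed class StyleSketch
{
    public string Name = "Standard";
    public DocumentSketch Owner; // document the style currently belongs to, if any

    public StyleSketch Clone()
    {
        // The copy keeps the properties but is detached from any document.
        return new StyleSketch { Name = this.Name, Owner = null };
    }
}

public sealed class EntitySketch
{
    public StyleSketch Style = new StyleSketch();
    public DocumentSketch Owner;

    public EntitySketch Clone()
    {
        // Cloning the style as well means the detached entity never points
        // at an object that still lives in the original document.
        return new EntitySketch { Style = this.Style.Clone(), Owner = null };
    }
}
```

If the clone shared the original style instance, adding the clone to a second document would leave it referencing a style owned by the first one, which is the kind of wrong reference described above.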

DomCR commented 1 year ago

"Hi @DomCR, sorry, I missed the line Common version info at the beginning of the MLEADER table. So all fields until -2007 are common, i.e. should appear in the current version. Then we have the sequence beginning after -R2007 until R2010+. Assuming -R2007 means until version R2007 these fields should not appear in the current version, but some of them definitely do. In a DXF file of the current version I see the fields following the group code 293 before -R2007 and after it the group code 294 etc. In DWG I can read ONE zero BL value. All following fields can be read properly.

So the -R2007 seems to be wrong. I simply omit the Arrowhead list."

The minus sign is before the year, when in all other cases it is after the year. It may be a typo in the PDF document; I haven't checked the values myself, but it would make sense.

mme1950 commented 1 year ago

"Hi @DomCR, sorry, I missed the line Common version info at the beginning of the MLEADER table. So all fields until -2007 are common, i.e. should appear in the current version. Then we have the sequence beginning after -R2007 until R2010+. Assuming -R2007 means until version R2007 these fields should not appear in the current version, but some of them definitely do. In a DXF file of the current version I see the fields following the group code 293 before -R2007 and after it the group code 294 etc. In DWG I can read ONE zero BL value. All following fields can be read properly. So the -R2007 seems to be wrong. I simply omit the Arrowhead list."

"The minus sign is before the year, when in all other cases it is after the year. It may be a typo in the PDF document; I haven't checked the values myself, but it would make sense."

At least one of the two collections following the -R2007 does not appear in the current-version DWG. Thus it seems to make sense that -R2007 means until R2007. The Arrowhead collection may be an obsolete concept to specify individual arrowheads for each LeaderLine. After R2010 Arrowheads can be associated with each LeaderLine. But how was it between R2007 and R2010?

DomCR commented 1 year ago

I'm sorry, but I don't have an answer to that. You can try to find the documentation for the DXF format for those specific years and compare it to the current one.

You can try to look in https://help.autodesk.com/ but the older versions don't appear, the oldest that I could find changing the year in the URL was https://help.autodesk.com/view/ACD/2015/ENU/

mme1950 commented 1 year ago

"Hi @DomCR, general remark: according to my understanding a style object provides a reusable set of standard properties for the Entity it is referenced by, e.g. a MultiLeaderStyle referenced by a MultiLeader. Thus it does not seem appropriate to clone the style object with the entity. However, in the Clone method of other entities, e.g. MText the style is cloned."

"Yes, that's how they work, but they need to be cloned because the clones will be detached from the CadDocument that they are currently in. If that does not happen, the file may have wrong references, which can cause it to fail when loading in other frameworks.

I know that this is counterintuitive and less practical when programming, but for now it is the only way to ensure the file integrity."

I do not fully understand the implication of this rule. Must all referenced objects be cloned? Some other questions arise:

mme1950 commented 1 year ago

"I'm sorry, but I don't have an answer to that. You can try to find the documentation for the DXF format for those specific years and compare it to the current one.

You can try to look in https://help.autodesk.com/ but the older versions don't appear, the oldest that I could find changing the year in the URL was https://help.autodesk.com/view/ACD/2015/ENU/"

The PDF document says that the MultiLeader entity was introduced with version 21 = AC1014 = R13_14Only, i.e. before AC1021 = R2007.

I did not find anything about R2007.

spitfirekmt commented 1 year ago

AutoCAD. DXF Reference. 2004.pdf
AutoCAD. DXF Reference. 2005.pdf
AutoCAD. DXF Reference. 2008.pdf
AutoCAD. DXF Reference. 2010.pdf
AutoCAD. DXF Reference. 2011.pdf
AutoCAD. DXF Reference. 2012.pdf
AutoCAD. DXF Reference. 2013.pdf
AutoCAD. DXF Reference. 2014.pdf
AutoCAD. DXF Reference. Release13.txt
AutoCAD. DXF Reference. Release10_NOT_OFFICIAL.txt
AutoCAD. DXF Reference. Release12_NOT_OFFICIAL.txt

Hello, I hope this will help in some way.

mme1950 commented 1 year ago

Thank you very much.

mme1950 commented 1 year ago

Hi @DomCR

From DXF-Documentation we can see:

Using the specification in the PDF "Open Design Specification for .dwg files" we can read DWG files created by the current AutoCAD 2023 version.

Conclusion:

DomCR commented 1 year ago

Hi @mme1950,

To address this version incompatibility, the methods readMultiLeader and readMultiLeaderStyle should check the file version and just return null; the reader will automatically notify the user that the object hasn't been read. As you said, we should not throw an exception, as this would cripple the reader system.
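A rough sketch of that pattern, under stated assumptions: `UnsupportedVersionSketch`, its method names, the integer version codes, and the `minSupportedVersion` parameter are all hypothetical, since the thread does not say which versions are affected. The only point illustrated is "return null and notify instead of throwing".

```csharp
using System.Collections.Generic;

// Hypothetical illustration; not the real ACadSharp reader.
public static class UnsupportedVersionSketch
{
    // Returns null instead of throwing when the file version is not supported.
    public static string ReadMultiLeaderOrNull(int fileVersion, int minSupportedVersion)
    {
        if (fileVersion < minSupportedVersion)
            return null;
        return "MULTILEADER"; // placeholder for the entity that would be built
    }

    // The surrounding loop records a notification for skipped objects and keeps going.
    public static List<string> ReadAll(int fileVersion, int minSupportedVersion,
                                       int objectCount, List<string> notifications)
    {
        var read = new List<string>();
        for (int i = 0; i < objectCount; i++)
        {
            string entity = ReadMultiLeaderOrNull(fileVersion, minSupportedVersion);
            if (entity == null)
            {
                notifications.Add("object " + i + " not read: unsupported version");
                continue;
            }
            read.Add(entity);
        }
        return read;
    }
}
```

With this shape, an unsupported object costs one notification, while the rest of the file is still read.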

mme1950 commented 1 year ago

OK, this is how I implemented it.