xarial / xcad

Framework for developing CAD applications for SOLIDWORKS, including add-ins, stand-alone applications, macro features, property manager pages, etc.
https://xcad.net
MIT License

Trouble writing more than about 6,000 characters to third party storage #86

Closed flycast closed 2 years ago

flycast commented 2 years ago

I have been fighting this for two days now. I thought it was a serialization issue, but now I do not think so. I have been working to get serialization of a complex entity going with ExtendedXmlSerializer, and I think I have it working. I am serializing a Dictionary<List<byte[]>, IMyCustomInterface>. When I started having the issue, I worked progressively from smaller to larger objects, serializing and deserializing each one:

  1. A single character
  2. A string
  3. A byte array
  4. A list of byte arrays
  5. My list of all the keys, of type List<byte[]>.
  6. A single IMyCustomInterface.
  7. A simpler, smaller Dictionary<string, string>.
  8. My full Dictionary<List<byte[]>, IMyCustomInterface>.

Once I serialize and save enough content to the stream to push the serialized result past about 6,000 characters, I start getting errors on deserialization. The message indicates that the XML is malformed. The errors change depending on what I have serialized.

When I examine the text that is read back, I see that it has obviously been truncated before it reaches the end of the XML.

My last test was to save a string of x characters to third party storage. Doing this, I can reliably serialize and save about 6,000 characters and deserialize them again on load. It fails somewhere around 6,087 characters.

I do not get any exceptions on saving, only on opening. I get the same issue whether I use ExtendedXmlSerializer or XmlSerializer. I would love to use ExtendedXmlSerializer, though, because it makes serializing complex custom classes much easier.

My read code looks like this:

        private void HandleStorageReadAvailable(IXDocument doc)
        {
            using var storage = doc.TryOpenStorage("XXXXXX", AccessType_e.Read);
            {

                if (storage is null) return;
                using var str = storage.TryOpenStream("EntityDatabase", true);
                try
                {
                    var xmlSer = new XmlSerializer(typeof(string));
                    var temp = (string)xmlSer.Deserialize(str);
                    Debug.WriteLine("Successfully read from EntityDatabase");
                    Debug.WriteLine("");
                    Debug.WriteLine(temp);
                    MessageBox.Show($"String length read:{temp.Length.ToString()}");
                }
                catch (Exception ex)
                {
                    MessageBox.Show("Read failed");
                    Debug.WriteLine(ex);
                }

            }
        }

My write code looks like this:

        private void HandleStorageWriteAvailable(IXDocument doc)
        {
            using IStorage storage = doc.OpenStorage("XXXXXX", AccessType_e.Write);
            {

                //Test
                using (Stream str = storage.TryOpenStream("EntityDatabase", true))
                {
                    string p = @"C:\Users\erics\Desktop\test7.xml";

                    string test  = new String('-', 6088);

                    try
                    {
                        var xmlSer = new XmlSerializer(typeof(string));

                        xmlSer.Serialize(str, test);
                        Debug.WriteLine("Successfully Wrote to EntityDatabase on Save");

                    }
                    catch (Exception ex)
                    {
                        Debug.WriteLine(ex);
                    }

                }
            }
        }

Is this a SOLIDWORKS limitation? A bug in xCAD? A serialization issue?

An example of the exceptions I get is below. The issue was that the XML was truncated:

Exception thrown: 'System.Xml.XmlException' in System.Xml.dll
Exception thrown: 'System.InvalidOperationException' in System.Xml.dll
System.InvalidOperationException: There is an error in XML document (2, 6106). ---> System.Xml.XmlException: Data at the root level is invalid. Line 2, position 6106.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlReader.ReadElementString()
   at System.Xml.Serialization.XmlSerializationPrimitiveReader.Read_string()
   at System.Xml.Serialization.XmlSerializer.DeserializePrimitive(XmlReader xmlReader, XmlDeserializationEvents events)
   at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
   --- End of inner exception stack trace ---
   at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
   at System.Xml.Serialization.XmlSerializer.Deserialize(Stream stream)
   at Dispensing.DocumentHandler.HandleStorageReadAvailable(IXDocument doc) in C:\Users\erics\source\repos\TestingEFClassLibrary\TestingEFClassLibrary\DocumentHandler.cs:line 163

artem1t commented 2 years ago

I think this is an issue with XmlSerializer itself. XmlSerializer is meant for serializing complex classes. You are trying to serialize a string as a single value, and I believe this is what confuses XmlSerializer. I believe that if you try to serialize the data into, for example, a file (nothing related to SOLIDWORKS or xCAD), you will have the same issue. Here is a similar issue: https://github.com/pwntester/ysoserial.net/issues/24

If you just want to serialize a string, simply use StreamWriter and StreamReader (https://stackoverflow.com/questions/2629612/writing-string-to-stream-and-reading-it-back-does-not-work).
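A minimal sketch of that suggestion follows. It uses a plain `Stream` and can be tried with a `MemoryStream`; the class and method names (`StringStreamSketch`, `WriteText`, `ReadText`) are made up for illustration and are not part of xCAD.

```csharp
using System.IO;
using System.Text;

public static class StringStreamSketch
{
    // Writes the text to the stream as UTF-8 without a byte order mark.
    public static void WriteText(Stream stream, string text)
    {
        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var writer = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true))
        {
            writer.Write(text);
        } // Disposing the writer flushes its buffer into the stream.
    }

    // Reads everything from the current position to the end of the stream.
    public static string ReadText(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

To try it, write `new string('-', 6088)` into a `MemoryStream`, set `Position` back to 0, and read it again; the returned string should have the same length.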

flycast commented 2 years ago

It may very well be the XmlSerializer; however, it also happens when I serialize a complex class. I was serializing a string just as a test.

Perhaps I wasn't clear in my first post. I started having issues serializing the complex class. Then I stepped back down to very simple data to troubleshoot. The very large string was just one test among many. I can serialize and deserialize my Dictionary<List<byte[]>, IMyCustomInterface> when it has between 1 and 10 elements. When it has 11 elements, it does not fail on saving or serializing, but on load it does not deserialize properly. The string I get when I deserialize is truncated XML.

Also, as I test from 1 to 10 elements in my dictionary, the resulting SolidWorks part file grows by about the same amount for each additional element. Once I start saving more than 10 elements, the file size stops growing after the save.

Also, when I serialize and save to a text file, I get the entire XML. The XML is not truncated in the text file. It is only when I serialize into the SolidWorks file that the truncation happens.

artem1t commented 2 years ago

I have just made a quick example to rule out the size limitation (please see below). It is working OK for me regardless of the size of the string. I have tried your original size of 6,088 and then changed to 100,000 and both scenarios work OK.

Just a few notes:

Also, feel free to send me an e-mail and we can arrange for a remote session so we can troubleshoot live.

ThirdPtyStoreLargeDataTest.zip

flycast commented 2 years ago

Thank you for the example. It did indeed work. I went back and tried mine, and it failed. I carefully examined it for differences. I removed a line or two of testing code that assigned values to variables, and it still didn't work. I then changed the name of the stream, and it worked. I tried it on two files.


flycast commented 2 years ago

I also changed your test program to use "Sharpline.Dispense" for IStorage and "EntityData" for the stream. It had the same issue. This is in a file that I have previously written data to using those names.

flycast commented 2 years ago

Please try this file using "Sharpline.Dispense" for IStorage and "EntityDatabase" for the stream name: BlockResearch2.zip.

artem1t commented 2 years ago

Thank you for the sample model. I can reproduce this, but only with your model. That is, if I open this document, I see the exception on deserialization. Even if I save the document, thus overwriting the stream, the issue is still there. But...

        private void OnStorageWriteAvailable(IXDocument doc)
        {
            using (var storage = doc.OpenStorage(STORAGE_NAME, AccessType_e.Write))
            {
                storage.RemoveSubElement(STREAM_NAME);

                using (var stream = storage.TryOpenStream(STREAM_NAME, true))
                {

So there is some corruption in this particular stream. I have not experienced this before, but I have some ideas about what could cause it:

I use 3rd party storage quite extensively in my applications, in a very similar manner to your scenario, and haven't experienced issues, so I am quite keen to investigate the cause of this issue and address it in xCAD.

flycast commented 2 years ago

I am still having issues with writing to third party storage. I can confirm one that seems to be repeatable. I start my add-in in debugging mode from Visual Studio running as an Admin user. I then create a brand new document, create a sketch, and add a single line to the sketch. I then build my "entity database" so that there is third party data to save. I save, and the HandleStorageWriteAvailable(IXDocument doc) handler runs. I get the following error either for the first few minutes or on the first save, I am not sure which:

Exception thrown: 'System.Runtime.InteropServices.COMException' in Xarial.XCad.Toolkit.dll

The exception:

System.Runtime.InteropServices.COMException (0x80030005): Access Denied. (Exception from HRESULT: 0x80030005 (STG_E_ACCESSDENIED))
   at System.Runtime.InteropServices.ComTypes.IStream.Read(Byte[] pv, Int32 cb, IntPtr pcbRead)
   at Xarial.XCad.Toolkit.Data.ComStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.StreamReader.ReadBuffer()
   at System.IO.StreamReader.ReadToEnd()
   at ExtendedXmlSerializer.ExtensionModel.Xml.InstanceFormatter.Get(Object parameter)
   at ExtendedXmlSerializer.ExtensionMethodsForSerialization.Serialize(IExtendedXmlSerializer this, Stream stream, Object instance)
   at Dispensing.DocumentHandler.HandleStorageWriteAvailable(IXDocument doc)

If I wait a few minutes and save again, the exception does not happen. I don't know whether that is because I waited or because it was the second save.

I have been debugging my application by starting SolidWorks, creating brand new files, and then saving. Even with brand new files, I still have sporadic and unpredictable issues with truncated XML when I create a new file, unload it, and then open and read the file. I get no exceptions when saving. I do get exceptions saying that the XML I read was truncated.

I thought the test with 100,000 characters was good, but I started having issues again when I went back to serializing my actual class.

Lastly, I have been trying to find a way to inspect the SolidWorks file after it is written, to see what was actually written. It is not text, and I cannot determine what format it is in. If I could view the actual file contents, I think it would be easier to tell where to look next. Do you know of a way to view the third party storage area in plain text?

EDIT: After going back and testing some more, I can say that the first save throws the exception shown above in this post. The second save always succeeds.

flycast commented 2 years ago

More info. I changed the way I am reading to read the entire stream byte by byte and concatenate them into a string:

while (!textReader.EndOfStream)
{
    contents += textReader.Read() + ", ";
}

After doing this the variable contents holds a huge string of text. The text has 234612 commas meaning that the stream is about the same number of bytes long. When I tried to decode the text that looks like this (small example):

60, 63, 120, 109, 108, 32, 118, 101, 114, 115, 105, 111, 110, 61, 34, 49, 46, 48, 34, 32, 101, 110, 99, 111, 100, 105, 110, 103, 61, 34, 117, 116, 102, 45, 56, 34, 63, 62, 60, 69, 110, 116, 105, 116, 121, 68, 97, 116, 97

I run into an illegal byte value of 65279 at around 45,000 characters. This value appears numerous times through the text after some repeating zeros:

0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 65279, 60, 63, 120, 109, 108, 32, 118, 101, 114, 115, 105, 111, 110, 61, 34, 49, 46, 48, 34, 32, 101, 110, 99, 111, 100, 105, 110, 103, 61, 34, 117, 116, 102

Every time it appears, it looks like it is followed by old data that I saved previously. Most of those instances of old data are complete XML with no items.
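For context, 65279 is 0xFEFF, the Unicode byte order mark (BOM) character, and `TextReader.Read()` returns character codes rather than raw bytes. The sketch below is one way to see where such values can come from: it stores two small UTF-8 documents, each preceded by a BOM, back to back in a `MemoryStream` and reads them with a `StreamReader`. The `BomSketch` class and the `<a/>` and `<b/>` payloads are invented for illustration; it does not touch SOLIDWORKS or xCAD.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BomSketch
{
    // Returns the character codes a StreamReader yields for two UTF-8
    // documents, each written with a BOM, stored back to back in one stream.
    public static List<int> ReadCodes()
    {
        var utf8WithBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);
        using (var stream = new MemoryStream())
        {
            foreach (var doc in new[] { "<a/>", "<b/>" })
            {
                byte[] preamble = utf8WithBom.GetPreamble(); // EF BB BF
                stream.Write(preamble, 0, preamble.Length);
                byte[] body = utf8WithBom.GetBytes(doc);
                stream.Write(body, 0, body.Length);
            }

            stream.Position = 0;
            var codes = new List<int>();
            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                int c;
                // Read() returns one character code at a time, or -1 at the end.
                while ((c = reader.Read()) != -1)
                {
                    codes.Add(c);
                }
            }
            return codes;
        }
    }
}
```

The reader skips only the BOM at the very start of the stream, so the result is 60, 97, 47, 62, 65279, 60, 98, 47, 62. A 65279 in the middle of the output therefore suggests the start of another, separately written document.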

Looking deeper, my complete data XML runs from byte 1 to byte 14,733. What I thought was truncated XML was actually my full XML followed by old data, as if garbage had been appended to the end of my data. Here is a clip of the ending tag </EntityDatabase> followed by the leftover old data, starting with lserializer.github.io/v2:

</EntityDatabase>lserializer.github.io/v2" exs:item="unsignedByte">sDYAAAEAAAD//v8AAAAAABoAAAA=</Array><Array xmlns:exs="https://extendedxmlserializer.github.io/v2" exs:item="unsign

I don't know how SolidWorks handles the writing of the stream. Is it supposed to clear all the old data and write only the new data? It is clear that old data is still in there, and it seems that the new data is being written on top of the old data, starting at the front.
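The effect described above can be reproduced with any .NET stream that is overwritten from position 0 without being truncated. The sketch below uses a `MemoryStream` as a stand-in; whether the stream returned by xCAD supports `SetLength` is not established in this thread, and the `StaleTailSketch` name is hypothetical.

```csharp
using System.IO;
using System.Text;

public static class StaleTailSketch
{
    // Writes oldText, rewinds, then writes newText over it.
    // When truncateFirst is false, any old bytes beyond the new data survive.
    public static string Overwrite(string oldText, string newText, bool truncateFirst)
    {
        using (var stream = new MemoryStream())
        {
            byte[] oldBytes = Encoding.UTF8.GetBytes(oldText);
            stream.Write(oldBytes, 0, oldBytes.Length);

            stream.Position = 0;
            if (truncateFirst)
            {
                stream.SetLength(0); // discard the previous content
            }

            byte[] newBytes = Encoding.UTF8.GetBytes(newText);
            stream.Write(newBytes, 0, newBytes.Length);

            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

`Overwrite("<old>0123456789</old>", "<new/>", false)` returns `<new/>123456789</old>`, which has the same shape as the clip above: a complete new document followed by the tail of the old one. With `truncateFirst` set to `true` the result is just `<new/>`.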

I suspect at this point that the deserializing is reading past the ending tag. As a workaround I guess I could:

  1. Read the stream into text.
  2. Search for the ending tag.
  3. Truncate the text after the ending tag.
  4. Deserialize the resulting string into my object.
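The steps above could be sketched roughly as follows, using XmlSerializer for the final step. `TrimAndDeserializeSketch` and `SampleData` are hypothetical names; `SampleData` stands in for the real entity database type. The sketch assumes the root element name is known, that the root is not self-closing, and that no element with the same name is nested inside it.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the real data class.
public class SampleData
{
    public string Name { get; set; }
}

public static class TrimAndDeserializeSketch
{
    // Keeps everything up to and including the first closing root tag.
    public static string TrimAfterRoot(string text, string rootName)
    {
        string closingTag = "</" + rootName + ">";
        int index = text.IndexOf(closingTag, StringComparison.Ordinal);
        if (index < 0)
        {
            throw new InvalidDataException("Closing tag not found: " + closingTag);
        }
        return text.Substring(0, index + closingTag.Length);
    }

    // Deserializes only the trimmed part of the text.
    public static T Deserialize<T>(string text, string rootName)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(TrimAfterRoot(text, rootName)))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

The first closing tag is used rather than the last because the leftover data can itself contain complete older documents with their own closing tags.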

A really, really ugly workaround.

I'll be contacting ExtendedXMLSerializer folks to see what their thoughts are.

artem1t commented 2 years ago

I would recommend deleting the stream from the storage before writing (I think this is a more effective workaround). If you have time, we can arrange an online session next week to troubleshoot.

artem1t commented 2 years ago

This might have something to do with the IStorage implementation and the way it manages streams. If I can reproduce it from a new part, I will be able to investigate further.

flycast commented 2 years ago

How would I delete the stream before writing?

artem1t commented 2 years ago

Just call RemoveSubElement on the storage and pass the stream name to delete the stream.

flycast commented 2 years ago

Artem - Thanks for your help. After extensive testing today (and a little education) I understand what is happening. Deleting the stream before writing did help quite a bit. I have narrowed this down to an issue with ExtendedXMLSerializer.

Longer version: I am using Visual Studio and upgraded to 2022. My settings for the Output -> Debug window had changed to show Exception Messages. I had honestly never seen this before. I was seeing exceptions in the Debug window that were being reported from xCAD. The exceptions were handled inside xCAD, though. Since I was having issues with ExtendedXMLSerializer and the exception was being reported from xCAD, I thought the problem was in xCAD. I think the xCAD exception occurred when it was trying to access a stream that was not there yet. I am using doc.TryOpenStorage, so the exception happened and then xCAD handled it.

All good now. Sorry for bothering you.