dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

BinaryFormatter Deserialize Datatable throw "Type * is not deserializable." Exception #35346

Closed DarkGraySun closed 4 years ago

DarkGraySun commented 4 years ago

I have an application on .NET Framework 4.7 that serializes a DataTable to a file in binary format. When I try to deserialize it in .NET Core 3.1, the exception "Type 'System.String' is not deserializable." is thrown.

Here is the sample code.

Serialization in .NET Framework 4.7:

    static void Main(string[] args)
    {
        var TestTable = new DataTable("TestTable");
        TestTable.Columns.Add(new DataColumn() { DataType = typeof(string), ColumnName = "TestColumn" });
        TestTable.RemotingFormat = SerializationFormat.Binary;
        var r = TestTable.NewRow();
        r["TestColumn"] = "Test";
        TestTable.Rows.Add(r);

        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter binFormat = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        using (var memstream = new System.IO.MemoryStream())
        {
            binFormat.Serialize(memstream, TestTable);
            System.IO.File.WriteAllBytes("TestTableSerialized", memstream.ToArray());
        }
    }

Deserialization in .NET Core 3.1:

    static void Main(string[] args)
    {
        using (var memstream = new System.IO.MemoryStream(System.IO.File.ReadAllBytes("TestTableSerialized")))
        {
            System.Runtime.Serialization.Formatters.Binary.BinaryFormatter binFormat = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
            var table = (DataTable)binFormat.Deserialize(memstream);
        }
    }

danmoseley commented 4 years ago

This seems to mean that a string is getting serialized as a UnityType. Those are for types that never have more than one instance. We're not expecting that except for DBNull. It would need debugging on the .NET Framework side to understand why.

   at System.UnitySerializationHolder.GetRealObject(StreamingContext context)
   at System.Runtime.Serialization.ObjectManager.ResolveObjectReference(ObjectHolder holder)
   at System.Runtime.Serialization.ObjectManager.DoFixups()
   at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(BinaryParser serParser, Boolean fCheck)
   at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, Boolean check)

danmoseley commented 4 years ago

Does changing to TestTable.RemotingFormat = SerializationFormat.Xml; work for you? That uses a completely different mechanism so it won't hit this problem.

It is possible, but unlikely I think, that this is a duplicate of https://github.com/dotnet/runtime/issues/31611

GrabYourPitchforks commented 4 years ago

It means that somebody tried to deserialize typeof(string) (that is, the System.Type corresponding to System.String). In .NET Core, we don't allow Type instances to be deserialized via BinaryFormatter. Since presumably DataTable relies on this I wonder if using SerializationFormat.Binary is supported in general in .NET Core?

danmoseley commented 4 years ago

Ah - thank you @GrabYourPitchforks . Yes indeed:

https://referencesource.microsoft.com/#System.Data/fx/src/data/System/Data/DataTable.cs,363

    info.AddValue(String.Format(formatProvider, "DataTable.DataColumn_{0}.DataType", i), Columns[i].DataType);

This means that you're correct, we cannot support SerializationFormat.Binary. And this issue is won't-fix.

Also, given this: https://referencesource.microsoft.com/#System.Data/fx/src/data/System/Data/DataTable.cs,238

we can probably mark #31611 as won't-fix as well, since we will never encounter SerializationFormat.Xml in the blob. Do you agree?

GrabYourPitchforks commented 4 years ago

I think we still need to consider fixing https://github.com/dotnet/runtime/issues/31611. That issue discusses using .NET Core to serialize a DataSet, then passing that payload to Full Framework for deserialization. The fix should be a one-liner: slap a [TypeForwardedFrom] attribute on the SerializationFormat enum definition.

Or, to put it another way: 31611 talks about passing data from Core to Full Framework. This issue talks about passing data from Full Framework to Core. Depending on how we want to approach things we don't need to resolve both the same way.

danmoseley commented 4 years ago

Ah, of course. Well, we clearly have to close this one. @DarkGraySun hopefully you can use Xml serialization, or some other approach; .NET Core does not support binary serializing Type objects.

DarkGraySun commented 4 years ago

Thank you for your prompt reply. Unfortunately XML doesn't work for me. It has a significantly higher resource consumption (RAM). In a test with 2 million rows, the consumption was 500 MB in binary mode and 2 GB in XML mode.

danmoseley commented 4 years ago

  1. Can you share your scenario? Is this a matter of Core on server and Framework on app?
  2. Also, do you require serialization to go in the opposite direction as well?
  3. What are the data types of your columns? I'm wondering whether they are all simple types like string and int.

DarkGraySun commented 4 years ago

  1. The scenario: There is a 10-year-old .NET Framework server app that sends compressed, binary-serialized DataTables to the client apps. The client apps use this data as the basis for their calculations.

     I now want to build a new client app with .NET Core without changing the existing client apps and server app. (Therefore I would like to avoid switching to XML.)

  2. At the moment I only need the .NET Framework to .NET Core direction.

  3. The data includes article data, customer data, and price data, so columns with int32, int64, string, decimal, date, etc. are required.

danmoseley commented 4 years ago

@DarkGraySun thanks. The problem we have is that binary serialization done with BinaryFormatter (what is used here) is intrinsically fragile and prone to security bugs. We have brought it forward to .NET Core, but limited the types it supports to a relatively small list that we can control more easily. Type is not and cannot be on that list. It's unfortunate if the column types are actually binary serializable (e.g., int, string) because if the implementation had simply serialized their names, it's possible something could have been done. That's why I was curious, although it doesn't help solve this.

Unfortunately I think your options are either to change the existing client/server apps (e.g., to XML) or to use .NET Framework in this case. If payload size is an issue, perhaps it could be compressed, e.g. with GZipStream, before transmission.
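To illustrate the remark about serializing column type names instead of Type objects, here is a minimal sketch of what a hand-rolled schema format could look like. This is not how DataTable or BinaryFormatter behaves; the class name ColumnSchemaCodec and the contents of the allowlist are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

// Hypothetical helper, not part of System.Data.
public static class ColumnSchemaCodec
{
    // Assumed allowlist: only these names are ever mapped back to a Type,
    // so no arbitrary type is loaded from the payload.
    private static readonly Dictionary<string, Type> AllowedTypes = new Dictionary<string, Type>
    {
        { "System.String", typeof(string) },
        { "System.Int32", typeof(int) },
        { "System.Int64", typeof(long) },
        { "System.Decimal", typeof(decimal) },
        { "System.DateTime", typeof(DateTime) }
    };

    // Writes the table name, then each column as (name, type name).
    public static void WriteSchema(DataTable table, BinaryWriter writer)
    {
        writer.Write(table.TableName);
        writer.Write(table.Columns.Count);
        foreach (DataColumn column in table.Columns)
        {
            writer.Write(column.ColumnName);
            writer.Write(column.DataType.FullName);
        }
    }

    // Rebuilds an empty table; rejects any type name outside the allowlist.
    public static DataTable ReadSchema(BinaryReader reader)
    {
        var table = new DataTable(reader.ReadString());
        int columnCount = reader.ReadInt32();
        for (int i = 0; i < columnCount; i++)
        {
            string columnName = reader.ReadString();
            string typeName = reader.ReadString();
            if (!AllowedTypes.TryGetValue(typeName, out Type type))
            {
                throw new InvalidDataException("Column type not allowed: " + typeName);
            }
            table.Columns.Add(columnName, type);
        }
        return table;
    }
}
```

Because the reader only looks names up in a fixed dictionary, the payload never causes a Type to be resolved by name at run time.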

GrabYourPitchforks commented 4 years ago

If the case we're talking about is a .NET Framework server sending data to a .NET Core client, and if the rows contain primitive data types, this is something that might have an immediate workaround by using BinaryReader and BinaryWriter. For example (pseudo-code):

// serializing
public void Serialize(DataTable myDataTable, BinaryWriter writer)
{
    foreach (DataRow row in myDataTable.Rows)
    {
        writer.Write((string)row["customerName"]);
        writer.Write((int)row["itemQuantity"]);
        /* etc. */
    }
}

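For types like decimal you could use decimal.ToString(CultureInfo.InvariantCulture) or similar to turn it into something that the writer can understand; decimal.Parse with the same culture can reverse the transformation on the reading side. (Avoid the "C" currency format here: it rounds the value and adds a currency symbol, so it does not round-trip cleanly.) BinaryWriter also has a Write(decimal) overload, paired with BinaryReader.ReadDecimal, which avoids the string conversion entirely.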

The BinaryWriter format is fairly compact. It can also be further run through a compression stream if you want to gzip it. You can read the data back by using BinaryReader on the client.

This would involve a code change on the server to support this new endpoint, so there is some work involved. But it should be a way to immediately unblock the ability to make forward progress. It also removes the dependency on the fragile / unsafe BinaryFormatter code paths as Dan mentioned earlier.
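A slightly fuller sketch of the approach above, covering both the writing and the reading side, might look like the following. The class name OrderTableCodec, the table name "Orders", and the columns unitPrice and orderDate are assumptions added for illustration (customerName and itemQuantity come from the pseudo-code). It writes the row count first so the reader knows when to stop, and it does not handle DBNull values.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical codec for a table with a fixed, known set of columns.
public static class OrderTableCodec
{
    public static void Serialize(DataTable table, BinaryWriter writer)
    {
        // Row count first, so the reader knows how many rows follow.
        writer.Write(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            writer.Write((string)row["customerName"]);
            writer.Write((int)row["itemQuantity"]);
            writer.Write((decimal)row["unitPrice"]);
            // DateTime has no Write overload; ToBinary() yields a long.
            writer.Write(((DateTime)row["orderDate"]).ToBinary());
        }
    }

    public static DataTable Deserialize(BinaryReader reader)
    {
        // The schema is fixed in code instead of being read from the payload.
        var table = new DataTable("Orders");
        table.Columns.Add("customerName", typeof(string));
        table.Columns.Add("itemQuantity", typeof(int));
        table.Columns.Add("unitPrice", typeof(decimal));
        table.Columns.Add("orderDate", typeof(DateTime));

        int rowCount = reader.ReadInt32();
        for (int i = 0; i < rowCount; i++)
        {
            // Fields must be read in exactly the order they were written.
            table.Rows.Add(
                reader.ReadString(),
                reader.ReadInt32(),
                reader.ReadDecimal(),
                DateTime.FromBinary(reader.ReadInt64()));
        }
        return table;
    }
}
```

The same Serialize method can run on the .NET Framework server and the same Deserialize method on the .NET Core client, since BinaryWriter and BinaryReader are available on both.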

DarkGraySun commented 4 years ago

Thank you for your effort. I was hoping that "BinaryFormatter.SurrogateSelector" or "BinaryFormatter.Binder" would offer a possibility, but that doesn't work. I will now add a new endpoint to the server that works with compressed XML. Finally, when the server is switched to .NET Core, I will switch back to binary.
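For readers in a similar situation, a compressed-XML exchange like the one described here could be sketched roughly as follows. This is not the code from the thread; the class name CompressedXmlTable is an assumption. It relies on DataTable.WriteXml with XmlWriteMode.WriteSchema, DataTable.ReadXml, and GZipStream, all of which exist on both .NET Framework and .NET Core.

```csharp
using System;
using System.Data;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for a compressed-XML endpoint.
public static class CompressedXmlTable
{
    // Note: WriteXml requires the table to have a TableName.
    public static byte[] ToCompressedXml(DataTable table)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal))
            {
                // WriteSchema embeds column names and types in the payload,
                // so the reader does not need the schema in advance.
                table.WriteXml(gzip, XmlWriteMode.WriteSchema);
            }
            // ToArray still works after the GZipStream has closed the buffer.
            return buffer.ToArray();
        }
    }

    public static DataTable FromCompressedXml(byte[] payload)
    {
        using (var buffer = new MemoryStream(payload))
        using (var gzip = new GZipStream(buffer, CompressionMode.Decompress))
        {
            var table = new DataTable();
            table.ReadXml(gzip);
            return table;
        }
    }
}
```

Streaming the XML straight through GZipStream avoids holding the uncompressed document in memory as one string, though the memory cost of XML parsing reported earlier in the thread may still apply.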