zzzprojects / EntityFramework-Extensions

Entity Framework Bulk Operations | Improve Entity Framework performance with Bulk SaveChanges, Insert, Update, Delete and Merge for SQL Server, SQL Azure, SQL Compact, MySQL and SQLite.
https://entityframework-extensions.net

Bulk SaveChanges/Insert cut off byte array #292

Open Jagailo opened 5 years ago

Jagailo commented 5 years ago

Description

Hello @JonathanMagnan,

When adding more than 2 files (PDFs in my case) to the database as byte[], BulkSaveChanges truncates the arrays. This is not compression but truncation, which can be seen by downloading the file from the database and inspecting it. I have only observed this with BulkInsert() and BulkSaveChanges(); it possibly also happens with BulkUpdate(), but I didn't check. With a normal SaveChanges(), the files are stored completely. The column uses the MEDIUMBLOB type.

Size in the database of copies of the same file saved in different ways: [screenshot]

Further technical details

JonathanMagnan commented 5 years ago

Hello @Jagailo ,

Thank you for reporting,

My developer will look at it.

Best Regards,

Jonathan

JonathanMagnan commented 5 years ago

Hello @Jagailo ,

Thank you again for reporting but unfortunately, my developer is not able to reproduce it.

Is it possible to help us discover what we are missing to reproduce the same issue you see? This is something we would definitely want to fix.

If you could provide a runnable solution, it would be best.

Here is one of the examples he made:

using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations.Schema;
using System.Text;
using Microsoft.EntityFrameworkCore;
using MySql.Data.MySqlClient;

namespace Lab.EFCore20.MySqlPomelo
{
    class Request_MediumBlogByteArray
    {
        public static void Execute()
        {
            // Create DB
            using (var context = new EntityContext())
            {
                context.Database.EnsureDeleted();
                context.Database.EnsureCreated();
            }

            // CLEAN  
            using (var context = new EntityContext())
            {
                context.EntitySimples.RemoveRange(context.EntitySimples);

                context.SaveChanges();
            }
            // Build a ~600,000-byte payload to store in each row
            var byteArrays = new List<byte>();

            for (int i = 0; i < 300000; i++)
            {
                byteArrays.Add(0x20);
                byteArrays.Add(0x30);
            }
            // SEED  
            using (var context = new EntityContext())
            {
                for (int i = 0; i < 3; i++)
                {
                    context.EntitySimples.Add(new EntitySimple { ColumnInt = i, Bytes = byteArrays.ToArray() });
                }

                context.SaveChanges();
            }

            // TEST  
            using (var context = new EntityContext())
            {
                List<EntitySimple> list = new List<EntitySimple>();
                list.Add(new EntitySimple { ColumnInt = 10 }); 
                list.Add(new EntitySimple { ColumnInt = 12, Bytes = byteArrays.ToArray() }); 
                list.Add(new EntitySimple { ColumnInt = 12, Bytes = byteArrays.ToArray() });

                context.BulkInsert(list, options => options.BatchSize = 1);
            }

            // TEST  
            using (var context = new EntityContext())
            {
                for (int i = 0; i < 3; i++)
                {
                    context.EntitySimples.Add(new EntitySimple { ColumnInt = i, Bytes = byteArrays.ToArray() });
                }

                context.BulkSaveChanges();
            }
        }

        public class EntityContext : DbContext
        {
            public EntityContext()
            {
            }

            public DbSet<EntitySimple> EntitySimples { get; set; }

            protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
            {
                optionsBuilder.UseMySql(new MySqlConnection(My.Connection));

                base.OnConfiguring(optionsBuilder);
            }
        }

        public class EntitySimple
        {
            public int ID { get; set; }
            public int ColumnInt { get; set; }
            public string ColumnString { get; set; }

            [Column(TypeName = "mediumblob")]
            public byte[] Bytes { get; set; }
        }
    }
}
Jagailo commented 5 years ago

Sorry for not doing this right away, I was busy. Here is a solution that reproduces the bug: BulkIssue-292.zip

JonathanMagnan commented 5 years ago

Hello @Jagailo ,

Thank you for the project,

We tried on several developer machines and in different environments, but we keep getting the right result:

[screenshot]

The first 2 rows use SaveChanges, the next 2 rows SaveChangesAsync, and the last 2 rows BulkSaveChangesAsync.

Everything works as expected on our side.

Do you have an idea of what we could miss for reproducing this issue?

Jagailo commented 5 years ago

I'm so sorry, I was sure that I had mentioned the database type, but it turns out I hadn't. Have you tried to reproduce the bug with MariaDB 10.3.11 or 10.3.13?

JonathanMagnan commented 5 years ago

Still working with MariaDB 10.2.14 ;(

However, after upgrading to MariaDB 10.3.12 we now get a similar error.

We don't have any custom code that depends on the MariaDB version, so this is currently puzzling. We will investigate why this is happening in version 10.3.x.

At least we can now progress on the issue ;)

JonathanMagnan commented 5 years ago

Hello @Jagailo ,

We thought at first about reporting this issue to MariaDB, but starting from v10.3 they made some heavy breaking changes, and I believe it now works as intended when selecting data from a derived table (a subquery built with SELECT ... UNION).

Someone else reported a similar problem, but with MySQL v8.x and decimal values.

We are currently making some heavy modifications to our code to use CAST in our derived table. With the newest versions of MariaDB and MySQL, the column type gets lost in the derived table, which leads to this error; making the type explicit with CAST will fix this kind of issue.
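As a rough illustration of the failure mode described above (a hypothetical sketch, not the exact SQL the library generates; the table, column, and parameter names are borrowed from the repro project):

```sql
-- Bulk operations stage incoming rows through a derived table built from
-- SELECT ... UNION ALL. On MariaDB 10.3+ (and MySQL 8.x), the type inferred
-- for a derived column can be narrower than the target column, so a long
-- byte[] value may be silently cut off:
INSERT INTO EntitySimples (ColumnInt, Bytes)
SELECT d.ColumnInt, d.Bytes
FROM (
    SELECT 10 AS ColumnInt, @p0 AS Bytes
    UNION ALL
    SELECT 12 AS ColumnInt, @p1 AS Bytes
) AS d;

-- Making the type explicit with CAST preserves the full value:
INSERT INTO EntitySimples (ColumnInt, Bytes)
SELECT d.ColumnInt, d.Bytes
FROM (
    SELECT 10 AS ColumnInt, CAST(@p0 AS BINARY) AS Bytes
    UNION ALL
    SELECT 12 AS ColumnInt, CAST(@p1 AS BINARY) AS Bytes
) AS d;
```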

A fix should be available next week.