zzzprojects / Bulk-Operations

C# SQL Bulk Operations | High-performance C# bulk insert, update, delete and merge for SQL Server, SQL Azure, SQL Compact, MySQL, and SQLite.
https://bulk-operations.net
142 stars 36 forks source link

Strip Characters from Text column #70

Closed JenniferBlau-Mattang closed 3 years ago

JenniferBlau-Mattang commented 3 years ago

I am working with a Bulk insert to load data from a legacy system into a postgres db. The file format has null terminated strings which postgres does not support. Error message from postgres is "ERROR: invalid byte sequence for encoding “UTF8”: 0x00"

I need to strip the null bytes from the text during the bulk insert. I have been attempting to accomplish this with the source value factor, but have had no luck. Do you have any suggestions?

Bulk operation Code:

using (var bulk = new BulkOperation(conn))
{
    try
    {
        bulk.Connection = conn;
        bulk.AutoMap = AutoMapType.ByName;
        bulk.BatchSize = 10000;
        bulk.BatchTimeout = 360;
        bulk.Provider = ProviderType.PostgreSql;
        bulk.DestinationSchemaName = "public";
        bulk.DestinationTableName = tableName.ToLower();
        bulk.CaseSensitive = CaseSensitiveType.DestinationInsensitive;
        bulk.ColumnMappings = BuildColumnMapping(columns);
        bulk.AutoTruncate = true;
        bulk.DataSource = dataReader;
        bulk.UseLogDump = true;
        bulk.LogDump = sb;

        bulk.BulkInsert();

        lbMessages.Items.Add("Insert Data Completed Successfully for Table: " + tableName);
    }
    catch (SqlException ex)
    {
        lbMessages.Items.Add($"LogDump:{sb.ToString()}");
        lbMessages.Items.Add("SQL Exception:" + ex.Message + ", Stack Trace:" + ex.StackTrace);
    }
    catch (Exception ex2)
    {
        lbMessages.Items.Add($"LogDump:{sb.ToString()}");
        lbMessages.Items.Add("Exception:" + ex2.Message + ", Stack Trace:" + ex2.StackTrace);
    }
    conn.Close();
    conn.Dispose();

}

Build Column mapping

    if(dataType == typeof(System.String))
    {
        colMap = new ColumnMapping();
        colMap.SourceName = columName.ToString();
        colMap.DestinationName = columName.ToString();
        colMap.SourceValueFactory = x => {
            var r = (IDataReader)x;
            var stringValue = r.GetString(r.GetOrdinal($"{columName}"));
            var value = stringValue.Replace(Convert.ToChar(0x00).ToString(), string.Empty).Replace('\u0000'.ToString(), "");
            return value;
        };
    }
    else
    {

        colMap = new ColumnMapping();
        colMap.SourceName = columName.ToString();
        colMap.DestinationName = columName.ToString();
        colMap.DefaultValueResolution = DefaultValueResolutionType.Null;

    }
    mappings.Add(colMap);
}
JonathanMagnan commented 3 years ago

Hello @jennifer-bitwerx ,

My suggestion would probably have been exactly/similar to what you did.

At this point, you specify yourself the value so this 0x00 should no longer be part of the value you insert.

Is it possible for you to provide a runnable project with the minimum code to reproduces the issue? We will not be able to reproduce it otherwise since your fix should work so there is something else that's currently missing which we are not aware.

If you wish, you can send the project in private here: info@zzzprojects.com

Best Regards,

Jon

JonathanMagnan commented 3 years ago

Hello @jennifer-bitwerx ,

Since our last conversation, we haven't heard from you!

Don't hesitate to provide a runnable project with the minimum code to reproduces the issue in private info@zzzprojects.com.

Looking forward to hearing from you,

Jon

JenniferBlau-Mattang commented 3 years ago

Hi! Sorry for the delay, I’m trying to trim out some code to send a sample in. In the meantime, I tested the same code going to sql server rather than postgres and did have success with that. Unfortunately that is not an acceptable solution.

Thanks Jennifer

From: Jonathan Magnan @.> Sent: Monday, July 19, 2021 10:45 AM To: zzzprojects/Bulk-Operations @.> Cc: Jennifer Blau @.>; Mention @.> Subject: Re: [zzzprojects/Bulk-Operations] Strip Characters from Text column (#70)

Hello @jennifer-bitwerxhttps://github.com/jennifer-bitwerx ,

Since our last conversation, we haven't heard from you!

Don't hesitate to provide a runnable project with the minimum code to reproduces the issue in private @.**@.>.

Looking forward to hearing from you,

Jon

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/zzzprojects/Bulk-Operations/issues/70#issuecomment-882607863, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AMK2WDJTNAPJEA4SOTKVZ7LTYQ3ATANCNFSM5AODY66A.

JonathanMagnan commented 3 years ago

No problem, take your time.

As soon as we will get your code we will look at it.

Best Regards,

Jon

JonathanMagnan commented 3 years ago

Hello @jennifer-bitwerx ,

This issue will be closed since we haven't heard back from you.

However, don't hesitate to contact us once you'll be able to provide more information.

Best Regards,

Jon