oracle / dotnet-db-samples

.NET code samples for Oracle database developers #OracleDotNet
http://otn.oracle.com/dotnet
MIT License
413 stars 191 forks source link

EF Core 8.0 - can't fetch data from db when special characters are used in a query #353

Closed ElAdriano closed 1 month ago

ElAdriano commented 8 months ago

Problem: In latest version of Oracle.EntityFrameworkCore lib can't query data from db when special characters are explicitly used in sent query.

Environment: Lib version : 8.21.121 .NET SDK version : 8.0.100 Database : Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production Version 19.3.0.0.0 (I'm connecting to existing one and don't run CreateDatabase)

Description: When use following code

var db = new OracleContext();

var entity = new B1TEST();
entity.Mynchar = "ęłćń";
db.B1TestRepo.Add(entity);
db.SaveChanges();

**var result = db.B1TestRepo.Where(e => e.Mynchar == "ęłćń").FirstOrDefault();**
Console.WriteLine(result.Mynchar);

I receive NullReferenceException because of null result returned by .Where clause - despite the fact that in database there was created record with Mynchar = "ęłćń" (shown below). image image

In logs, the following query (with corrupted special characters) was sent to db:

info: 10/01/2024 15:48:02.164 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (8ms) [Parameters=[], CommandType='Text', CommandTimeout='0']
      SELECT "b"."MYNCHAR"
      FROM "B1TEST" "b"
      WHERE "b"."MYNCHAR" = N'elcn'
      FETCH FIRST 1 ROWS ONLY

However there were also 2 things worth attention:

  1. Correct (uncorrupted) characters were present in DbCommand in interceptor image
  2. Logically the same query (comparison to the same value with special characters), but constructed slightly different (with a use of variable instead of explicit string) works (see example below). image

Steps to recreate

Table DDL (create table in database):

CREATE TABLE "XXORA"."B1TEST" 
(   
   "MYNCHAR" NCHAR(4) DEFAULT ' ' NOT NULL ENABLE,
    CONSTRAINT "B11TEST_PRIM_KEY" PRIMARY KEY ("MYNCHAR")
)

Full C# code (Program.cs):

using System.Data.Common;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Diagnostics;
using Microsoft.Extensions.Logging;

namespace DbNcharTest;

public class OracleCommandInterceptor : DbCommandInterceptor
{
    public override InterceptionResult<int> NonQueryExecuting(DbCommand command, CommandEventData eventData, InterceptionResult<int> result)
    {
        return result;
    }

    public override InterceptionResult<object> ScalarExecuting(DbCommand command, CommandEventData eventData, InterceptionResult<object> result)
    {
        return result;
    }

    public override InterceptionResult<DbDataReader> ReaderExecuting(DbCommand command, CommandEventData eventData, InterceptionResult<DbDataReader> result)
    {
        return result;
    }
}

public class B1TEST
{
    /// <summary>
    /// Unicode TEST of NCHAR(4)
    /// </summary>
    public virtual String Mynchar { get; set; }
}

public class OracleContext : DbContext
{
    public DbSet<B1TEST> B1TestRepo { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        string connectionString = "DATA SOURCE=db_connection_string";

        optionsBuilder.UseOracle(connectionString, _ =>
        {
            optionsBuilder.AddInterceptors(new OracleCommandInterceptor());
        }).LogTo(Console.WriteLine, LogLevel.Trace);
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<B1TEST>(entity =>
        {
            entity.HasKey(e => e.Mynchar).HasName("B1TEST_prim_key");
            entity.Property(e => e.Mynchar).IsRequired().IsUnicode(true).HasMaxLength(4).IsFixedLength();
            entity.ToTable("B1TEST");
            entity.Property(e => e.Mynchar).HasColumnName("MYNCHAR");
        });
    }
}

static class Program
{
    static void Main(string[] args)
    {
        var db = new OracleContext();

        string mynchar = "ęłćń";

        var entity = new B1TEST();
        entity.Mynchar = mynchar;
        db.B1TestRepo.Add(entity);
        db.SaveChanges();

        var result = db.B1TestRepo.Where(e => e.Mynchar == mynchar).FirstOrDefault();
        Console.WriteLine(result.Mynchar);
    }
}

csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.EntityFrameworkCore" Version="8.0.0"/>
    <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="8.0.0"/>
    <PackageReference Include="Oracle.EntityFrameworkCore" Version="8.21.121"/>
    <PackageReference Include="Microsoft.EntityFrameworkCore.Relational" Version="8.0.0"/>
  </ItemGroup>
</Project>
alexkeh commented 8 months ago

@ElAdriano Thanks for reporting this error and your thorough analysis. I've been able to reproduce the error as well and have created a bug (36179160) to have the ODP.NET dev team review and resolve the problem.

alexkeh commented 8 months ago

@ElAdriano After further investigation, the problem in the console sample program is that the encoding was not set to UTF8. If you add the following line to the sample program prior to retrieving the results, you will see the correct data returned.

Console.OutputEncoding = Encoding.UTF8;

This works for both the test cases that use

    var result = db.B1TestRepo.Where(e => e.Mynchar == "ęłćń").FirstOrDefault();

and

var result = db.B1TestRepo.Where(e => e.Mynchar == mynchar).FirstOrDefault();

Does this problem still reproduce for you with another method of retrieval?

ElAdriano commented 8 months ago

@alexkeh Yes, it still occurs. Encoding for console wasn't set, but I think that's just logging stuff. When it comes to fetching the data from db, results are still the same.

// this returns null
var result = db.B1TestRepo.Where(e => e.Mynchar == "ęłćń").FirstOrDefault();

// this works fine
string mynchar = "ęłćń";
var otherResult = db.B1TestRepo.Where(e => e.Mynchar == mynchar).FirstOrDefault();
alexkeh commented 8 months ago

@ElAdriano If I put a watch on on the result, I see it returning "ęłćń" in the code you see null getting returned. I had a colleague test this with his own instance. He also got the returned the valid results.

Do you see this issue also occur with EF Core 7?

What SQL do you see executed in log trace?

ElAdriano commented 8 months ago

@alexkeh It occurs for EF Core 7 as well. In log trace I see that proper query has been sent

Executed query:

SELECT "b"."MYNCHAR"
      FROM "B1TEST" "b"
      WHERE "b"."MYNCHAR" = N'ęłćń'
      FETCH FIRST 1 ROWS ONLY

but unfortunately it still returns null for that one specific case: image

alexkeh commented 8 months ago

@ElAdriano Perhaps the reason for the data mismatch is the DB configuration then. What are your DB's NLS_CHARACTERSET and NLS_NCHAR_CHARACTERSET parameter values?

ElAdriano commented 8 months ago

@alexkeh I see in debugger that NLS configuration for the connection looks like this: image but the bug still occurs, even after changing to proper language and territory: image

Also I think that saving entity to db wouldn't work if it was caused by improper NLS settings.

alexkeh commented 8 months ago

@ElAdriano I don't see the NLS_CHARACTERSET nor NLS_NCHAR_CHARACTERSET values. Can you run the following queries in your DB to return that info? select value from nls_database_parameters where parameter='NLS_NCHAR_CHARACTERSET' select value from nls_database_parameters where parameter='NLS_CHARACTERSET'

ElAdriano commented 8 months ago

I see that those parameters are set as: NLS_NCHAR_CHARACTERSET = "AL16UTF16" NLS_CHARACTERSET = "WE8ISO8859P15"

alexkeh commented 6 months ago

Base bug is 35982927.

phuchautea commented 1 month ago

@ElAdriano @alexkeh

In my case, I encountered a similar issue with MySQL using Vietnamese characters with diacritics, similar to your case with 'ęłćń' in Oracle. Adding charset=utf8mb4 to the ConnectionString resolved the issue, allowing MySQL to handle the characters correctly. I hope this will be helpful to you.

Environment:

alexkeh commented 1 month ago

After some more investigation by the Oracle NLS team, they've concluded that this issue is fundamentally an enhancement request, rather than a bug. ODP.NET would need to implement an NCHAR literal replacement feature.

With that said, this missing feature likely affects a small number of apps and has a simple workaround. If implemented for EF Core, it could have negative performance implications for many apps. Oracle EF Core would need to add U literal or UNISTR function whenever a string literal is used, rather than under specific Unicode conditions. That's a performance cost for every Oracle EF Core app that isn't affected by this issue.

Thus, the Oracle EF Core team has decided not to add this functionality for now. If we see more votes for this enhancement and reasons why the workaround can't be used, we can reconsider.

The existing workaround is to use a parameterized value for the string comparison instead of string literal in the LINQ. That is not hard to implement and a better query practice. Parameterized queries protect against query injection, improve performance, and enhance maintainability.

In EF Core 9, a new feature, EF.Parameter, has been introduced that forces EF to use a parameter even if the value is a literal. For example:

async Task<List<Post>> GetPostsForceParameter(int id)
    => await context.Posts
        .Where(
            e => e.Title == EF.Parameter(".NET Blog") && e.Id == id)
        .ToListAsync();

The SQL translation now contains a parameter for the ".NET Blog" string:

SELECT [p].[Id], [p].[Archived], [p].[AuthorId], [p].[BlogId], [p].[Content], [p].[Discriminator], [p].[PublishedOn], [p].[Title], [p].[PromoText], [p].[Metadata]
      FROM [Posts] AS [p]
      WHERE [p].[Title] = @__p_0 AND [p].[Id] = @__id_1

When Oracle EF Core supports EF Core 9, this will be an option to use string literals in LINQ.