dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.73k stars 3.17k forks source link

Nullability issues with `unhex()` #33864

Closed ranma42 closed 4 months ago

ranma42 commented 4 months ago

The definition of the Unhex method does not account for the case where the hex() SQLite function returns NULL. This can happen even for non-null inputs, if they contain unexpected characters and/or if the length of the input is not even.

An example program that showcases the bug is:

using System;
using System.Data;
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var db = new BloggingContext();

db.Database.EnsureDeleted();
db.Database.EnsureCreated();

var bad = db.Blogs
    .Select(x => new { x.Hex, Raw = EF.Functions.Unhex(x.Hex) })
    .Where(x => x.Hex != null)
    .ToList();

foreach (var y in bad) {
    Console.WriteLine($"Hex: >{y.Hex}<");
    Console.WriteLine($"Raw: {y.Raw?.Length} bytes\n");
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options
            .LogTo(Console.WriteLine, Microsoft.Extensions.Logging.LogLevel.Information)
            .UseSqlite($"Data Source=test.db");

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Blog>().HasData(new Blog { BlogId = 1, Hex = $"" });
        modelBuilder.Entity<Blog>().HasData(new Blog { BlogId = 2, Hex = $"42" });
        modelBuilder.Entity<Blog>().HasData(new Blog { BlogId = 3, Hex = $"abcdef" });
        modelBuilder.Entity<Blog>().HasData(new Blog { BlogId = 4, Hex = $"foobar" });
        modelBuilder.Entity<Blog>().HasData(new Blog { BlogId = 5, Hex = $"4" });
    }
}

public class Blog
{
    public int BlogId { get; set; }
    public required string Hex { get; set; }
}

Outpur

The program emits the following output:

      SELECT "b"."Hex", unhex("b"."Hex") AS "Raw"
      FROM "Blogs" AS "b"
      WHERE "b"."Hex" IS NOT NULL
Hex: ><
Raw: 0 bytes

Hex: >42<
Raw: 1 bytes

Hex: >abcdef<
Raw: 3 bytes

Hex: >foobar<
Raw:  bytes

Hex: >4<
Raw:  bytes

The last two elements of the bad collection have the Raw field set to null, even though the .Where filter explicitly excluded that case.

Include provider and version information

EF Core version: 8.0.6 Database provider: Microsoft.EntityFrameworkCore.Sqlite Target framework: .NET 8.0 Operating system: Linux (/WSL) IDE: Visual Studio Code 1.89.1

ranma42 commented 4 months ago

The invalid query optimization is caused by a misconfiguration of the nullability in https://github.com/dotnet/efcore/blob/16acc463c92b02bec09993db592ef4216a7a0766/src/EFCore.Sqlite.Core/Query/Internal/Translators/SqliteHexMethodTranslator.cs#L64-L69

I can provide a PR that fixes this (and adds tests ;) ). I am also investigating other similar issues around nullability (starting from misconfiguration of function expressions, but also other null propagation/mis-handling cases).

Would it make sense to open a "meta-issue" to track them together?

roji commented 4 months ago

Breaking change note: https://github.com/dotnet/EntityFramework.Docs/pull/4739