borisdj / EFCore.BulkExtensions

Entity Framework EF Core efcore Bulk Batch Extensions with BulkCopy in .Net for Insert Update Delete Read (CRUD), Truncate and SaveChanges operations on SQL Server, PostgreSQL, MySQL, SQLite
https://codis.tech/efcorebulk
Other
3.64k stars 581 forks source link

BulkInsert produces sytanx error on PostgreSQL (version 6.x) #1516

Open cervengoc opened 3 months ago

cervengoc commented 3 months ago

Hi, first or all, thank you for all the efforts put into this library.

We're in the middle of trying to migrate our codebase from MSSQL to PostgreSQL, and have some difficoulties around bulk operations.

Before anything else, I'd like to emphasize that we're on .NET version 6, including related framework packages (EFCore, EFCore.Bulkextensions, etc. Namely we're using version 6.8.1 from bulk extensions. Unfortunately this is a hard constraint and we cannot put the huge efforts into upgrading to latest, or even to 7.x.

We have a model similar to this (simplified)

public class Company {
    public int Id { get; set; }
    public int? AddressId { get; set; }
    public Address Address { get; set; }
}

public class Address {
    public int Id { get; set; }
    public string BuildingNumber { get; set; }
    pubilc Coordinates Coordinates { get; set; }
}

public class Coordinates {
    public int SRID { get; set; }
    public decimal X { get; set; }
    public decimal Y { get; set; }
}

Type Coordinates is an owned type by Address, so it maps to columns like coordinates_x and so.

When calling BulkInsert on companies (using IncludeGraph), we get the following query in one of the commands, which produces syntax error at or near "INTO" SQL exception.

await dbContext.BulkInsertAsync(companies, o =>
{
    o.IncludeGraph = true;
}, cancellationToken: cancellationToken);
SELECT k.id, k.buildingnumber, t.id, t.coordinates_srid, t.coordinates_x, t.coordinates_y
FROM (
    INSERT INTO "public"."address" ("buildingnumber", "coordinates_srid", "coordinates_x", "coordinates_y") (SELECT "buildingnumber", "coordinates_srid", "coordinates_x", "coordinates_y" FROM "public"."addressTemp251c599b") ON CONFLICT ("id") DO UPDATE SET "buildingnumber" = EXCLUDED."buildingnumber", "coordinates_srid" = EXCLUDED."coordinates_srid", "coordinates_x" = EXCLUDED."coordinates_x", "coordinates_y" = EXCLUDED."coordinates_y" RETURNING "public"."address"."id", "public"."address"."buildingnumber", "public"."address"."coordinates_srid", "public"."address"."coordinates_x", "public"."address"."coordinates_y"
) AS k
LEFT JOIN (
    SELECT a.id, a.coordinates_srid, a.coordinates_x, a.coordinates_y
    FROM public.address AS a
    INNER JOIN public.address AS a0 ON a.id = a0.id
    WHERE (((a.coordinates_srid IS NOT NULL)) AND ((a.coordinates_x IS NOT NULL))) AND ((a.coordinates_y IS NOT NULL))
) AS t ON k.id = CASE
    WHEN (((t.coordinates_srid IS NOT NULL)) AND ((t.coordinates_x IS NOT NULL))) AND ((t.coordinates_y IS NOT NULL)) THEN t.id
END

Any tips or advices are highly appreciated, thank you in advance.

cervengoc commented 3 months ago

Just as an addition, it seems like the key here is not the IncludeGraph option, but the SetOutputIdentity = true option (which I guess is set implicitly when using IncludeGraph). The issue is reproducible by trying to bulk-insert addresses with SetOutputIdentity = true. I've also tried IgnoreGlobalQueryFilters = true option (found in another, possibly related issue), but it makes no difference. In my case I don't even have a global filter on this table (Address)