oracle / dotnet-db-samples

.NET code samples for Oracle database developers #OracleDotNet
http://otn.oracle.com/dotnet
MIT License
408 stars 190 forks source link

Guid as keys on Oracle DBMS skip the index and instead do a full table scan #263

Closed andrey-sh closed 1 year ago

andrey-sh commented 1 year ago

When using Guid as keys on Oracle DBMS, Oracle.EntityFrameworkCore 2.21.61. The generated SQL causes the DBMS to always skip the index and instead do a full table scan. SELECT Id FROM MyEntities WHERE Id = N'[someGuid as a string]' Changing the SQL-Query to have HEXTORAW('[someGuid as a string]') can fix the problem.

Related: https://github.com/dotnet/efcore/issues/20643

sungam3r commented 1 year ago

@andrey-sh 50 cents from me - I have read release notes for the latest 3-4 releases and I saw there were some fixes with guids - in sequence so probably working with guids is not a "strong skill" for Oracle provider.

alexkeh commented 1 year ago

It's important to ensure there is no inadvertent Unicode conversion occurring when using/comparing data between the DB layer and EF Core. See ODP.NET EF Core Performance Considerations for more details and tips to avoid the problem.

sergdsh commented 1 year ago

.HasColumnType("RAW(16)") does not help fix the query. The query is built like this: SELECT FROM MyTable WHERE Id = '646464646464646464646464646464' It turns out that the request handler sees the field as a string, although the RAW (16) type is passed. To use an index, it need the HEXTORAW function in the query: SELECT FROM MyTable WHERE Id = HEXTORAW('646464646464646464646464646464')

alexkeh commented 1 year ago

Using the sample test case from https://github.com/dotnet/efcore/issues/20643, if I declare a GUID variable:

Guid g = new Guid("11....");

and execute the following query:

var me = db.MyEntities.Where<MyEntity>(e => e.Id == g).ToListAsync();

I see the following SELECT statement generated:

SELECT "m"."Id" FROM "MyEntities" "m" WHERE "m"."Id" = :g_0

The same thing should be happening with the HEXTORAW conversion, though my method will be easier to use for LINQ.

If the app is telling ODP.NET it wants to perform a string comparison with a RAW column values, then it has to convert the RAWs to strings and ignore that index to conduct the comparison.

The requirement is the data type being passed must "match" the DB data type for a comparison operation not to incur a data type conversion and full table scan cost.

Peter-B- commented 1 year ago

Hello,

I think I'm facing the same issue.

My EF Core query is

FileItems.Where (f => f.FileItemId == new Guid("1b4fa2ed-7662-4c33-87a9-c7c076fca589")).ToListAsync()

here FileItemId is a Guid.

The generated SQL is

SELECT "f"."FILE_ITEM_ID", "f"."FILE_NAME"
FROM "FMS_ADMIN"."FILE_ITEMS" "f"
WHERE "f"."FILE_ITEM_ID" = 'EDA24F1B6276334C87A9C7C076FCA589'
GO

When I put this into the SQL Developer and let it explain the execution plan, I see

image

I think this leads to a full table scan, since the filter predicate applies RAWTOHEX to f.FILE_ITEM_ID before conducting a string comparison. This is especially slow, if I use linguitic comparison (NLS_SORT = 'GERMAN_CI' & NLS_COMP= 'LINGUISTIC'.

When I change the SQL (by hand) to

SELECT "f"."FILE_ITEM_ID", "f"."FILE_NAME"
FROM "FMS_ADMIN"."FILE_ITEMS" "f"
WHERE "f"."FILE_ITEM_ID" = HEXTORAW('EDA24F1B6276334C87A9C7C076FCA589')
GO

or

...
WHERE "f"."FILE_ITEM_ID" = CAST('EDA24F1B6276334C87A9C7C076FCA589' AS RAW(16))

, HEXTORAW is applied to the query string and the index is used in the execution plan.

image

I have the same problem, when using the Contains operator:

var fileIds = new[]{
    new Guid("1b4fa2ed-7662-4c33-87a9-c7c076fca589"), new Guid("efe3e372-de9c-4da7-942b-93023507a315")
};

var fileItems = await 
    FileItems
        .Where(fi => fileIds.Contains(fi.FileItemId))
        .Select(fi => fi.FileItemId)
        .ToListAsync();
SELECT "f"."FILE_ITEM_ID"
FROM "FMS_ADMIN"."FILE_ITEMS" "f"
WHERE "f"."FILE_ITEM_ID" IN ('EDA24F1B6276334C87A9C7C076FCA589', '72E3E3EF9CDEA74D942B93023507A315')
GO

In this case a HEXTORAW or CAST also would allow the usage of an index and make the query independent of linguistic string comparison.

Is there any way to adapt this behavior?

I tried to use an conversion from Guid to byte[], but that did not fix the problem.

I'm using these versions:

Oracle.EntityFrameworkCore: 6.21.61
Oracle.ManagedDataAccess.Core: 3.21.80
Microsoft.EntityFrameworkCore: 6.0.5

Any idea or help is appreciated. Kind regards Peter

alexkeh commented 1 year ago

@Peter-B- Any reason you can't use the workaround I suggested in which you declare a GUID variable, then use that variable in the LINQ?

Peter-B- commented 1 year ago

Hi @alexkeh,

thanks for your prompt reply. Sorry, I didn't get the point of your solution. This indeed works for the first query, but not for the query with .Contains. Is there a way to modify more complex queries to work without string comparison?

In my project I have multiple queries like:

var fileItems = await 
    FileItems
        .Where(fi => fileIds.Contains(fi.FileItemId))
        .Select(fi => fi.FileItemId)
        .ToListAsync();
alexkeh commented 1 year ago

@Peter-B- I'm not aware of any. I'll ask the dev team if they have any suggestions.

Peter-B- commented 10 months ago

Hi @alexkeh,

do you have any update on this issue? I'm coming back to this, since my workaround leads to Stackoverflow exceptions in some cases :-(

It would be great to use .Contains() for Guids.

Kind regards Peter

alexkeh commented 8 months ago

@Peter-B- We found a similar issue (Bug 35974045) recently that we've fixed recently and will first be available in the Oracle EF Core 8 coming out soon. You can try out the new release to see if this issue is resolved for you.

clint-mcc-cognizant commented 8 months ago

@alexkeh - sorry to comment on a closed post - is there a technical preview release for this that we are able to test? We experience this same issue with the .Contains() based mechanism - failing that, an ETA on the actual release?

alexkeh commented 8 months ago

There is no technical preview for Oracle EF Core 8. We plan to release it soon after Microsoft's planned EF Core 8 release on Tuesday. The exact timing depends on when we can complete final testing with the EF Core 8 production version with no significant issues. That could be a week to a month based on what we find.

To track this issue independently, we have a different bug (36003953) created for it.

Peter-B- commented 8 months ago

Hello @alexkeh,

thanks for digging out this issue and notifying me. I will test it as soon as I can update to EF Core 8. Looking forward to it.

alexkeh commented 7 months ago

Bug 36003953 has been fixed with the Oracle EF Core 8.

harishpn commented 7 months ago

Hello @alexkeh, I have tried to test the following scenario with Oracle EF Core 8, but still it is not generating query with RAWTOHEX in the where condition.

Guid g = new Guid("11....");

and execute the following query:

var me = db.MyEntities.Where<MyEntity>(e => e.Id == g).ToListAsync();

I see the following SELECT statement generated:

SELECT "m"."Id" FROM "MyEntities" "m" WHERE "m"."Id" = :g_0

but the following statements are generating SQL with RAWTOHEX var me = db.MyEntities.Where<MyEntity>(e => e.Id == new Guid("11....")).ToListAsync(); or

var lst = new List<Guid>{ new Guid("111...")};
var me = db.MyEntities.Where<MyEntity>(e => lst.Contains(e.Id)).ToListAsync();

your thoughts on this?

alexkeh commented 7 months ago

@harishpn This bug fix addresses SQLs that are using GUIDs as literals. The fix now wraps the GUID literal with HEXTORAW. It shouldn't be using RAWTOHEX in any case.

SQL with GUID parameters did not require any changes as they already are using the optimized execution plan to avoid full table scans.

Peter-B- commented 7 months ago

Hello @alexkeh,

thanks for the new version, but I think the issue us not resolved yet. As @harishpn points out, there are many scenarios where the driver uses string comparison.

var id = new Guid("3fa85f64-5717-4562-b3fc-2c963f66afa6");
var scenario1 =
    FileItems
        .Where(fi => fi.FileItemId == id)
        .ToList();

In this scenario, the Guid is compared to a Binary, which makes Oracle use indices:

-- Region Parameters
-- :id_0='3fa85f64-5717-4562-b3fc-2c963f66afa6' (DbType = Binary)
-- EndRegion
SELECT "f"."FILE_ITEM_ID", "f"."BUCKET_ID", "f"."CREATION_IN_PROGRESS", "f"."DATE_CREATED", "f"."DATE_DELETED", "f"."EXTENSION", "f"."FILE_HASH", "f"."FILE_NAME", "f"."FILE_SIZE", "f"."LOCATION_ID", "f"."THUMB_ERROR"
FROM "FMS_ADMIN"."FILE_ITEMS" "f"
WHERE "f"."FILE_ITEM_ID" = :id_0
GO

However, in other scenarios, it is still treated as a string, which circumvents the indices.

var scenario2 =
    FileItems
        .Where(fi => fi.FileItemId == new Guid("3fa85f64-5717-4562-b3fc-2c963f66afa6"))
        .ToList();

var ids = new List<Guid> { id, Guid.NewGuid() };
var scenario3 =
    FileItems
        .Where(fi => ids.Contains(fi.FileItemId))
        .ToList();
SELECT "f"."FILE_ITEM_ID", "f"."BUCKET_ID", "f"."CREATION_IN_PROGRESS", "f"."DATE_CREATED", "f"."DATE_DELETED", "f"."EXTENSION", "f"."FILE_HASH", "f"."FILE_NAME", "f"."FILE_SIZE", "f"."LOCATION_ID", "f"."THUMB_ERROR"
FROM "FMS_ADMIN"."FILE_ITEMS" "f"
WHERE "f"."FILE_ITEM_ID" = '645FA83F17576245B3FC2C963F66AFA6'
GO

SELECT "f"."FILE_ITEM_ID", "f"."BUCKET_ID", "f"."CREATION_IN_PROGRESS", "f"."DATE_CREATED", "f"."DATE_DELETED", "f"."EXTENSION", "f"."FILE_HASH", "f"."FILE_NAME", "f"."FILE_SIZE", "f"."LOCATION_ID", "f"."THUMB_ERROR"
FROM "FMS_ADMIN"."FILE_ITEMS" "f"
WHERE "f"."FILE_ITEM_ID" IN ('645FA83F17576245B3FC2C963F66AFA6', '5978AA265703F9489F0EE7A647BE2B24')
GO

I think implicit string comparison on Guids should be considered a bug.

Is there a chance that you address this issue in a future release?

Kind regards Peter

alexkeh commented 7 months ago

@Peter-B- Which version of Oracle EF Core are you using? With the new Oracle EF Core 8.21.121, we see different SQL generated than you do, which does not use string comparison.

var id = new Guid("3fa85f64-5717-4562-b3fc-2c963f66afa6");
var fileIds = new[] { new Guid("1b4fa2ed-7662-4c33-87a9-c7c076fca589"), new Guid("efe3e372-de9c-4da7-942b-93023507a315"), id, Guid.NewGuid() };

var fileItems = db1.Blogs
        .Where(fi => fileIds.Contains(fi.Gid))
        .ToList();

Generated query:

SELECT "b"."BlogId", "b"."Gid", "b"."Ver"
      FROM "Blogs" "b"
      WHERE "b"."Gid" IN (HEXTORAW('EDA24F1B6276334C87A9C7C076FCA589'), HEXTORAW('72E3E3EF9CDEA74D942B93023507A315'), HEXTORAW('645FA83F17576245B3FC2C963F66AFA6'), HEXTORAW('3C92EF4CF348584196D5C6B6127B3EE7'))

Here the GUID literals are converted to Binary using HEXTORAW, avoiding string comparisons. We do not see HEXTORAW in the SQL the customer has provided.

Peter-B- commented 7 months ago

Oh sorry. You are absolutely right.

I referenced an old dll from the Debug\net6.0 folder instead of the new Debug\net8.0. 🤦‍♂️

After finally switching to 8.21.121, I receive:

SELECT "f"."FILE_ITEM_ID", "f"."BUCKET_ID", "f"."CREATION_IN_PROGRESS", "f"."DATE_CREATED", "f"."DATE_DELETED", "f"."EXTENSION", "f"."FILE_HASH", "f"."FILE_NAME", "f"."FILE_SIZE", "f"."LOCATION_ID", "f"."THUMB_ERROR"
FROM "FMS_ADMIN"."FILE_ITEMS" "f"
WHERE "f"."FILE_ITEM_ID" IN (HEXTORAW('645FA83F17576245B3FC2C963F66AFA6'), HEXTORAW('A3E4BBC54D49BB478F4FDBD5A696652F'))
GO

Exactly as I hoped I would.

Sorry to bother you and thanks for the update Peter