dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.68k stars 3.17k forks source link

Discussion for possible implementation of a Snowflake database provider #15732

Closed jzabroski closed 1 year ago

jzabroski commented 5 years ago

I figure this would be a good way for me to learn more about EFCore internals and then go on to help improve EFCore metadata.

smitpatel commented 5 years ago

https://www.snowflake.com/

jzabroski commented 5 years ago

J/W: Why the confused look? My understanding is snowflake is built by some ex-SQL Server guys.

https://github.com/snowflakedb/snowflake-connector-net

ajcvickers commented 5 years ago

@jzabroski Why did you open an issue here? To get guidance from the EF team? Or because you are requesting that this provider is something we ship?

jzabroski commented 5 years ago

To coordinate in case it is already in the works. My free time is valuable so if I am going to give this a shot, why would I work on it in secret?

ajcvickers commented 5 years ago

@jzabroski Thanks for clarifying. I'll make this a discussion issue and rename it so it doesn't look like a feature request.

jzabroski commented 5 years ago

@ajcvickers Some quick thoughts based on https://www.snowflake.com/blog/breaking-the-dev-test-deployment-dollars-cycle/

I'll add more in the coming weeks. This is not my day job!

jzabroski commented 5 years ago

Hey @ajcvickers , @smitpatel So, did some prototyping with just the snowflake connector w/o EF Core. The main issue with Snowflake is it only supports one SQL statement per batch. Do you see any obvious functionality in Entity Framework Core where this would be a showstopper?

Snowflake just added stored procedure support, so in the coming months as they are rapidly adding features, this will be less of an issue.

smitpatel commented 5 years ago

New query pipeline will always send 1 SQL for 1 query executed so it should not be issue there. Probably update pipeline may have issues due to batching EF Core does. cc: @AndriySvyryd

AndriySvyryd commented 5 years ago

By default the update pipeline only sends one command per batch unless the provider overrides this.

jzabroski commented 5 years ago

Thanks, guys! Here's the messy / interesting part of update pipeline. Snowflake does not have a scope_identity() operator. Instead, it has first-class time-travel, so such operations are implemented through the time traveling API. See: https://stackoverflow.com/questions/53837950/get-identity-of-row-inserted-in-snowflake-datawarehouse/53903693#53903693

See also this thread which outlines another approach: https://support.snowflake.net/s/question/0D50Z00008hySbMSAU/how-can-i-identify-the-row-that-i-just-inserted-via-an-autoincrementing-column

jzabroski commented 5 years ago

Reached out to Marcin Zukowski at Snowflake to see if there is any progress here; awaiting his reply.

buvinghausen commented 4 years ago

@jzabroski any update on this? I would be keenly interested in a Snowflake provider for EF as well and similarly not sure if I have the time to personally invest. In my use case all keys are GUIDs and we pre-define them so worrying about identity insert is just not something I would constitute a blocker.

jzabroski commented 4 years ago

@buvinghausen It is not hard to implement a provider. The annoying part is that Snowflake sales has never responded to my emails about this blocker. (Note: Marcin Zukowski at Snowflake suggested I reach out to sales as the right way to push this blocker through as a feature / bug fix. Sales lack of reply to me suggests this will never get done.)

identity insert is just not something I would constitute a blocker.

Just because something works for you in a narrow use case does not mean it's not a blocker for general use. I am already helping with two open source projects and you really wouldn't believe all the targeting scenarios people have for their deployments and expect the maintainers to figure out their crazy deployments for them. Not complaining, just saying the incremental cost of explaining to people "THIS DOES NOT WORK DON'T DO IT" is a blocking cost for me at this point, because I know there will be 50 of those issues opened and I don't see answering all those issues as part of my life story.

buvinghausen commented 4 years ago

@jzabroski understood perhaps we can convince Snowflake of the merits of this on the dotnet connector repo rather than here on the aspnet EF Core repo. I certainly wouldn't mind contributing to one over there in terms of testing/fixes unfortunately I don't have the time to be able to start from scratch currently.

jzabroski commented 4 years ago

It appears Yang Liu from Snowflake DB has posted a somewhat acceptable solution to retrieving the last value generated from an autoidentity/sequence generator within a session:

Given the following pre-conditions in the database:

create or replace sequence test_seq;

create or replace table EfCoreTableUsingSeq (id int default test_seq.nextval);

The following scaffolding could be generated to make Snowflake able to talk to Entity Framework Core:

begin;

create or replace temporary table getnextval as 
(
  select t_seq.nextval
  from table(getnextval(test_seq)) t_seq
);

insert into EfCoreTableUsingSeq (id) (
  select nextval from getnextval
);

select nextval from getnextval; --return ID value for the row just inserted

drop table getnextval;

commit;

Probably needs some benchmarking to see how well it does, as well as verification the snowflake-dot-net-connector nuget package can accept sending this command block.

abailey7 commented 3 years ago

@jzabroski following. Any additional progress since February on a Snowflake provider?

I'm extremely interested in something like this getting off the ground.

jzabroski commented 3 years ago

@abailey7 I don't personally have the time available to commit to such stuff any time soon. However, the strategy I laid out above with the help of Yang Liu suggests the original blocker (connection may spuriously return incorrect identity insert values) may be resolved.

I frankly love Snowflake but when I reached out their their sales staff they were non-responsive, so they lost their window of opportunity with me. I support Snowflake in my project FluentMigrator, but that's it, and the support is not that complicated.

My personal goals are to contribute something to the .NET Core standard library in the next 6 months, and continue to "keep the lights on" for my two open source projects (FluentMigrator and RazorLight). Longer term, I want to build on top of FluentMigrator a suite of tools similar to DbHammer.

If you're looking for a business partner to sponsor a Snowflake Provider, I'd suggest contacting Jonathan Magnon. He is a great business partner and I think he has the skills to deliver such a feature.

zackhowe commented 3 years ago

@jzabroski:

what about something like: cmd.CommandText = "set var = (select test_seq.nextval);"; cmd.ExecuteNonQuery(); cmd.CommandText = "insert into EfCoreTableUsingSeq(id, data) values ($var, 'test data');"; cmd.ExecuteNonQuery(); cmd.CommandText = "select $var;"; var result = cmd.ExecuteScalar();

vslee commented 2 years ago

You can vote on the issue here: https://github.com/snowflakedb/snowflake-connector-net/issues/426