Open nicholas-peterson opened 7 years ago
I kind of like the idea. The problem is more the design of .NET LINQ, which is mostly meant for select-queries only.
While fetching data is quite standard SQL, the rules of insert/update/delete does vary more. To keep things composable we should use expressions and not direct functions, but how to express those queries with LINQ?
I did already face some issues when I did PR #404. Maybe there should be a separate class for these kind of items, and not to put them under Seq-module.
Let me begin this response by stating I understand how incredibly non-trivial any of this would likely be to implement. Not to mention across multiple database back-ends. I also know about the FSharp.Data.SqlClient which attempts to support this more directly, but it only works for T-SQL.
Below are some suggestions of syntax in linq to support some generic insert/update/delete operations. I'm uncertain of what is possible to accomplish inside computation expressions so some of the later suggestions may be entirely unworkable. The goal is support 'just-enough' syntax to allow some common set-oriented scenarios to not require explicitly created stored procs/User defined functions. Obviously as more exotic operations come into play eventually one will have to drop into straight SQL, but the current behavior of the type provider pretty much precludes set operations. Which means anything outside basic one-record crud statements will become slow at scale.
query {
for employee in context.dbo.Employees do
join skill in context.dbo.Skills on (employee.EmpId = skill.EmpId)
groupBy (employee.EmpId) into employeeSkills
select count
update (fun e ->
employee.TotalSkills <- e)
}
would transform into
UPDATE employee
SET employee.TotalSkills = COUNT(*)
FROM dbo.Employees AS employee
INNER JOIN dbo.Skills AS skill
ON employee.EmpId = skill.EmpId
GROUP BY employee.EmpId
The value returned from executing this statement would be the implicit rowcount. So in essence after the query you would issue an |> Seq.toList |> ignore
if you don't care about the result.
Something like an insert would look like this:
query {
for employee in context.dbo.Employees do
where (employee.HireDate < DateTime.Now.AddYears(-1)
select employee
insert vestedEmployee in context.dbo.VestedEmployees (fun e ->
vestedEmployee.EmpId <- e.EmpId
vestedEmployee.HireDate <- e.HireDate
vestedEmployee.RecordCreatedDate <- DateTime.Now
)
}
would transform into
INSERT INTO dbo.VestedEmployees (EmpId, HireDate, RecordCreatedDate)
SELECT EmpId, HireDate, '2017-05-19'
FROM dbo.Employees AS employee
WHERE employee.HireDate < DATEADD(year,-1,GETDATE())
Theoretically, an insert could be based off a derived sequence:
query {
for employee in [("1234",DateTime.Now.AddYears(-1), DateTime.Now); ("1235", DateTime.Now.AddYears(-2), DateTime.Now)] do
select employee
insert vestedEmployee in context.dbo.VestedEmployees (fun e ->
let empId, hireDate, currentDate = e
vestedEmployee.EmpId <- empId
vestedEmployee.HireDate <- hireDate
vestedEmployee.RecordCreatedDate <- currentDate)
}
would transform into
INSERT INTO dbo.VestedEmployees (EmpId, HireDate, RecordCreatedDate)
VALUES
('1234', DATEADD(year,-1,GETDATE(), GETDATE()),
('1235', DATEADD(year,-2,GETDATE(), GETDATE())
DELETE would follow a similar pattern/strategy.
Sometimes queries need other DML statements to happen server-side beforehand:
query {
let mutable hireDateCutoff = DateTime.Now.AddYears(-1)
for x in dbo.Employees do
where (x.HireDate < hireDateCutoff)
select x
}
transforms into:
DECLARE @hireDateCutoff Datetime = DATEADD(year, -1, GETDATE());
SELECT *
FROM dbo.Employees AS x
WHERE x.HireDate < @hireDateCutoff
Sometimes you need to issue two queries in sequence to get two different values.
query {
let mutable totalEmployees = 0
let mutable totalSkills = 0
for x in dbo.Employees do
select
count
(fun y ->
totalEmployees <- y)
for x in dbo.Skills do
select
count
(fun y ->
totalSkills <- y)
for tally in [totalEmployees;totalSkills] do
select tally
} Seq.toList
would transform into:
DECLARE @totalEmployees INT = 0;
DECLARE @totalSkills INT = 0;
SET @totalEmployees = SELECT COUNT(*) FROM dbo.Employees;
SET @totalSkills = SELECT COUNT(*) FROM dbo.Skills;
SELECT @totalEmployees, @totalSkills;
Theoretically one could also extend these clauses (insert/update/delete) with a subsequent 'output' clause to assign the output of the query to a declared variable (like a table variable). The goal of this would be to enable subsequent use in another statement, or perhaps to return identity values from an operation.
I noticed recently some issues/pr's in a similar vein, namely one relating to unrestricted deletes, and another about comparing two columns both on the query side (rather than the left being an F# value).
I'd like to be able to do something equivalent to the following:
or, more involving:
edit for some other examples:
Basically, I want to push as much of the operational semantics to the SQL layer as possible, influencing the query only with some values from F#, mostly to constrain joins/where/group/having clauses. The goal is to stop having to produce UDF/PROC's in native T-SQL in order to perform set based updates/inserts.
Is something like this already possible? Is it antithetical to the intent of this type provider?