Open Radamonas opened 1 year ago
SQL 2000!? Wow! Looks like somebody has to load data from a veeeery old application! Well, to speak frankly, I didn't plan for anything other than EF Core and SqlClient to access a SQL Server database (I actually thought it was possible with SqlClient). If you have any .NET library that can access this database, there are 3 options:
1. Use the Select, CrossApply or Do operators.
2. Use the ResolveAndSelect or ResolveAndDo operators. For CrossApply, there is a version that has a pushValues having an IExecutionContext as a second parameter (this contains a DependencyResolver property). CrossApply is described here: https://paillave.github.io/Etl.Net/docs/recipes/getManyDataCrossApply
3. Make a custom extension: there, you will have access to IExecutionContext.DependencyResolver, so that you can count on some connection information, for example. You can look at the extensions for native SqlServer if you want some inspiration. FYI, making custom extensions is very accessible. https://github.com/paillave/Etl.Net/tree/master/src/Paillave.Etl.SqlServer
So I came up with this approach as a POC:
using System.Data;
using System.Data.Odbc;
using Paillave.Etl.Core;
using Paillave.Etl.FileSystem;
using Paillave.Etl.TextFile;
using PocEtlDotNet.Entities;

namespace PocEtlDotNet;

class Program
{
    static async Task Main(string[] args)
    {
        var connectionString1 = @"Driver={SQL Server};Server=MyServer;Database=Db1;Trusted_Connection=yes;";
        var connectionString2 = @"Driver={SQL Server};Server=MyServer;Database=Db2;Trusted_Connection=yes;";
        var processRunner = StreamProcessRunner.Create<string>(DefineProcess);
        var executionOptions = new ExecutionOptions<string>
        {
            Resolver = new SimpleDependencyResolver()
                .Register<IDbConnection>(new OdbcConnection(connectionString1), "source1")
                .Register<IDbConnection>(new OdbcConnection(connectionString2), "source2"),
            UseDetailedTraces = true,
            NoExceptionOnError = false,
        };
        var res = await processRunner.ExecuteAsync("Start output to files", executionOptions);
        Console.Write(res.Failed ? "Failed" : "Succeeded");
    }

    private static void DefineProcess(ISingleStream<string> contextStream)
    {
        contextStream
            .CrossApply<object, MyTableEntity>("strange stuff", (fileValue, dependencyResolver, cancellationToken, push) =>
            {
                using var connection = dependencyResolver.Resolve<IDbConnection>("source1");
                connection.Open();
                using var command = new OdbcCommand("select * from dbo.my_table", (OdbcConnection)connection);
                using var reader = command.ExecuteReader();
                while (reader.Read())
                {
                    var values = new MyTableEntity() { code = reader.GetString(0), name = reader.GetString(1) };
                    push(values);
                }
            })
            .Do("print", o => Console.WriteLine(o.name))
            .Select("create row to save", i => new { i.name, i.code })
            .ToTextFileValue("to file", @"C:\temp\out_source1.csv", FlatFileDefinition.Create(f => new { name = f.ToColumn("Name"), code = f.ToColumn("Code") }).IsColumnSeparated('|'))
            .WriteToFile("save to file", i => i.Name);

        contextStream
            .CrossApply<object, MyTableEntity>("strange stuff", (fileValue, dependencyResolver, cancellationToken, push) =>
            {
                using var connection = dependencyResolver.Resolve<IDbConnection>("source2");
                connection.Open();
                using var command = new OdbcCommand("select * from dbo.my_table", (OdbcConnection)connection);
                using var reader = command.ExecuteReader();
                while (reader.Read())
                {
                    var values = new MyTableEntity() { code = reader.GetString(0), name = reader.GetString(1) };
                    push(values);
                }
            })
            .Do("print", o => Console.WriteLine(o.name))
            .Select("create row to save", i => new { i.name, i.code })
            .ToTextFileValue("to file", @"C:\temp\out_source2.csv", FlatFileDefinition.Create(f => new { name = f.ToColumn("Name"), code = f.ToColumn("Code") }).IsColumnSeparated('|'))
            .WriteToFile("save to file", i => i.Name);
    }
}
I assume that if I would like to call stored procedures which return nothing, I should use .Do?
Is there any guidance regarding calling stored procedures? It is quite common in SSIS routines, but I couldn't find any documentation on this.
What you did is correct.
I remember now that, indeed, at the time of SQL Server 2000, ODBC drivers were the recommended way; I'll make an amendment to permit the SQL extensions to work with them as an option.
On the current SQL Server extension, the way to execute a stored procedure is to use ToSqlCommand, as described here: https://paillave.github.io/Etl.Net/docs/recipes/sqlServer#execute-a-sql-process-for-every-row
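For the ODBC connections used in the POC above, a procedure that returns nothing could also be called with plain ADO.NET from inside a Do step. The following is only a minimal sketch: OdbcProcedureCall, BuildCallText and Execute are hypothetical helper names, not part of Etl.Net. It relies on the ODBC provider expecting the ODBC call escape sequence with positional ? placeholders rather than a bare procedure name.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical helper, not part of Etl.Net.
public static class OdbcProcedureCall
{
    // Builds the ODBC escape sequence, for instance "{call dbo.my_proc(?, ?)}".
    public static string BuildCallText(string procedureName, int parameterCount)
    {
        if (parameterCount < 0)
            throw new ArgumentOutOfRangeException(nameof(parameterCount));

        var placeholders = string.Join(", ", Enumerable.Repeat("?", parameterCount));
        var argumentList = parameterCount == 0 ? "" : "(" + placeholders + ")";
        return "{call " + procedureName + argumentList + "}";
    }

    // Executes the procedure on an already opened connection.
    // Arguments are bound by position, in the order they are given.
    public static int Execute(IDbConnection connection, string procedureName, params object[] arguments)
    {
        using var command = connection.CreateCommand();
        command.CommandType = CommandType.StoredProcedure;
        command.CommandText = BuildCallText(procedureName, arguments.Length);

        foreach (var argument in arguments)
        {
            // CreateParameter returns the parameter type of the underlying provider.
            var parameter = command.CreateParameter();
            parameter.Value = argument ?? DBNull.Value;
            command.Parameters.Add(parameter);
        }

        return command.ExecuteNonQuery();
    }
}
```

For example, OdbcProcedureCall.Execute(connection, "dbo.my_proc", 42) would send {call dbo.my_proc(?)}, where dbo.my_proc is a made-up procedure name.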
Note to self: permit the SqlServer extensions to work with ODBC drivers as well.
@Radamonas I just pushed v2.1.3-beta, which you will find in pre-release. This should permit the SqlServer extension to work with any ADO.NET driver (including ODBC). Let me know if it works for you.
I've changed the code to:
using System.Data;
using System.Data.Odbc;
using Paillave.Etl.Core;
using Paillave.Etl.FileSystem;
using Paillave.Etl.SqlServer;
using Paillave.Etl.TextFile;
using PocEtlDotNet.Entities;

namespace PocEtlDotNet;

class Program
{
    static async Task Main(string[] args)
    {
        var connectionString1 = @"Driver={SQL Server};Server=MyServer;Database=Source1;Trusted_Connection=yes;";
        var connectionString2 = @"Driver={SQL Server};Server=MyServer;Database=Source2;Trusted_Connection=yes;";
        using var conn1 = new OdbcConnection(connectionString1);
        using var conn2 = new OdbcConnection(connectionString2);
        conn1.Open();
        conn2.Open();
        var processRunner = StreamProcessRunner.Create<string>(DefineProcess);
        var executionOptions = new ExecutionOptions<string>
        {
            Resolver = new SimpleDependencyResolver()
                .Register<IDbConnection>(conn1, "source1")
                .Register<IDbConnection>(conn2, "source2"),
            UseDetailedTraces = true,
            NoExceptionOnError = false,
        };
        var res = await processRunner.ExecuteAsync("Start output to files", executionOptions);
        Console.Write(res.Failed ? "Failed" : "Succeeded");
    }

    private static void DefineProcess(ISingleStream<string> contextStream)
    {
        contextStream
            .CrossApplySqlServerQuery("select", o => o
                .FromQuery("select * from dbo.myTable")
                .WithMapping(i => new
                {
                    code = i.ToColumn("code"),
                    name = i.ToColumn("name")
                })
                , "source1")
            .Do("print", o => Console.WriteLine(o.name))
            .Select("create row to save", i => new { i.name, i.code })
            .ToTextFileValue("to file", @"C:\temp\out_source1.csv", FlatFileDefinition.Create(f => new { name = f.ToColumn("Name"), code = f.ToColumn("Code") }).IsColumnSeparated('|'))
            .WriteToFile("save to file", i => i.Name);

        var afd = contextStream
            .CrossApplySqlServerQuery("select", o => o
                .FromQuery("select * from dbo.myTable")
                .WithMapping(i => new
                {
                    code = i.ToColumn("code"),
                    name = i.ToColumn("name")
                })
                , "source2") // key must match the name registered in the resolver
            .Do("print", o => Console.WriteLine(o.name))
            .Select("create row to save", i => new { i.name, i.code })
            .ToTextFileValue("to file", @"C:\temp\out_source2.csv", FlatFileDefinition.Create(f => new { name = f.ToColumn("Name"), code = f.ToColumn("Code") }).IsColumnSeparated('|'))
            .WriteToFile("save to file", i => i.Name);
    }
}
Tried to call:
contextStream
    .Select("Create a value", _ => new
    {
        code = "CD",
        name = "CD name"
    })
    .SqlServerSave("save to db", o => o.WithConnection("source1").ToTable("dbo.myTable"));
Got error:
Paillave.Etl.Core.JobExecutionException
HResult=0x80131500
Message=Job execution failed
Source=Paillave.Etl
StackTrace:
at Paillave.Etl.Core.StreamProcessRunner`1.<>c__DisplayClass14_0.<ExecuteAsync>b__3(Task t)
at System.Threading.Tasks.ContinuationResultTaskFromTask`1.InnerInvoke() in
...
Am I missing something in this case?
Sorry for this. I will look at it asap. Can you give me the full StackTrace?
I think I know what the problem is. It will require a bit of work. As I'm very busy, I don't think I'll be able to solve this today. I will let you know asap.
I have another issue with this version. I tried to get a trans id from one source and apply it to a query against another source. It failed, so I changed the id to a static value. It failed again. When the second query was changed to use a fixed value within the query itself, it passed. Below is the second option (both are ODBC connections):
contextStream
    .CrossApplySqlServerQuery("get max", a => a
        .FromQuery("SELECT 1000 as ID")
        .WithMapping(a => new { trans_id = a.ToColumn<int>("ID") }), "source1")
    .Select("build criteria", i => new { TransId = 100 })
    .Do("printeris", o => Console.WriteLine(o))
    .CrossApplySqlServerQuery("select with last", s => s
        .FromQuery("SELECT TOP 10 [trans_id] FROM [dbo].[temp_trans] WHERE [trans_id] <= @TransId")
        .WithMapping(a => new { transId = a.ToColumn<int>("trans_id") })
        , "source2")
    .Do("print", o => Console.WriteLine(o));
It seems it is not resolving the parameter @TransId.
The debugger stops in TaskContinuation.cs:
Debug.Assert(m_action != null);
if (m_action is Func<Task, TResult> func)
{
    m_result = func(antecedent);
    return;
}
Error message:
Paillave.Etl.Core.JobExecutionException
HResult=0x80131500
Message=Job execution failed
Source=Paillave.Etl
StackTrace:
at Paillave.Etl.Core.StreamProcessRunner`1.<>c__DisplayClass14_0.<ExecuteAsync>b__3(Task t)
at System.Threading.Tasks.ContinuationResultTaskFromTask`1.InnerInvoke() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/TaskContinuation.cs:line 88
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs:line 268
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs:line 293
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs:line 2349
--- End of stack trace from previous location ---
at PocEtlDotNet.Program.<Main>d__0.MoveNext() in C:\GIT\OFS\Source\PocEtlDotNet\Program.cs:line 32
at PocEtlDotNet.Program.<Main>(String[] args)
This exception was originally thrown at this call stack:
System.Data.Odbc.OdbcParameterCollection.ValidateType(object)
System.Data.Odbc.OdbcParameterCollection.Add(object)
Paillave.Etl.SqlServer.SqlCommandValueProvider<TIn, TOut>.PushValues(TIn, System.Action<TOut>, System.Threading.CancellationToken, Paillave.Etl.Core.IExecutionContext)
Paillave.Etl.Core.CrossApplyStreamNode<TIn, TOut>.CreateOutputStream.AnonymousMethod__2(System.Action<TOut>, System.Threading.CancellationToken)
Paillave.Etl.Reactive.Core.DeferredPushObservable<T>.InternStart()
Inner Exception 1:
InvalidCastException: The OdbcParameterCollection only accepts non-null OdbcParameter type objects, not SqlParameter objects.
Yes, that is part of my findings. OleDb and ODBC don't work like pure SQL drivers. I'll make all the necessary amendments.
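The exception above comes from adding a SqlParameter to an OdbcCommand. A second difference is that the ODBC provider does not support named parameters such as @TransId in text commands; it expects positional ? markers. One provider-neutral way around both, sketched here under assumptions, is to let the command create its own parameter objects through IDbCommand.CreateParameter() and to rewrite the named placeholders. PositionalParameters, ToPositional and Bind are hypothetical names; this is not how Paillave.Etl.SqlServer is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Text.RegularExpressions;

// Hypothetical helper, only meant for providers with positional markers (ODBC, OleDb).
public static class PositionalParameters
{
    // Matches @Name but skips @@ROWCOUNT-style system variables.
    // Limitation: it does not recognise '@' inside string literals or comments.
    private static readonly Regex NamedParameter = new Regex(@"(?<!@)@(\w+)", RegexOptions.Compiled);

    // Replaces every @Name with ? and returns the names in order of appearance.
    public static (string Sql, IReadOnlyList<string> Names) ToPositional(string sql)
    {
        var names = new List<string>();
        var rewritten = NamedParameter.Replace(sql, match =>
        {
            names.Add(match.Groups[1].Value);
            return "?";
        });
        return (rewritten, names);
    }

    // Sets the command text and adds one parameter per marker, in marker order.
    public static void Bind(IDbCommand command, string sql, IReadOnlyDictionary<string, object> values)
    {
        var (text, names) = ToPositional(sql);
        command.CommandText = text;

        foreach (var name in names)
        {
            // The command creates the parameter type its own provider accepts
            // (OdbcParameter for an OdbcCommand), so no cast can fail later.
            var parameter = command.CreateParameter();
            parameter.ParameterName = name;
            parameter.Value = values[name] ?? DBNull.Value;
            command.Parameters.Add(parameter);
        }
    }
}
```

A parameter that appears twice in the query is added twice, because positional markers are bound strictly by order.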
@paillave are there any updates regarding OleDb and ODBC support?
Hello @Radamonas, it is still under development. I don't have a lot of free time, but this is still something I'm working on. You will be the first to be informed once this is done.
I did... but I'm working on an amendment that will make SQL queries independent of the DbConnection type.
@paillave, I see it's still marked as help wanted, and I happen to be someone looking for a help wanted sign. If you are still interested in help, can you go over the code changes so far? If it's easier for you to review on a call that works for me, just let me know.
Is there a way to use a db connector other than EFCore and SqlClient to access databases? We cannot use either of the implemented ones because the database we are targeting is SQL 2000.