mdrakiburrahman / rakirahman.me

💻Personal blog powered by Gatsby
https://www.rakirahman.me
MIT License
1 stars 1 forks source link

SQL Blog #6

Closed iamhasib closed 3 years ago

iamhasib commented 3 years ago

Hi -

Hasib from Sydney, a Data Architect with Deloitte. When I design a solution for client, I read lots and lots of blog to ensure my ideas are refreshed with what's new out there. I came across this post of your blog today - https://www.rakirahman.me/sql-replication-to-azure/.

Of all the blogs I visited and I visited a lot, yours is by far the best in terms of presenting the technical content. Very good. Also, you are very unique in the way you presented some diagram. A question about the above blog,

In my current use-case, I need to have 10 tables (Azure SQL MI) synced with 10 source tables hosted on-prem (SQL Server 2016). It seemed replication is (Option 3 on your blog) is the best and easiest option, but it's in preview - https://docs.microsoft.com/en-us/azure/azure-sql/managed-instance/replication-transactional-overview

Other options looked bit complex for my use cases (only few tables replication) ..ADF doesn't seem like easier option either. Your thoughts?

mdrakiburrahman commented 3 years ago

Hi @iamhasib,

Thank you for your note, and for the kind comments.

Note that even though "Transactional Replication" is listed as "Preview" for SQL MI - the way I like to think about this is more from a Support Lifecycle Perspective, rather than a Feature Parity perspective. What I mean by that, is the SQL MI Engine (which runs SQL Server Executables - just managed), it supports Transaction Replication the same way SQL Server does. And Transaction Replication is a very mature capability baked into the SQL Server Engine.

For your use case, it may be that Trx Replication will be the easiest way to implement sync from a SQL Server 2016.

ADF will "work" (the same way SSIS will - it's just ETL) - but the challenge will become in how you build your ETL pipelines, this will involve identifying Watermak column and incremental sync around that.

As an alternative to building these ETL pipelines out, you might even consider a CDC Engine like Debezium that supports SQL Server as a Source, then use Event Hub (i.e. Kafka) to store the CDC logs - which you can persist to SQL MI as well.

Transaction Replication is easier than both methods above since the Agents take care of the complexities for you.

iamhasib commented 3 years ago

Salaam Raki (Bhai) -

Firstly, a quick question about something different. I checked some other post on your blog site. Did you draw your diagram and artwork that you published on your page? Which tool you used to draw those diagrams, specially those with code snippet.

Now about your solution. I comment on the last one first. Event driven one. I would love to do that but wouldn't you agree that would be a massive project. I ran CDC project and they are complex. However you simplify it, CDC is always complicated.

As of Replication based, even cost wise, it's cheaper (than ADF). I saw a video about ADF pricing, man...this is major ....Back to replication, I am not sure if I fully understood as to what you were saying on Preview. Does feature in preview means it's not available in AU region?

I look forward to hearing from you soon.