MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Is Datafactory Data Flow running inside the managed vnet? #61040

Closed: ashwinnatty closed this issue 4 years ago

ashwinnatty commented 4 years ago

I was of the understanding that ADF data flow features like filter, aggregate, and join don't run inside the managed VNET. Kindly clarify and, if true, add that as a limitation.



HarithaMaddi-MSFT commented 4 years ago

@ashwinnatty - Thanks for sharing this valuable feedback. We are investigating and will get back to you soon.

HarithaMaddi-MSFT commented 4 years ago

@ashwinnatty - As per the documentation, with this new feature you can provision the Azure Integration Runtime in a Managed Virtual Network and use Private Endpoints to securely connect to supported data stores; it uses private IP addresses to connect to Azure services. As per my understanding, the Managed VNET does not apply to the Data Flow transformations, for which ADF creates Spark clusters to perform the computations. The article describes using the managed VNET for connecting to ADLS, and the later transformation steps do not explicitly mention the VNET. This is how ADF functions and not a limitation. Please let me know if this does not answer the question and we will be glad to assist further.

ashwinnatty commented 4 years ago

Hi, thanks for clarifying that data flows do not run inside the managed VNET. The title of the doc, "Transform data securely using mapping data flows", emphasizes the transformation of data. While I totally understand that data flows are an excellent feature to have, the request is to explicitly mention that they do not run inside a managed VNET.

HarithaMaddi-MSFT commented 4 years ago

@ashwinnatty - Thank you. I am assigning this to the content author for further review.

@djpmsft - Can you please take a look and share your thoughts?

kromerm commented 4 years ago

The managed VNet Azure IR in ADF does indeed include data flows. Your data flows will execute inside a private VNet environment when using this feature.

HarithaMaddi-MSFT commented 4 years ago

@ashwinnatty - As @kromerm confirmed, this is not a limitation in Azure Data Factory. I am proceeding to close the issue; please feel free to comment if you have any more questions.

clagger commented 2 years ago

What's the final answer now: are Data Flow activities running in the Managed VNet, and should they be able to connect to configured private endpoints? If we press the "Test Connection" button of the Linked Service in the Data Flow, it successfully connects to the private endpoint of the SQL DB. But when executing the data flow in the pipeline, we get an error that the FQDN is not reachable. From other ADF activities, e.g. Script or Copy, the SQL private endpoint is reachable without any issues.

markantares commented 2 years ago

Like clagger, I can test the Linked Service with the managed VNet and it works fine, but when I debug the Data Flow or run it as an activity in ADF, it fails with an IP issue for the region. This suggests that the managed VNet works for everything BUT the Data Flow. Does this mean that Data Flows need the public access firewall rules set to allow the IP range? That would surely negate the security benefits of private access.

Is there a solution that allows me to secure the connections between the ADF Data Flow and the Azure SQL database?

Bastien-Brd commented 1 year ago

We are running into the same issue. Is there any update about this from Microsoft, @HarithaMaddi-MSFT?

jonathanTSG commented 1 year ago

Same issue. Is there any update about this from Microsoft, @HarithaMaddi-MSFT?

SaurabhXSeth commented 1 year ago

I have been facing the same issue. Copy Activity can access the Azure SQL server through the private endpoint; however, the data flow is not able to connect to the same Azure SQL server. What would be the solution for this, without enabling public access in the firewall?

SaurabhXSeth commented 1 year ago

@HarithaMaddi-MSFT