Closed RickXMoore closed 4 years ago
Thanks for the feedback and for bringing this to our attention. We are reviewing it and will update the document as appropriate.
We have assigned the issue to the content author, who will evaluate and update the document as appropriate.
Going to second this. The current promotion process using ARM templates from dev to test to live is painful, especially when integrated with Git and deploying through Azure DevOps.
- You're required to deactivate and reactivate triggers around every deployment.
- You're required to tidy up removed linked services yourself. Why can't deployments replicate the state of Git and allow for DropIfNotInSource?
- Why do I need PowerShell to tell my triggers to use test parameters instead of dev?
- Why can't I just deploy from branches like I can using DACPACs to a SQL database? Why must my master branch not be a reflection of my true 'master' that's in Live, and be ahead of it instead? Can't I just deploy from test and dev integration branches like I can to a database?
- Where is the DACPAC-style task for deploying to ADF anyway?
- Automate the generation of ARM templates from a dev branch and allow it to deploy to dev.
If you can't simplify the move from dev to test in ARM template deployments, at least allow an option to ignore certain types of parameters in linked services and triggers during deployment, and allow those parameters to be changed outside of Git mode.
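For illustration, a hypothetical DropIfNotInSource pass (no such option exists in ADF today) could be as simple as a set difference between the resources tracked in Git and the resources in the target factory. A Python sketch with made-up resource names:

```python
# Sketch of a hypothetical "DropIfNotInSource" pass (not an ADF feature):
# after an ARM deployment, anything in the target factory that is absent
# from the Git source is a candidate for deletion.

def resources_to_drop(source_names, target_names):
    """Return target resources that no longer exist in source control."""
    return sorted(set(target_names) - set(source_names))

# Example: 'LS_OldBlob' was removed from Git but still lives in the factory.
source = ["PL_Load", "DS_Sales", "LS_Sql"]
target = ["PL_Load", "DS_Sales", "LS_Sql", "LS_OldBlob"]
print(resources_to_drop(source, target))  # ['LS_OldBlob']
```

Today this cleanup is exactly the manual "tidy up removed linked services yourself" step complained about above.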
Long and short of this is I'm removing Git integration from my Test and Live factories until you can figure this out properly.
Hi Thomas. Did I understand you correctly? Do you have (or had) Git integration enabled for environments other than dev, like Test and Live? If so, that's not the way you should use Git with ADF. As for everything else about the analogy to DACPAC and how that works, I agree and I'm all with you. That's why I started working on an open-source PowerShell module to do these things. Take a look: https://github.com/SQLPlayer/azure.datafactory.tools
I couldn't agree more. We have a team of developers all working on multiple feature branches in parallel. We need to be able to publish any or all of those features to our collaboration branch, but selectively move features to higher environments as they are tested/approved. Right now our only option is to work solely in feature branches, and test using debug triggers. Debug triggers are highly unstable, buggy, and monitor logs don't get saved. Woe unto me if I'm using a debug trigger to test a 10 hour procedure and my network drops during that time. I will never be able to restore the monitor output for that run. It's just gone, as is my time and testing results.
I don't understand why I can't discriminate between what gets published to the live factory, and what I can stage for release to higher environments.
I currently manage all of this via PowerShell 7, but my problem is that it's still a manual process. I'm a SQL DBA with 30+ years of experience, but I'm not a developer, so my coding options are rather limited. I can publish a single object or an entire feature through each environment and never have to deploy the entire data factory each time. Even with the link in the initial reply, it still deploys the entire data factory.
We need a CI/CD process that can manage individual objects or sets of objects, and that can quickly deploy them to any number of environments after those changes have been successfully published to the initial ADF environment, whatever that might be in your company.
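For what it's worth, the dependency concern the docs raise (triggers depend on pipelines, pipelines on datasets) doesn't rule out selective deployment: a tool could compute the transitive dependency closure of whatever objects you pick. A minimal Python sketch, with made-up object names:

```python
# Minimal sketch: given a dependency map (trigger -> pipeline -> dataset ...),
# deploying one object safely means deploying its transitive dependencies too.
# Object names are invented for illustration.

def deployment_set(roots, depends_on):
    """Return the set of objects needed to deploy `roots` safely."""
    needed, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in needed:
            needed.add(obj)
            stack.extend(depends_on.get(obj, []))
    return needed

deps = {
    "TR_Nightly": ["PL_Load"],
    "PL_Load": ["DS_Sales", "LS_Sql"],
    "DS_Sales": ["LS_Sql"],
}
print(sorted(deployment_set(["TR_Nightly"], deps)))
# ['DS_Sales', 'LS_Sql', 'PL_Load', 'TR_Nightly']
```

Picking "TR_Nightly" alone would pull in its pipeline, dataset, and linked service automatically, rather than forcing a whole-factory publish.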
Hi,
No, I only use Git on my dev factory. I'm familiar with the need to promote to test and prod using ARM template deployment. However, that doesn't stop it being clunky as hell and fiddly. Disabling triggers, checking against the ARM template for deprecated pipelines, etc. is just not a good experience. I've even taken to bastardising the Key Vault for environment variables. This product does some great things, but in the 'little' things like environments it has so, so far yet to go.
IMO the quickest win on this is to permit publishing to ADF from more than just the compare branch. You could then have a 'test' branch that publishes to your ADF Test (and you should be able to designate the 'publish' branch per ADF). That, with environment variables, would solve a lot of grief.
Also, does anyone want to explain to me what the hell the point is in adf_publish if it's not supposed to go out of sync with master anyway?
Being able to publish from multiple branches might help, but I'd still have multiple developers who need to publish their feature branches to the live factory, without fear that every feature is now going to be promoted to the next environment.
Hi all,
This is Dan from the product group. Just wanted to say that I am monitoring this discussion and really appreciate all of your points. Please continue making suggestions; we are brainstorming ways to make things better!
Thanks, Dan
Purely brainstorming: what would people's thoughts be on something like the ability to exclude resources from getting published (pardon the MS Paint)? This is just for selectively deciding what to promote from the collaboration branch into the live factory.
Hi Dan,
Thanks for the suggestion, but an option to exclude things from publishing to the live factory kind of misses the point. We already do that by requiring pull requests into master; if I want to prevent something from being published, I just don't approve the PR. I need to be able to publish whatever I want to the live factory, but selectively choose what goes into the ARM template that CI/CD will use to promote to the next environment. In other words, I need to be able to selectively craft the ARM template for each environment.
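A rough sketch of that idea: filter an exported factory template down to a per-environment resource set before release. The template shape below is heavily simplified (real ADF ARM templates carry many more fields), and the pipeline names are made up:

```python
# Sketch of "selectively crafting the ARM template per environment":
# strip excluded resources from an exported factory template before release.
# Template structure is simplified; names are invented for illustration.

def filter_template(template, excluded_names):
    """Return a copy of the template without the excluded resources."""
    kept = [r for r in template["resources"] if r["name"] not in excluded_names]
    return {**template, "resources": kept}

template = {
    "resources": [
        {"name": "PL_Load", "type": "Microsoft.DataFactory/factories/pipelines"},
        {"name": "PL_Experimental", "type": "Microsoft.DataFactory/factories/pipelines"},
    ],
}
# Hold 'PL_Experimental' back from the test environment's template.
test_env = filter_template(template, {"PL_Experimental"})
print([r["name"] for r in test_env["resources"]])  # ['PL_Load']
```

A real implementation would also have to prune parameters and dependencies tied to the excluded resources.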
Thanks, Mike
(Dan's mockup image: https://user-images.githubusercontent.com/31044028/80160265-0b4c1680-8582-11ea-9c70-ca859592ca0e.png)
Hey Dan. I appreciate you following this thread.
I agree with @mtvessel. An "exclude" property shouldn't be present in ADF itself; it should be an option for deployment. Please take a look at how this works for database projects (DACPAC). When you compile a database project with SSDT in Visual Studio, the output is a DACPAC file. This is your build image. Then, with the `sqlpackage.exe` application, you can deploy to a selected target SQL server: it compares the DACPAC (image) to the target server and generates a differential script. `sqlpackage` accepts plenty of parameters, including options for what should be excluded or ignored when generating the script. The result is a T-SQL script to be executed on the target server. The analogue here would be an ARM template. And this is actually what I'm planning to implement in azure.datafactory.tools.
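To make the analogy concrete, here is a minimal Python sketch of the kind of state comparison sqlpackage performs, transposed to factory resources. The resource names and shapes are invented for illustration, and this is not a description of how azure.datafactory.tools is actually implemented:

```python
# Sketch of a DACPAC-style diff: compare a build image (what Git says
# should exist) against the target factory's current state and emit
# create/update/delete actions instead of redeploying everything.

def diff(image, target):
    """image/target: dicts mapping resource name -> definition."""
    actions = []
    for name, definition in image.items():
        if name not in target:
            actions.append(("create", name))
        elif target[name] != definition:
            actions.append(("update", name))
    for name in target:
        if name not in image:
            actions.append(("delete", name))
    return sorted(actions)

image = {"PL_Load": {"v": 2}, "DS_Sales": {"v": 1}}
target = {"PL_Load": {"v": 1}, "LS_Old": {"v": 1}}
print(diff(image, target))
# [('create', 'DS_Sales'), ('delete', 'LS_Old'), ('update', 'PL_Load')]
```

Exclusion options would then just be filters applied to the action list before execution, mirroring sqlpackage's exclude/ignore parameters.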
Thanks for responding so quickly! The need is clear, and hopefully we can think of some ways for this to be well integrated within the tool itself.
Regarding the initial post:
Please keep the discussion going here! As the specific doc issues regarding outdated PowerShell instructions have been fixed, would it be alright if I close this issue?
Thanks, Daniel
Hi Dan,
Let me start off by saying that I'm glad someone from Microsoft directly responded to the post. I'd like to ask for clarification around some of your responses, but in general I'm OK with the issue itself being closed.
• I'm certainly glad to hear that you are "thinking" of ways to improve, but it's obvious to me that your customers approach this in varying ways, with all the complexity that brings. I think a more open dialog needs to occur between Microsoft and their customers on the needs behind CI/CD deployments, rather than Microsoft just designing another solution that doesn't solve the problems we face.
• I'd like to understand which cmdlets you're referring to around the pre/post deployment. I'm specifically talking about the Azure Data Factory module and the cmdlets included, i.e. xxx-AzDataFactoryV2Pipeline, Dataset, Trigger.
• I'm sure your customers will continue to do so.
• Determine a way that changes to ADF are more broadly communicated, especially around major feature releases like Data Flow, so that your customers can proactively validate existing processes before they stop working in production environments.

As I said at the beginning of the original post, I was asked to post this by my Microsoft account rep, and I continue to work directly with them on a resolution to this issue.
Thanks again for your continued responses,
Rick.
Hey Rick,
First off, sorry for the initial delay in responding; it shouldn't have taken this long to get this strong a dialog going between everyone on this thread.
Regarding "I'd like to understand what cmdlets you're referring to around the pre/post deployment. I'm specifically talking about the AzureDataFactory module and the cmdlets included, i.e. xxx-AzDataFactoryV2Pipeline, Dataset, trigger": I was referring to the script shared on our CI/CD doc page that we recommend users run before and after deployments. Our PowerShell module should be completely up to date. Data Flows were not added until they were generally available in November, I believe.
In terms of updates, we actively post new releases and upcoming features on the following forums:
As I said earlier, keep the discussion going! I will go ahead and close the issue, as there are no outstanding doc items.
One more thing on upcoming features: as ADF (and now Data Flows as well) are GA, you can expect no breaking changes to existing pipelines without months and months of warning.
Hi Dan,
That's not what I was referring to when discussing the issues with PowerShell between the product teams, but that's OK. I'm sure that needed the update as well.
We have experienced the same problem.
What we have done to fix this is to stop merging new feature work into the master branch. All new features are merged into a "Release" topic branch, which has the same branch policies as master.
For CI/CD, the Azure Release Pipeline in DevOps points to that release branch for deployment.
Once the release has gone into production, the release branch is merged back into the master branch.
OK, but that would suggest you have an ADF that is Git-enabled but never gets a publish, right? Are you deploying ARM templates over it, or does it never match a branch? And how are you handling the parameter files that are generated with the publish? Are you crafting those manually?
Let me start off by saying that I was asked by my Microsoft rep to write this up and post it here in the hope that it might benefit others using ADF. My biggest complaint is Microsoft's stance on how deployments are handled. There are a lot of organizations, including mine, that are in no position to have every object migrated to each data factory for a small number of changes.

I have a DevOps repository attached to development. I used the publish process to migrate from Git to the data factory, and then used the process of exporting and importing the ARM template to migrate to our newly configured test environment. The process ignored 10-15 objects out of approximately 300 without any indication of errors. If it weren't for the fact that I have everything broken down into specific folders, I would have had a difficult time determining which objects failed to migrate. I located the missing objects and deployed them via PowerShell.

The statement below, under unsupported features, leads you to believe that cherry-picking is a difficult process that can't be managed, which is entirely false:

• "By design, ADF does not allow cherry-picking commits or selective publishing of resources. Publishes will include all changes made in the data factory.
  o Data factory entities depend on each other; for instance, triggers depend on pipelines, pipelines depend on datasets and other pipelines, etc. Selective publishing of a subset of resources may lead to unexpected behaviors and errors."

I manage every deployment I do from test to stage to production using PowerShell scripts to create datasets, pipelines, and triggers, all associated together. This process works great, except that the PowerShell commands and the JSON files generated by Git aren't in some cases what ADF is expecting to be passed. For example, there are cases where the JSON file in Git lists the property under AzureSqlTable as schema instead of structure. ADF imports via PowerShell fail because the schema and table are separated into different fields, or, when using queries to generate the source data, because of the absence of the tableName property. Once you know how to get around these and other issues, it's quite simple to manage deploying directly from PowerShell. Even their own automated CI/CD process has manual steps that must be executed via PowerShell, which shows the process is incomplete and inadequate. I'll leave it with the following issues that I believe must be corrected or enhanced for Azure Data Factory to rise to the level expected by your customers.
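The JSON fix-ups described above could be automated as a normalization pass run before feeding Git-exported dataset files to the PowerShell cmdlets. This is a hedged sketch: the two rules below are taken from the post itself (schema vs. structure, and a missing tableName), and the exact property rules likely vary by dataset type:

```python
# Sketch of the JSON fix-ups the post describes, applied before handing a
# Git-exported dataset file to the ADF PowerShell cmdlets. The property
# rules here come from the comment above and may vary by dataset type.

def normalize_dataset(props):
    fixed = dict(props)
    # 1. Some exports use "schema" where the cmdlet expects "structure".
    if "schema" in fixed and "structure" not in fixed:
        fixed["structure"] = fixed.pop("schema")
    # 2. Rebuild "tableName" when schema and table were split apart.
    tp = dict(fixed.get("typeProperties", {}))
    if "tableName" not in tp and {"schema", "table"} <= tp.keys():
        tp["tableName"] = f'[{tp.pop("schema")}].[{tp.pop("table")}]'
        fixed["typeProperties"] = tp
    return fixed

props = {"schema": [{"name": "Id"}],
         "typeProperties": {"schema": "dbo", "table": "Sales"}}
print(normalize_dataset(props)["typeProperties"]["tableName"])  # [dbo].[Sales]
```

In practice this would run over every dataset JSON in the repository before the `Set-AzDataFactoryV2Dataset` calls, so the deployment scripts never see the shapes ADF rejects.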