microsoft / coe-starter-kit


[CoE Starter Kit - QUESTION] Can the CoE delete connections and connection references in our environments? #4183

Closed JMG74 closed 2 years ago

JMG74 commented 2 years ago

What is your question?

Hello. Between Sunday, November 13th, 9am (Geneva/Switzerland time) and Monday, November 14th, 7am, all connections and all connection references using the Power Platform Oracle connector disappeared from every environment that used them (6 environments: 2 DEV, 2 UAT, 2 PRD)!

We have already recreated all the deleted connections and connection references to fix the resulting problems, and now we have a little more time to try to understand how this could have happened.

At the moment, we are investigating the "Center of Excellence" (version 2022-SEPT), which is installed on our tenant in a dedicated environment. Our hypothesis is that it could be involved in this "big cleanup" across all environments of our tenant. Do you think this is possible? Does the CoE keep logs in which we could find traces of these deletions to confirm or rule out our hypothesis? Could you give us some clues, based on other feedback you collect on the CoE, that could help us identify the true cause of our problem?

Thanks in advance for your help, best regards,

What solution are you experiencing the issue with?

Core

What solution version are you using?

September 2022

What app or flow are you having the issue with?

I don't know

JMG74 commented 2 years ago

Hi Jenefer, any info from your side about my question?

Jenefer-Monroe commented 2 years ago

Hello. Sorry for the delay, we had some holidays. I'm afraid I'm not following you here. I'm not sure what big cleanup you are referencing. Can you provide a little more detail on the timeline of what happened in your tenant?

JMG74 commented 2 years ago

As mentioned in this ticket, between Sunday, November 13th, 9am (Geneva/Switzerland time) and Monday, November 14th, 7am, all connections (and connection references) using the Power Platform Oracle connector disappeared from all our environments. As we don't know the reason for those deletions, we are investigating to understand how it happened, and this is why I opened this ticket: I'm wondering whether the CoE could be involved in those deletions. First of all, could you tell me if the CoE is able to delete connections at some point in its cleanup processes? If yes, what are the conditions that allow such deletions? In the CoE, are there logs where I could possibly find traces? In fact, I'm looking for any info or clues that could put us on the right track. Thanks in advance for your help and all the best.

manuelap-msft commented 2 years ago

The Admin | Broken Connection Cleanup flow in the Governance components solution is the only flow in the CoE kit that deletes connections - it's targeted to delete unused and inactive connections that have an "Error" status. You can check if this flow is running and look at the run history of this flow to see if it deleted any of the connections you mentioned.

JMG74 commented 2 years ago

Hi Manuela, thanks for your feedback.

I have checked and discovered that the flow "Admin | Broken Connection Cleanup" did indeed run on Sunday, November 13th and deleted our connections to Oracle. Here is an example: [image]

After more investigation, I discovered that the Oracle user used by the Oracle connector had been locked (another process outside the Power Platform tried to use the same user with the old password and, as a consequence, the Oracle user was locked). Because the Oracle user was locked, the CoE detected the connections as broken. They were not really broken, since the user was unlocked on the Oracle side the day after, but it was too late: the CoE had already cleaned up all our connections to Oracle in all our environments. Oops!

To avoid such a situation, would it be possible to postpone the deletion of broken connections by a given number of days (defined in an environment variable, for example 7 days) to let us detect and fix the connection problems?

Looking forward to your feedback and best regards.
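A minimal sketch of the kind of grace period meant here, assuming the cleanup process could record for itself when it first saw a connection in an error state. The function name, the `first_seen_broken` store, and `GRACE_DAYS` are all hypothetical; nothing here is taken from the CoE flow.

```python
from datetime import datetime, timedelta

GRACE_DAYS = 7  # hypothetical setting, standing in for an environment variable


def due_for_deletion(conn_id, status, first_seen_broken, now, grace_days=GRACE_DAYS):
    """Return True only once a connection has stayed in Error for grace_days.

    first_seen_broken is a dict (connection id -> datetime) that is updated in place.
    """
    if status != "Error":
        # The connection recovered, so forget when it was first seen broken.
        first_seen_broken.pop(conn_id, None)
        return False
    # Remember the first time this connection was seen in Error.
    first_seen = first_seen_broken.setdefault(conn_id, now)
    return now - first_seen >= timedelta(days=grace_days)


seen = {}
day0 = datetime(2022, 11, 13, 9, 0)
print(due_for_deletion("oracle-prd", "Error", seen, day0))                          # first detection
print(due_for_deletion("oracle-prd", "Connected", seen, day0 + timedelta(days=1)))  # fixed next day
```

With this approach, a connection that is repaired within the grace period is never deleted, because its clock is reset as soon as it leaves the error state.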

manuelap-msft commented 2 years ago

Hello,

thanks for the feedback!

The current logic is this: the flow runs weekly and deletes connection references that are errored out and were last modified at least 30 days ago (configurable).

So in your case, the connection had likely existed and been "working fine" (so no need to modify it) for a long time, then broke, and then the flow picked it up for deletion. Unfortunately, the connector doesn't tell us when the connection broke, only when it was last modified (e.g. newly authorized). We're limited to what the platform, that is, the Power Apps for admins connector that we're using for this feature, can provide to us.
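To illustrate the condition described above, here is a rough sketch in Python. It is not the flow's actual implementation; the `Connection` record, its field names, and the sample dates are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Connection:
    # Hypothetical record; field names are illustrative, not the connector's schema.
    name: str
    status: str              # e.g. "Connected" or "Error"
    last_modified: datetime  # last time the connection was authorized or edited


def is_cleanup_candidate(conn, now, min_age_days=30):
    """True if the connection is in Error and was last modified at least min_age_days ago."""
    return conn.status == "Error" and now - conn.last_modified >= timedelta(days=min_age_days)


now = datetime(2022, 11, 13, 9, 0)
# Authorized months ago, healthy until it broke just before the weekly run (assumed dates).
oracle = Connection("oracle-prd", "Error", datetime(2022, 3, 1))
print(is_cleanup_candidate(oracle, now))  # True
```

Because the check only looks at `last_modified`, a connection that broke an hour before the run is treated the same as one that has been broken for months.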

We are actually removing this flow from the December release, as it's caused a few unintended deletions of connections like yours and was just too risky to keep. We'll think more about how we can introduce this feature again in the future, likely with approval flows or more alerts.

Apologies for the inconvenience caused to your tenant!

JMG74 commented 2 years ago

Thanks for your feedback. I've turned off the flow "Admin | Broken Connection Cleanup" and I'm looking forward to the December release.