lithnet / googleapps-managementagent

Google Workspace Management Agent for MIM 2016
MIT License

Strange deletes on import #78

Closed SirDester closed 10 months ago

SirDester commented 1 year ago

Hello @ryannewington, over the past few days we have had strange issues with the Google connector on our tenant, and I would like to ask what you think about this.

On the night of 10 August 2023, the Full Import job read only 400 of the 56726 groups on the Google tenant (successfully, without any error or timeout). On the MIM side the result was quite destructive: it generated 56326 deletes of CSObjects and their related Metaverse objects. The following imports during the day (we run 4 syncs per day) each read only part of the total (4000, then 1400, and finally 56726 again), generating a high number of deletes and adds in all the sync flows.

As you can see from the timings, when the query reads only a few items the execution time is very short (2 to 16 seconds in the logs below) instead of the normal 3-4 minutes for a full read, but there is no timeout or error that would stop the MIM agent from going on.

These are the logs in the MA-operations file:

8/9/2023 6:12:42 PM: Opening import connection. Page size 100
8/9/2023 6:12:45 PM: Import task started for object type 'group'
8/9/2023 6:12:45 PM: Background full import from Google started
8/9/2023 6:12:45 PM: Requesting group fields: groups(email,id,name,aliases), nextPageToken
8/9/2023 6:12:45 PM: Requesting group settings fields:
8/9/2023 6:12:45 PM: Requesting settings: False
8/9/2023 6:12:45 PM: Requesting members: False
8/9/2023 6:12:45 PM: Regex filter:
8/9/2023 6:15:58 PM: Import task completed successfully for object type 'group'. Duration 00:03:13.1564991
8/9/2023 6:15:58 PM: Closing import connection: Normal
8/9/2023 6:15:58 PM: Cleared delta file

8/9/2023 6:15:58 PM: Operation statistics
8/9/2023 6:15:58 PM: Import objects: 56726
8/9/2023 6:15:58 PM: Operation time: 00:03:13.6123423
8/9/2023 6:15:58 PM: Ops/sec: 292.987


8/10/2023 12:13:51 AM: Opening import connection. Page size 100
8/10/2023 12:13:54 AM: Import task started for object type 'group'
8/10/2023 12:13:54 AM: Background full import from Google started
8/10/2023 12:13:54 AM: Requesting group fields: groups(email,id,name,aliases), nextPageToken
8/10/2023 12:13:54 AM: Requesting group settings fields:
8/10/2023 12:13:54 AM: Requesting settings: False
8/10/2023 12:13:54 AM: Requesting members: False
8/10/2023 12:13:54 AM: Regex filter:
8/10/2023 12:13:56 AM: Import task completed successfully for object type 'group'. Duration 00:00:02.0137493
8/10/2023 12:13:57 AM: Closing import connection: Normal
8/10/2023 12:13:57 AM: Cleared delta file

8/10/2023 12:13:57 AM: Operation statistics
8/10/2023 12:13:57 AM: Import objects: 400
8/10/2023 12:13:57 AM: Operation time: 00:00:02.6839936
8/10/2023 12:13:57 AM: Ops/sec: 149.030


8/10/2023 6:22:22 AM: Opening import connection. Page size 100
8/10/2023 6:22:24 AM: Import task started for object type 'group'
8/10/2023 6:22:24 AM: Background full import from Google started
8/10/2023 6:22:24 AM: Requesting group fields: groups(email,id,name,aliases), nextPageToken
8/10/2023 6:22:24 AM: Requesting group settings fields:
8/10/2023 6:22:24 AM: Requesting settings: False
8/10/2023 6:22:24 AM: Requesting members: False
8/10/2023 6:22:24 AM: Regex filter:
8/10/2023 6:22:40 AM: Import task completed successfully for object type 'group'. Duration 00:00:16.2573441
8/10/2023 6:22:41 AM: Closing import connection: Normal
8/10/2023 6:22:41 AM: Cleared delta file

8/10/2023 6:22:41 AM: Operation statistics
8/10/2023 6:22:41 AM: Import objects: 4000
8/10/2023 6:22:41 AM: Operation time: 00:00:16.8763037
8/10/2023 6:22:41 AM: Ops/sec: 237.018


8/10/2023 7:51:19 PM: Opening import connection. Page size 100
8/10/2023 7:51:22 PM: Import task started for object type 'group'
8/10/2023 7:51:22 PM: Requesting group fields: groups(email,id,name,aliases), nextPageToken
8/10/2023 7:51:22 PM: Requesting group settings fields:
8/10/2023 7:51:22 PM: Requesting settings: False
8/10/2023 7:51:22 PM: Requesting members: False
8/10/2023 7:51:22 PM: Regex filter:
8/10/2023 7:51:22 PM: Background full import from Google started
8/10/2023 7:51:28 PM: Import task completed successfully for object type 'group'. Duration 00:00:06.0704152
8/10/2023 7:51:28 PM: Closing import connection: Normal
8/10/2023 7:51:28 PM: Cleared delta file

8/10/2023 7:51:28 PM: Operation statistics
8/10/2023 7:51:28 PM: Import objects: 1400
8/10/2023 7:51:28 PM: Operation time: 00:00:06.4163280
8/10/2023 7:51:28 PM: Ops/sec: 218.192


8/11/2023 12:45:45 AM: Opening import connection. Page size 100
8/11/2023 12:45:48 AM: Import task started for object type 'group'
8/11/2023 12:45:48 AM: Background full import from Google started
8/11/2023 12:45:48 AM: Requesting group fields: groups(email,id,name,aliases), nextPageToken
8/11/2023 12:45:48 AM: Requesting group settings fields:
8/11/2023 12:45:48 AM: Requesting settings: False
8/11/2023 12:45:48 AM: Requesting members: False
8/11/2023 12:45:48 AM: Regex filter:
8/11/2023 12:49:28 AM: Import task completed successfully for object type 'group'. Duration 00:03:40.2716159
8/11/2023 12:51:00 AM: Closing import connection: Normal
8/11/2023 12:51:00 AM: Cleared delta file

8/11/2023 12:51:00 AM: Operation statistics
8/11/2023 12:51:00 AM: Import objects: 56726
8/11/2023 12:51:00 AM: Operation time: 00:05:12.0595191
8/11/2023 12:51:00 AM: Ops/sec: 181.779

As far as I understand, the Lithnet agent is "blind" to deletes because there is no delta information, so it can only treat an object as deleted when it is no longer present in the connected source.
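
To illustrate why a truncated full import is so destructive, here is a minimal Python sketch of delete inference by absence. It is not the management agent's or MIM's actual code; `infer_deletes` and the `group-N` identifiers are hypothetical, and only the counts (56726 known groups, 400 returned) come from the logs above.

```python
def infer_deletes(known_ids: set[str], imported_ids: set[str]) -> set[str]:
    """Return the objects that would be treated as deleted.

    With no delta information, anything known locally but missing from
    the latest full import is assumed to have been removed at the source.
    """
    return known_ids - imported_ids


# Hypothetical identifiers; the counts mirror the import on 10 August.
known = {f"group-{i}" for i in range(56726)}
truncated_import = {f"group-{i}" for i in range(400)}

pending_deletes = infer_deletes(known, truncated_import)
print(len(pending_deletes))  # 56326
```

Because the import reported success, nothing distinguishes the 56326 missing groups from groups that were really deleted.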

Do you think this could be something that changed on the Google side? Have you ever seen such behaviour? It happened again on the night of 15 August 2023, and I'm now too afraid to re-enable the syncs.

Thanks for any help. Regards, Maurizio

ryannewington commented 1 year ago

Hey @SirDester

This looks very strange. It can only be due to the Google API returning incomplete results, but telling us it was successful.

The Google API is terrible. Things break randomly. It has to be one of the most inconsistent APIs I've ever worked with. We've seen things like this before that happen randomly and can't be reproduced, then just go back to normal.

My only suggestion would be to use something like Lithnet Autosync, which can detect when changed or deleted csobjects are over a certain threshold, and automatically stop the sync engine, preventing the damage from spreading into the metaverse and downstream. It will email you when this "safety fuse" trips so you can go investigate.

https://docs.lithnet.io/autosync-for-mim/configuration/execution-controller-scripts
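
The "safety fuse" idea can be sketched roughly as follows. This is only an illustration in Python, not Autosync's actual mechanism or configuration; the function name `should_halt_sync` and the 10% default threshold are assumptions made for the example.

```python
def should_halt_sync(
    connector_space_count: int,
    pending_deletes: int,
    max_delete_ratio: float = 0.10,  # assumed threshold, tune per environment
) -> bool:
    """Return True when pending deletes exceed the allowed share of objects.

    The check is meant to run after the import and before synchronization,
    so a truncated import cannot propagate deletes into the metaverse.
    """
    if connector_space_count <= 0:
        # Nothing to compare against, so the ratio is undefined.
        return False
    return pending_deletes / connector_space_count > max_delete_ratio


# Figures from the import on 10 August: 56326 of 56726 groups flagged as deleted.
print(should_halt_sync(56726, 56326))  # True
# A handful of deletes in a normal run stays under the threshold.
print(should_halt_sync(56726, 12))     # False
```

A ratio is used rather than an absolute count so the same setting works for connectors of very different sizes; an absolute limit would work equally well for a single known tenant.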

SirDester commented 1 year ago

Thanks a lot for the quick answer.

I suspected that the issue was on the Google API side, and I totally agree with you: it is really terrible. I'll give Autosync a try; the threshold barrier could be a good solution in these cases.

Thanks for the suggestion and for the feedback. Regards, Maurizio

stale[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.