google / site-kit-wp

Site Kit is a one-stop solution for WordPress users to use everything Google has to offer to make them successful on the web.
https://sitekit.withgoogle.com
Apache License 2.0
1.23k stars 287 forks source link

Schedule a cron task to sync `adSenseLinked` status #8049

Closed tofumatt closed 6 months ago

tofumatt commented 9 months ago

Feature Description

Originally the design doc called for creating a hook that manages and syncs the user's AdSense/Analytics linked accounts in the background periodically, based on the adSenseLinkedLastSyncedAt setting added in #8047. See: https://docs.google.com/document/d/1-JenwPTAkw0eTkmESzjTOfgphs9qpRYH0h3xk_s3a4o/edit#heading=h.77mzdzgqzjn

But after a discussion about this approach, we realised it would be better to entirely handle this synchronisation process on the server-side, using WP Cron.

The functionality here should be similar to the original hook proposed in the design doc: every X minutes the AdSense link status should be refreshed by communicating with the Google API, and then updated if needed in Analytics 4 settings. Whenever a successful sync takes place, the adSenseLinkedLastSyncedAt value should be updated to the current time.


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

Implementation Brief

Test Coverage

QA Brief (ENG QA)

Please take note, this ticket requires ENG based QA

Changelog entry

eugene-manuilov commented 9 months ago

@zutigrm

  • Add useMount hook and ...

Why not to use useEffect? I don't see why we need to use useMount.

  • Use isModuleConnected selector on CORE_MODULES store to check if analytics-4 and adsense modules are connected. Assign value to the new variable for example modulesConnected

Return early if this is not true, right? Let's make it clear in IB.

  • Add useInterval from react-use

Why not just setInterval? Is there any benefit in using userInterval here?

zutigrm commented 9 months ago

@eugene-manuilov

Why not to use useEffect? I don't see why we need to use useMount.

It is more straightforward and has clearer intent, for handling simple one time operation. We don't need need to check for dependencies

Why not just setInterval? Is there any benefit in using userInterval here?

useInterval is more automated and straightforward, as it handles clearing of interval as well, instead of managing it manually, and clearing it in useEffect hook, it just adds unnecessary complexity. I didn't see reason/benefit of suggesting it, vs simpler/cleaner approach of using useMount and useInterval.

Return early if this is not true, right? Let's make it clear in IB.

Thanks, I updated that part.

eugene-manuilov commented 8 months ago

@zutigrm

  • Add includes/Modules/Analytics_4/Synchronize_AdSenseLinked.php class

Where and how we should instantiate and register this class?

  • Include AdSense instance to the dependencies and assign it to appropriate class property

This one is more complicated than in the referenced example because the Analytics_4 module doesn't have a reference to the AnSense module instance. We need to figure out how to make it work without instantiating a new instance of the AdSense module in the Analytics_4 module.

zutigrm commented 8 months ago

@eugene-manuilov Oh I seem to haven't add it. Idea is to instantiate it in Plugin class, and pass modules using $modules->get_module('analytics-4') and $modules->get_module('adsense'). What do you think? Since that is the entry point where we have all modules instances

eugene-manuilov commented 8 months ago

Idea is to instantiate it in Plugin class, and pass modules using $modules->get_module('analytics-4') and $modules->get_module('adsense'). What do you think?

@zutigrm, the Synchronize_AdSenseLinked is part of the Analytics_4 module and must be initialized and registered in the Analytics_4 class. Passing an instance of the AdSense module into the Analytics_4 module doesn't look correct to me because it adds a hard dependency. Unfortunately, there is no beautiful way to implement it at the moment and we will need to figure out how to make it work and not add tough coupling.

Adding this functionality to the Modules also doesn't seem to be the right approach because there are already a lot of things that shouldn't be there. But there is the Modules_Registry class that we can probably use to solve the access from one module to another issue. We can probably move functionality that is responsible for instantiating modules from the Modules class to the registry and make it to be a dependency for the Modules class. Then we can pass module registry into modules classes as a dependency so modules can access other modules without bringing tough coupling.

What do you think? I would also like to hear thoughts from @techanvil and @aaemnnosttv.

techanvil commented 8 months ago

@eugene-manuilov, refactoring Modules_Registry and passing it into the individual modules as a way to allow inter-module access is certainly worth considering.

It's worth noting we already have the googlesitekit_init action which runs once all modules have been setup.

https://github.com/google/site-kit-wp/blob/9dd9a04b7381fb3e6b4b1209e3263667b67e68de/includes/Plugin.php#L226-L231

Maybe we could pass the Modules or a refactored Modules_Registry instance into the action hook (or maybe store them in Plugin and pass the plugin instance instead) and use this as a mechanism for inter-module access.

Or we could keep everything at the hook level - note how we do have this googlesitekit_analytics_adsense_linked filter as way of signalling AdSense module availability to Analytics (although this is now legacy code).

https://github.com/google/site-kit-wp/blob/9dd9a04b7381fb3e6b4b1209e3263667b67e68de/includes/Modules/AdSense.php#L138-L143

https://github.com/google/site-kit-wp/blob/9dd9a04b7381fb3e6b4b1209e3263667b67e68de/includes/Modules/Analytics.php#L106-L111

I think a more deliberate mechanism for allowing inter-module access would be a bit of an improvement over this approach using hooks, which can be a little hard to follow. Personally I would lean toward passing modules/the plugin instance to googlesitekit_init as that sounds like it would generally be quite useful, without allowing such freeform inter-module calls that allowing modules to directly call each other at any point in the lifecycle would enable, which could inadvertently encourage somewhat tangled code over time.

zutigrm commented 8 months ago

@techanvil @eugene-manuilov Thanks for the input.

It seems pretty useful to apply the approach @techanvil suggested by adding param to the googlesitekit_init, by passing Modules instance (it seems more efficient then Plugin instance, which would need additional methods to expose things we might need, vs just passing instances as parameters to the action, like $modules which is already available)

aaemnnosttv commented 8 months ago

I'd prefer not to change the registry or our init hook here. These things could be worth doing but I don't think this situation really requires it.

As far as I can tell, we only want to be able to check the connected status of the AdSense module, and its clientID to check against the list of adsense links for the connected GA4 property.

We have a few existing cases of using filters for exposing specific data from the Modules class. See https://github.com/google/site-kit-wp/blob/b1fc4ee268ee5ebd7168080baabacb115ba33ffb/includes/Core/Modules/Modules.php#L304-L320

We could do the same for checking the connected state as well, which is also better than checking is_connected on the module instance itself since this method doesn't consider whether or not the module is even active.

As for reading the setting from another module, there are cases where we have done this by instantiating a settings instance using the module's own Options instance. This isn't super clean either but is something we've done between GA4 and GA.

Another option that comes to mind we could use would be an internal REST request since the AdSense module already exposes the setting values this way (example). I would probably opt for the previous and instantiate an AdSense\Settings instead since we're just getting a single sub-setting.

Thinking a bit more about how we might give modules access to other instances a bit more cleanly, we could introduce a new interface that would define a method for receiving an instance of another module. E.g.

interface Module_With_Module_AdSense {
    public function set_module_adsense( AdSense $adsense );
}
trait Module_With_Module_AdSense_Trait {
    protected $module_adsense;
    public function set_module_adsense( AdSense $adsense ) {
        $this->module_adsense = $adsense;
    }
}

Then Modules could be updated to provide/inject the instances to module instances that implement the respective interface when it bootstraps (similar to how it handles other Module_With_* contracts).


It's worth noting that these problems wouldn't exist with the original client-driven approach. I'm not saying we need to go that route, but it is worth potentially reconsidering. The other benefit of the client approach is that it is more on-demand and wouldn't invoke additional requests for the sync except when relevant (on the dashboard). I see we're only using the cron as single events, likely triggered by admin_init like we did with Synchronize_Property so it wouldn't be running all the time if the site/SK weren't being used which is my main concern with this.

zutigrm commented 8 months ago

Thanks @aaemnnosttv for your thoughts. It does seem that most efficient is as you noted - to go back to original client approach, instead of introducing complexity in the infrastructure when it can be more straightforward. If we are insisting on backend solution, filter approach that you included could be the least "expensive" option.

Should we go back to the original client approach, what are your thoughts? @techanvil @eugene-manuilov

techanvil commented 8 months ago

Hmm... it really could go either way, couldn't it.

It's worth noting that we originally switched from the client-side approach as a result of a Slack conversation where we identified that we would ideally want some additional client-side changes to avoid having to call the module-specific hook in a common component like DashboardMainApp.

So the implication is that we'll end up making some additional changes either on the server or client side to support this, although in both cases the changes should be generally useful as time goes on.

I guess thinking about the UX, the client-side approach could have the benefit of ensuring the change in adSenseLinked is reflected for the current page load rather than needing to wait for the next, depending on how we subsequently use the setting, although it's not a given.

If we stick with the server-side approach, though - I think we can simplify things further, we shouldn't need the adSenseLinkedLastSyncedAt setting and can just use similar logic to Synchronize_Property to schedule the sync at the desired minimum interval:

https://github.com/google/site-kit-wp/blob/a39ca71b6bf69e3f3b07646bc016fec54c8909c7/includes/Modules/Analytics_4/Synchronize_Property.php#L101-L109

On balance, I would stick with the server-side approach - it's the sort of thing WP Cron is intended for, we are already using it in a similar manner as seen in Synchronize_Property, and we should be able to remove an unneeded settings field. I'd also go with the approach that Evan suggested, adding an additional googlesitekit_is_module_connected filter and instantiating the required settings instance as needed.

That said, I have no objection to taking the client-side route, and as mentioned it could have some UX benefits. We should just remember to pick up the points raised in that Slack conversation if we do so.

aaemnnosttv commented 8 months ago

Thanks for your thoughts @techanvil

It's worth noting that we originally switched from the client-side approach as a result of a Slack conversation where we identified that we would ideally want some additional client-side changes to avoid having to call the module-specific hook in a common component like DashboardMainApp.

Regardless of which route we go, this is something we need to decouple anyways as we shouldn't have module-specific code baked into core infrastructure like core stores or components. I have it on my list to open an issue to address this very soon.

I'll let you guys decide which way to go on this, I just wanted to weigh in with those thoughts and avoid making a more complicated change than we need here 👍

eugene-manuilov commented 8 months ago

Ok, since we don't have a consensus here, I think we should go back to the original approach and implement it on the client side even though it is not perfect. @zutigrm, could you please update IB and file a new issue so that we can return back to it later and improve?

zutigrm commented 8 months ago

@eugene-manuilov Deal, will do. Thanks!

zutigrm commented 8 months ago

Hi @tofumatt I am trying to locate the issue that adds this action syncAdSenseLinkedState described in design doc, which will fetch the AdSense linked accounts from GET:adsense-links datapoint, but I am not able to find it. Can you point me to it? We have this one #8046, but it only creates a REST endpoint.

Thanks

tofumatt commented 8 months ago

From the discussion here, the consensus actually seems to be that we should go with the server-side approach (see @techanvil's suggestion "On balance, I would stick with the server-side approach"). While there are a few small, potential benefits for the client-side approach regarding refreshing the data, even that isn't really an issue for this case.

CleanShot 2024-02-05 at 16 32 18

@techanvil's Synchronize_Property example seems to be a straightforward way to implement this.

I'm reverting the ACs to the server-side approach; I don't think the client needs to be involved at all in this process, and any ease in its implementation isn't worth the added complexity in approach/understanding. We need the server code regardless that will sync things, but we don't actually need the client to be part of this syncing.

tofumatt commented 8 months ago

Hi @tofumatt I am trying to locate the issue that adds this action syncAdSenseLinkedState described in design doc, which will fetch the AdSense linked accounts from GET:adsense-links datapoint, but I am not able to find it. Can you point me to it? We have this one #8046, but it only creates a REST endpoint.

Thanks

I think that will need to be added as part of this issue, so I added it to the ACs to clarify. The code the REST route in #8046 gets the linked accounts, but to compare them we should add a new function to Analytics 4 module that ensures the current AdSense account ID and the linked account ID match.

techanvil commented 8 months ago

Hi @zutigrm, thanks for drafting this IB. It covers the AC, but I think the approach is more complicated that it needs to be.

As mentioned in this comment above, if we are using WP Cron, we don't need to use the adSenseLinkedLastSyncedAt setting for scheduling. We also don't need to make a check every 12 hours for something we're going to trigger every 24 hours. We simply need to schedule a cron event every 24 hours, following similar logic to Synchronize_Property, passing time() + DAY_IN_SECONDS for the timestamp to wp_schedule_single_event().

I realise we do have it specced in https://github.com/google/site-kit-wp/issues/8051 to display the adSenseLinkedLastSyncedAt value in Site Health, so if we want to keep it for that purpose, we should simply update adSenseLinkedLastSyncedAt in response to a successful sync and show it in Site Health, and that should be the extent of its use.

We could conceivably do away with it entirely, and instead show the timestamp of the next scheduled sync in Site Health, which is what wp_next_scheduled() returns.

@tofumatt, what do you think - is the Site Health requirement something we can reconsider or do we definitely need adSenseLinkedLastSyncedAt to display there? Or have I missed some other reason we do need it?

tofumatt commented 8 months ago

@techanvil Hah, you're right, it's not needed anymore for scheduling anymore. That's a good point. 👍🏻

I think it's worth keeping the adSenseLinkedLastSyncedAt setting though, to log when the status was last synced. This can help users understand why something they recently linked in Analytics/AdSense on the Google side still isn't showing in Site Kit (eg. if they synced 12 hours ago but only linked their accounts 2 hours ago). So it's worth keeping, but agreed, it doesn't need to be there for any reason other than letting users know when the last sync was.

tofumatt commented 8 months ago

(I'm going to update the ACs here actually.)

techanvil commented 8 months ago

Sounds good, thanks @tofumatt!

zutigrm commented 8 months ago

@techanvil @tofumatt Thank you both. @techanvil I updated the IB, based on the Matt's reply, I kept adSenseLinkedLastSyncedAt, but it is only updated after API data is retrieved, not used in any checks any more. And cron schedule is changed to 24h

techanvil commented 8 months ago

Thanks @zutigrm, that looks better. One more thing that I didn't notice previously, the following points are not quite correct.

  • Check if in returned array there is still the account matching the one from existing adSenseLinked setting. You can use adClientCode property from the linked accounts objects to check for a match.
    • If they are not matching, update the adSenseLinked setting with returned data. Otherwise set it to false

As per the AC, the check that should be made is for the AdSense account ID in settings matching one of the links. In fact, it should specifically be a check for the clientID setting matching one of the adClientCode values (as mentioned in the design doc).

Furthermore the adSenseLinked value should be a boolean, true if a match is found and false if not.

zutigrm commented 7 months ago

@techanvil Oh indeed, it seems I misunderstood that part. Updated

Thanks

techanvil commented 7 months ago

Top stuff, thanks @zutigrm. IB LGTM. :white_check_mark:

zutigrm commented 7 months ago

Unassigning myself as I am occupied with SAM issues, in case someone else can get to it before.

If it is not picked till I finish SAM issue, I will pick it again, due to the long history

cc: @bethanylang

10upsimon commented 7 months ago

@zutigrm I've picked this up, but will ping if I need any clarity. IB looks good though and I've read all the back and forth as well :)

zutigrm commented 7 months ago

QA:ENG Verified ✅

By the way this could have also be tested by regular QA using WP Crontrol plugin for checking/triggering crons, and pasting snippets to call api get/set methods for retrieving the settings and updating the clientID for AdSense module