google / site-kit-wp

Site Kit is a one-stop solution for WordPress users to use everything Google has to offer to make them successful on the web.
https://sitekit.withgoogle.com
Apache License 2.0
1.22k stars 277 forks source link

Register modules/analytics datastore #1224

Closed felixarntz closed 4 years ago

felixarntz commented 4 years ago

Feature Description

While we don't need to prioritize registering datastores for all the individual modules yet, we should prioritize doing this specifically for Analytics already, as a pre-requisite for #1101.

This is effectively the first issue which is about refactoring existing code to use our new APIs. Hence, there may be some quirks to be figured out here, to set a solid baseline for following work.

Some general suggestions:


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

The data store should hold two contexts: Gutenberg calls these "view" and "edit". Essentially:

Public APIs (selectors/actions) should always get/set to the "edit" store, and only the saveSettings() action should update the "view" store on success.

selectors

Settings selectors

Individual Settings selectors These selectors should always return the edit data (whatever changes have been made on the client but not neccessarily sent to the API). These should each call getSettings internally and return their data from the result of this selector.

State-ish selectors

All selectors below may be fulfilled by an API request so these will all have fetch*, receive*, and receive*Failed actions internally and an accompanying resolver:

@felixarntz's note: I would advise to modify the GET:tag-permission datapoint on the server to accept accountID and propertyID instead of just tag, since we already have these two values then, and it would allow to simplify and speed up that datapoint's functionality. We can keep support for tag for backward compatibility if we don't want to update existing code.

See: https://github.com/google/site-kit-wp/issues/1224#issuecomment-597947394 for more context on these selectors and their required side-effects/API calls.

actions

Actions that will be fulfilled by an API request will all dispatch internal/private fetch*, receive*, and receive*Failed actions internally.

Settings Actions

Individual Settings Actions

State-ish actions

Creation actions

Datapoints

Implementation Brief

QA Brief

Changelog entry

felixarntz commented 4 years ago

@aaemnnosttv @tofumatt Let's plan this and define the ACs together: Let's discuss the general suggestions I added in the description first. Once that is clear, let's focus on the acceptance criteria.

@aaemnnosttv Feel free to already add selectors, actions etc. for the datastore - that should be rather independent of the overall architectural approach.

tofumatt commented 4 years ago

Some thoughts on the description here 😄

The specific modules are not part of Site Kit core's public API, they only use the public API. While there will be a transition phase, they should use the same public API (or soon-to-be public API) functions that 3P plugins would be able to use.

All that sounds good. 👍Use "public APIs" (eg. import Foo from 'googlesitekit-foo';) whenever possible.) Internal imports should never be considered "public" APIs (if they were intended to be public, we'd export them), so this is a good setup.

  • To differentiate we should put the specific module code...
    • ...either in module-specific subdirectories of a new assets/js/googlesitekit/module
    • ...or in module-specific subdirectories of a new assets/js/modules/v2 (or jsr or whatever temporary directory name)
    • In other words, the code for this issue should be either in assets/js/googlesitekit/module/analytics or assets/js/modules/v2/analytics.

All of the code in the existing modules/ folder look like React components. Currently the Site Kit plugin puts components in various places:

Instead of this, let's follow the React app convention and put all new components in the assets/js/components/ directory. It makes it easy to find all components and in my experience sorting components based on any other criteria (eg connected/non-connected, etc.) just gets confusing, tough-to-manage, and requires a lot of moving around when components internal approach changes.

I understand the desire to differentiate code and as this is a temporary measure sounds fine when the code will live alongside old code. Let's go with "JSR"—having "v2" around as a folder name inside the codebase strikes me as a bit awkward. ("JSR" is awkward too because it's a bit "inside baseball", but at least it's unknown until explained, rather than easily misinterpreted until explained.)

  • Modules themselves should not expose any public API other than through their registered store(s).
  • There should be an assets/js/googlesitekit/module/index.js or assets/js/modules/v2/index.js file that loads the code for all modules (for this issue, only for Analytics).
  • That file should be imported by an assets/js/googlesitekit-module.js entry point that uses a googlesitekit-module handle in PHP. (Most likely, this asset needs to depend on all of our new API assets.)

Data stores should also be colocated. Regardless of them being public or private, we should keep all of our similar code in the same folders and have them maintain similar structure. This makes it very easy for us to maintain/find our code.

For public APIs, we already have assets/js/googlesite-*.js files, and these control what we export publicly. We may even want to clearly mark them by using a assets/js/public-apis/ folder or something. But if these modules have data stores, I think they should reside in the same assets/js/googlesitekit/datastore folder that other stores live in.

felixarntz commented 4 years ago

@tofumatt I mostly agree with all that you're saying, however let's keep the module-specific (module as in e.g. Analytics, Search Console, not technically JS module) pieces (components and store) separate. Modules are basically like extensions, and the modules that we ship are our "default bundled extensions". That should be reflected in how we organize our code as well. You could think about it as we're almost a 3P ourselves using our APIs when it comes to modules.

Let's go with the following directories:

The -jsr suffixes should be removed in the future once the refactoring is done (dreaming...).

tofumatt commented 4 years ago

Modules are basically like extensions, and the modules that we ship are our "default bundled extensions". That should be reflected in how we organize our code as well. You could think about it as we're almost a 3P ourselves using our APIs when it comes to modules.

Ah, that's a good point. Okay, with that in mind and the file structure you suggest that all sounds good.

assets/js/modules/{slug}/components-jsr for any new/refactored components specific to the respective module

In terms of setting an example for third-parties: I'd think that third-party modules would layout their code how they liked and end up with a webpack bundle same as us, so I'm not sure how useful this is to third-party developers, and again makes us think about where to store components rather than just having a single place. We can always easily trace back where a component lives from the module's code (assets/js/modules/{slug}/), so bundling "module-specific" components doesn't strike me as a useful delineator.

At the very least: let's just be okay with moving stuff out of here and into the main /components. A lot of our components in these modules might be useful to include elsewhere in the app.

I'd prefer not to store components outside the main components-jsr/ folder, because then it's tougher to see all components at a glance, and it requires a developer to think about where the component goes and how it fits into the plugin for every new component. We should encourage component creation to take advantage of React's very composable nature, adding any friction to the component creation process is a productivity drainer.

Main the main assets/js/modules/{slug}/index.js could be the "root" component, but that's about it to me. We can use the APIs like a third-party developer, but there's no reason for them to have to lay out their uncompiled code like we do.

The -jsr suffixes should be removed in the future once the refactoring is done (dreaming...).

😄

felixarntz commented 4 years ago

@tofumatt

At the very least: let's just be okay with moving stuff out of here and into the main /components. A lot of our components in these modules might be useful to include elsewhere in the app.

Totally agree with that. I think module-specific components should be in assets/js/modules/{slug}/components-jsr though. Most of the components currently within assets/js/modules/{slug} actually are entirely module-specific. There is definitely a ton of reusable things that we could split out from there, and anything that is reusable like that should go into assets/js/components-jsr, like you're saying. But a module-specific component that e.g. is composed of multiple generic reusable components (those would be in assets/js/components-jsr) with the sole purpose of rendering data from the Analytics datastore should be in the module-specific directory.

felixarntz commented 4 years ago

@aaemnnosttv Can you take over the ACs for this one regarding which selectors and actions should be added?

felixarntz commented 4 years ago

@tofumatt I discussed the proposed actions and selectors with @aaemnnosttv for a bit. I would like to avoid the context parameter/option throughout, as it is overly verbose and there is currently no need for us to have it. I understand the thinking behind it, but there is most likely nowhere in current Site Kit behavior where we would need to have the current (potentially client-side updated) and the last saved value available at the same time.

I do agree it makes sense to store the settings in two separate objects internally in the store (one for current, one for last saved), as that would allow us to know whether some settings still need to be saved (e.g. through a needsSave selector). That could be used e.g. to disable/enable a button to save.

However, for the public API, the actions and selectors shouldn't have the context parameter. If we ever find that there's a need to get the last saved value, we should add more verbose separate selectors specifically for that (e.g. getSavedAccountID). I'd also argue that seeing a getAccountID() and getSavedAccountID() would furthermore better clarify how these two differ than a single getAccountID( context ). Yet, let's only add the standard selectors and actions for latest values (regardless of whether saved or not) at this point, plus a selector to indicate whether the settings have changed from their saved state and therefore still need to be saved.

felixarntz commented 4 years ago

@aaemnnosttv @tofumatt Another thought: I see some of the actions you defined actually retrieve data rather than changing any (e.g. getAccountsPropertiesProfiles etc). These should probably be called fetch... instead, and we should have selectors for them.

More specifically: The getAccountsPropertiesProfiles() should be something internal, and it should rather be fetchAccountsPropertiesProfiles() (and at that point, it's also more obvious for it to be an action).

We should define our public selectors in more intuitive and smaller units. Specific examples:

tofumatt commented 4 years ago

Ignore this comment. Turns out I misunderstood how we'd be selecting data in the components intending to use this store. Read below my informed response 😅


@aaemnnosttv @tofumatt Another thought: I see some of the actions you defined actually retrieve data rather than changing any (e.g. getAccountsPropertiesProfiles etc). These should probably be called fetch... instead, and we should have selectors for them.

Agreed, but I'm trying to work out how that works. The issue with the resolvers is that they mark themselves as "fulfilled" after the first time a resolver completes. Gutenberg's code implies that there is a way to override this behaviour and manually:

The approach for the first option is tested here: https://github.com/WordPress/gutenberg/blob/527809b55a00ef9c02f4d709372438441e999764/packages/data/src/test/registry.js#L309

And the second is here: https://github.com/WordPress/gutenberg/blob/527809b55a00ef9c02f4d709372438441e999764/packages/data/src/test/registry.js#L450


Unfortunately, I'm not able to replicate this behaviour in our own unit tests with this approach, modifying getConnection()'s resolver as an example to have it fire on every call to the selector:

    getConnection: {
        *fulfill(...args) {
            console.warn('fulfill CALL', args);

            try {
                const connection = yield actions.fetchConnection();

                yield actions.receiveConnection( connection );

                yield actions.unresolveGetConnection();

                return;
            } catch ( err ) {
                // TODO: Implement an error handler store or some kind of centralized
                // place for error dispatch...
                return actions.receiveConnectionFailed();
            }
        },
        isFulfilled( ...args ) {
            console.log( 'isFufilled()', args );
            return false;
        },
        shouldInvalidate( action, ...args ) {
            console.error( 'shouldInvalidate', action, args );
            return action.type === UNRESOLVE_GET_CONNECTION;
        },
    },

shouldInvalidate is never called when expected. The whole thing seems a bit off...

I might be missing something, but the point is for selectors we always want to pull fresh data with we need a way to invalidate their resolvers. So originally I proposed using actions for things that should always have side effects and selectors for things that only have a side-effect the first time.

But I agree that it's a confusing API and is an implementation leak. It's not good, but I'm still trying to sort out how to get the resolvers to behave as expected.


There seem to be some issues with the resolver implementation or at least a lack of clarity in its intended usage (https://github.com/WordPress/gutenberg/issues/19132). From my grepping through Gutenberg's code I can't actually find a place where these pieces of code are actually used (https://github.com/WordPress/gutenberg/search?q=shouldInvalidate&unscoped_q=shouldInvalidate aside from https://github.com/WordPress/gutenberg/blob/4857ad58c1241b3d63d21a6880c989b85746c3dc/packages/core-data/src/index.js#L38 so it's difficult to find clear instructions on the intended usage.

I'll keep exploring, but just wanted to provide context on why actions were chosen for get requests we would normally use a selector for.

tofumatt commented 4 years ago

So! Ignore my above comment because I missed that the selectors we would need to dispatch again have arguments and thus will run on first call!

@felixarntz is right that:

should all be selectors. I totally missed that ones that would require re-fetches (running resolvers again) took arguments and thus our concerns about invalidating resolvers isn't a concern. Resolvers will resolve once per selector + argument combo, so a call to getPropertiesProfiles( { accountID: 1 } ) will fire a resolver, and then a subsequent call to getPropertiesProfiles( { accountID: 2 } ) will fire it again with the new arguments. I don't think we need to worry about invalidating anything manually here, so let's make those selectors + resolvers instead of actions + controls. Sorry I led you astray on that one, my bad.


In terms of the implementation detail leak on the selectors: I agree with that too. I think we can expose less of our API implementation/optimisation to the selectors. I'll have another go at this AC now, but just wanted to clarify why I went off on the resolver tangent! 😅

felixarntz commented 4 years ago

I think we should have the following:

The resolvers for each of the three selectors should call the respective fetch action.

Theoretically we could also need the following:

There's also some side effects that should happen in certain circumstances.

This is mainly due to how our "default setting matching" and the dropdowns should work:

In other words, profiles should behave a tiny bit different because it's very likely the first one is the only one (typically "All website data"), so we'd avoid unnecessary clicks. For accounts and properties, unless there's a matching or something has been saved already, we should not assume anything (i.e. those should default to "Select an account..." / "Select a property..." then).

tofumatt commented 4 years ago

I wanted to clarify one thing about my recommended approach to action/selector arguments. I mentioned that when we were using both a value to set and a context for where to set it, there was sense in having an argument for the data to set and an argument for the metadata around where it should be set. Given we're now always interacting with the "edit" context, I think it makes sense to nix the "options" argument for now and even to ditch the first argument where it doesn't add clarity. Here's what I mean:

In the case of setAnonymizeIP( { value } ) vs setAnonymizeIP( value ), the second is very direct, readable, and requires less boilerplate. data, value, etc. are very generic words and are best to avoid if possible. setAnonymizeIP( true ) is a more straightforward API to read/write/understand than `setAnonymizeIP( { value: true } ). There isn't a benefit to using objects there, I think. And with only one, basic argument we can extend it to an object later while maintaining backward-compat if the params grow.

Where it does make sense to use object args is where they add clarity to selectors/actions over positional arguments. setExistingTag( { propertyID, accountID = null } ) is an example of this:

setExistingTag( { propertyID: 4, accountID: 5 } ) is much more readable and clear than if we used positional arguments setExistingTag( 4, 5 ).

Even something with one argument can benefit from an object argument when it helps self-document the function call. getProperties() is a good example of that. getProperties( 5 ) is much less clear than getProperties( { accountID: 5 } ).

So I think it's best not to be rigid in our rules about using objects for the API on action creators. I know I mentioned wanting payload to always be consistent but I think that's a bit different, because it actually makes reducer code easier to reason about and also doesn't force redundant keys like value: for public APIs. I hope that makes sense and is okay 😄

felixarntz commented 4 years ago

@tofumatt Regarding your above comment: Mostly agreed, we should definitely not use objects for actions like setAnonymizeIP etc.

Regarding the object approach, I partly agree. I think there is a nuance you are not addressing above that we should think about before going with that, which is the requirement for certain parameters: In the case of e.g. getProfiles( accountID, propertyID ) both of these are required, which to me would be less obvious if it was getProfiles( { accountID, propertyID } ). You shouldn't be able to call getProfiles( { propertyID } ) for example. I would prefer if we relied on objects only for:

For something like getProperties( accountID ) or getProfiles( accountID, propertyID ), these parameters would always be required and likely these functions would never be changed. I see what you're saying regarding getProperties( 5 ), but to be fair the way this would typically look in our codebase would rather be getProperties( accountID ).

Objects to me make sense for multiple optional parameters or for functions that either have many required parameters or have a high chance of receiving more (optional) parameters in the future. Our public APIs should be concise, so we would want to avoid many parameters altogether, except potentially for config/options, for which I would then agree to use objects.

I do agree with you regarding action creators, as in payload should always be an object. That is internal though, so there it is perfectly fine. Our public selectors and actions should mainly have individual parameters though.

felixarntz commented 4 years ago

One thing missing from the ACs (regardless of the above):

felixarntz commented 4 years ago

Related: I started defining ACs for #1247 - let's continue general discussion here though, and I'll update the ACs for that one according to what we decide here.

aaemnnosttv commented 4 years ago

public selector needsSettingsSave() (please please please suggest a better name, I'm not sure :) ) --> would indicate whether current client-side settings are different from latest server-side settings

This is essentially dirty-checking so perhaps areSettingsDirty() would be an appropriate name here?

When we persist settings to the server, and update the saved settings in state, would we use individual setSaved*( value ) internal actions for that or a single setSavedSettings( value ) ?

should all be selectors. I totally missed that ones that would require re-fetches (running resolvers again) took arguments and thus our concerns about invalidating resolvers isn't a concern. Resolvers will resolve once per selector + argument combo, so a call to getPropertiesProfiles( { accountID: 1 } ) will fire a resolver, and then a subsequent call to getPropertiesProfiles( { accountID: 2 } ) will fire it again with the new arguments. I don't think we need to worry about invalidating anything manually here, so let's make those selectors + resolvers instead of actions + controls. Sorry I led you astray on that one, my bad.

@tofumatt there is one case where we would need to make the request again, even if no parameters had changed and that is when we refresh accounts. This will likely not be needed in the future once we introduce provisioning, but currently the create account option just directs users to the service page so they can create their account there. Then once they're done, they can come back and click the button to reload their accounts again (same parameters). We do this for both Analytics and Tag Manager for sure, but I think we also do it for AdSense. If this is a special-case where we need to manually dispatch the action, I think that's fine, but there are a few use cases where we need this.

felixarntz commented 4 years ago

@aaemnnosttv

This is essentially dirty-checking so perhaps areSettingsDirty() would be an appropriate name here?

Hmm, I would prefer something that relates more to the "save" action. Otherwise we could also name saveSettings something like cleanSettings as in makeSettingsNotDirty :) Maybe needsSaveSettings (weird grammar too, but indicates relation with saveSettings action)?

When we persist settings to the server, and update the saved settings in state, would we use individual setSaved*( value ) internal actions for that or a single setSavedSettings( value ) ?

Any internal actions related to API requests should be named fetch{publicAction} (for triggering the request) and receive{publicAction} (for receiving the response), that was something @tofumatt and I defined earlier. This keeps them clearly marked for a certain purpose. So these would be fetchSaveSettings and receiveSaveSettings. So I think that should be a single one for all settings at once. This would also be part of the things that later can get abstracted out into a generic module settings portion of the store that is reusable for all modules.

@tofumatt there is one case where we would need to make the request again, even if no parameters had changed and that is when we refresh accounts. This will likely not be needed in the future once we introduce provisioning, but currently the create account option just directs users to the service page so they can create their account there. Then once they're done, they can come back and click the button to reload their accounts again (same parameters). We do this for both Analytics and Tag Manager for sure, but I think we also do it for AdSense. If this is a special-case where we need to manually dispatch the action, I think that's fine, but there are a few use cases where we need this.

This is a valid point, but we shouldn't need to think about this when defining our public API (or rather, it shouldn't be affected by how we think we can accomplish this). Looks like @tofumatt found a way to invalidate a resolver, so we'd be good here. And even if we weren't, manually calling the fetch* and receive* actions would work as a fallback. Something like get* should likely always be a selector, indicated by the function name IMO.

felixarntz commented 4 years ago

Leaving a note here that the property creation and profile creation should be split out (but not removed) from the POST:settings datapoint, into new POST:create-property and POST:create-profile datapoints.

felixarntz commented 4 years ago

@aaemnnosttv We don't have an "Acceptance Criteria Review" column, but would be great if you could just ping me on Slack once you've iterated on the actions+selectors here and they're ready for a review.

felixarntz commented 4 years ago

Some feedback from looking at the current list of selectors/actions:

tofumatt commented 4 years ago

I think this is looking good and it's best that we start moving forward with implementing this as defined in the ACs (the IB is essentially "implement the ACs" 🙂), and iterate on it in the pull request once we have the code and tests in front of us.

The discussion here is great but getting a bit long and unwieldy—I think we're in a good place with it now and don't want things to get confused.

@felixarntz Are you happy with the ACs as-is? If so, I think an IB like "Implement these ACs" is fine and we can move this to the execution backlog. Then anymore comments can be inline with code on the PR, which should make further discussion easier-to-track 🙂

felixarntz commented 4 years ago

@tofumatt Sounds good to me.

I've made a few small changes to the ACs:

One important note:

felixarntz commented 4 years ago

@aaemnnosttv @tofumatt There may not be a need for this, but how do you think we should deal with this issue? Technically we do have the datastore now, but there may be some extra changes required.

I think we covered everything. If we notice things that are missing, let's open individual issues and PRs to fix/add them.

tofumatt commented 4 years ago

Agreed; I think we can consider this issue resolved. There's still datastore work to do (although largely covered after #1278 in terms of things we 100% need), but we should consider this issue finished and do anything else in follow-ups. This one is already quite large and will be difficult to consume if it gets bigger 😄 👍

cole10up commented 4 years ago

I ran a series of regression tests around Analytics with the latest develop.

Confirmed install, setup, functionality of the Analytics module from from the dashboard and Analytics left nav page.

All good on my end.

Passed QA ✅