Segment build - Githubissues

benrwoodard commented 2 years ago

Purpose: The segment building functions will enable the analyst to create segments in a way that will enhance the documentation of an analysis project as well as create major efficiencies in the possibilities of new segments due to the ability to dynamically create segments.

The new functions start with the seg_pred() function which creates the foundational aspects of a segment. An analyst can also create containers using the "and" or "or" conjunctions in the seg_con() or a sequential segment using the "then" conjunction in seg_seq(). If needed, the seg_then function will enable the sequential segment to include "after" and "within" time restrictions. The analyst would then combine these segment elements together in the final seg_build() function which by default returns the JSON string definition but can create the segment using the argument create_seg = TRUE.
The definition JSON string can then be used as a freeform_table() global filter or metric filter (future state). The analyst can also use the JSON string in the seg_val() function to validate if the resulting definition is valid.

charlie-gallagher commented 2 years ago

Hey Ben I've started reviewing this, but it will probably take a while to dig through everything. I'll try to submit reviews every now and then as I go

benrwoodard commented 2 years ago

Thanks! I attempted to be more functional in the solutions but a lot of it is developing json strings using lists based on a series of filtered if statements. I was never able to completely wrap my head around the parts of the segment definition that changed based on the verbs and other variables there are still a fair amount of if statements used. If you have a solution for creating those json string definitions in a more modular way, please let me know and I'll go back and get that done. I'm hoping this is a good phase 1 since It appears to be very stable based on my testing variations .

charlie-gallagher commented 2 years ago

Aside: here are some notes I have about the API

The API uses a JSON format for passing arguments to function

function(a, b, c) 
---- is equivalent to -----
{
  func: "function"
  a: <value>
  b: <value>
  c: <value>
}

Predicates are composed of a function and its arguments

{
  "func": ["matches"],
  "glob": ["Request a Quote"],
  "val": {
    "func": ["attr"],
    "name": ["variables/prop11"]
  }
} 
---- is equivalent to ----
matches(glob = "Request a Quote", val = attr(name = "variables/prop11"))

The arguments to a function are pretty consistent with its type
- All data comparison functions use the val argument except temporal functions.
- String and numeric functions have 1 other argument, one of "str", "num", "glob", "list".
- Event functions use the evt argument.
The val argument is usually an attr function, which simply references a component
- You can modify this attr with a function like total or total-distinct. I don't see total-distinct documented on the page I'm looking at. Might submit an issue to repair that.

Based on this, the idea I had for structure was this:

seg_pred <- function(dimension, metric, verb, object) {
  pred <- list()
  pred$func <- verb
  subject <- get_subject(dimension, metric)         # Modify as needed
  pred <- add_verb_args(pred, subject, object)      # Add arguments. This would be responsible for most of the work.
  pred <- modify_verb_args(pred, is_distinct)       # Modify things, in this case adds `total-distinct` if is_distinct is true

  pred
}

Note: This isn't a criticism, I'm only sharing since you've asked before how I would write certain functions

benrwoodard commented 2 years ago

Aside: here are some notes I have about the API

The API uses a JSON format for passing arguments to function
function(a, b, c) 
---- is equivalent to -----
{
  func: "function"
  a: <value>
  b: <value>
  c: <value>
}
Predicates are composed of a function and its arguments
{
  "func": ["matches"],
  "glob": ["Request a Quote"],
  "val": {
    "func": ["attr"],
    "name": ["variables/prop11"]
  }
} 
---- is equivalent to ----
matches(glob = "Request a Quote", val = attr(name = "variables/prop11"))
The arguments to a function are pretty consistent with its type

All data comparison functions use the val argument except temporal functions.

String and numeric functions have 1 other argument, one of "str", "num", "glob", "list".

Event functions use the evt argument.

The val argument is usually an attr function, which simply references a component

You can modify this attr with a function like total or total-distinct. I don't see total-distinct documented on the page I'm looking at. Might submit an issue to repair that.

Based on this, the idea I had for structure was this:
seg_pred <- function(dimension, metric, verb, object) {
  pred <- list()
  pred$func <- verb
  subject <- get_subject(dimension, metric)         # Modify as needed
  pred <- add_verb_args(pred, subject, object)      # Add arguments. This would be responsible for most of the work.
  pred <- modify_verb_args(pred, is_distinct)       # Modify things, in this case adds `total-distinct` if is_distinct is true

  pred
}
Note: This isn't a criticism, I'm only sharing since you've asked before how I would write certain functions

This was a huge help. My edits, hopefully, reflect a more logical way of building the segment predicates (rules)

benrwoodard commented 2 years ago

@charlie-gallagher I'm planning on merging this branch later tonight or first thing tomorrow. If you have any other red flags, please let me know. I've been over the 7 different functions and back over them and have cleaned them up a lot. On big change recently was removing "predicates" and using "rules". It makes much more sense I think.

charlie-gallagher commented 2 years ago

@charlie-gallagher I'm planning on merging this branch later tonight or first thing tomorrow. If you have any other red flags, please let me know. I've been over the 7 different functions and back over them and have cleaned them up a lot. On big change recently was removing "predicates" and using "rules". It makes much more sense I think.

Thanks for the heads up, and sorry for not responding sooner. I haven't had a chance to look closely at the other files, but if they're working well, I don't see a need to scrutinize them at this stage -- we'll come back to everything when we add unit tests and refactor.

SDITools / adobeanalyticsr

Segment build #120