SDITools / adobeanalyticsr

R Client for Adobe Analytics API v2.0

support segment dimension #27

Closed benrwoodard closed 12 months ago

benrwoodard commented 3 years ago

example

{
  "rsid": "xxxxx",
  "globalFilters": [
    { "type": "segment", "segmentId": "s300006681_5fb4322d9bfec132bd9dbd73" },
    { "type": "dateRange", "dateRange": "2020-10-01T00:00:00.000/2020-11-17T00:00:00.000" }
  ],
  "metricContainer": {
    "metrics": [
      { "columnId": "metrics/orders:::0", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_1" ] },
      { "columnId": "metrics/orders:::2", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_3" ] },
      { "columnId": "metrics/orders:::4", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_5" ] },
      { "columnId": "metrics/orders:::6", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_7" ] }
    ],
    "metricFilters": [
      { "id": "STATIC_ROW_COMPONENT_1", "type": "segment", "segmentId": "s300006681_5fb436e8c89a963fe7b144fc" },
      { "id": "STATIC_ROW_COMPONENT_3", "type": "segment", "segmentId": "s300006681_5fb436e88b03436b0b75d7d1" },
      { "id": "STATIC_ROW_COMPONENT_5", "type": "segment", "segmentId": "s300006681_5fb436e87d03f65a10e92c56" },
      { "id": "STATIC_ROW_COMPONENT_7", "type": "segment", "segmentId": "s300006681_5fb436e81212e663fcb463b3" }
    ]
  },
  "settings": { "countRepeatInstances": true, "dimensionSort": "asc" },
  "statistics": { "functions": [ "col-max", "col-min" ] }
}
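
The metricContainer in this request follows a regular pattern: one metric entry per segment, each pointing at a metricFilters entry by ID. A minimal base R sketch of generating that part of the body might look like the following. The function name build_segment_metric_container is hypothetical, and the numbering scheme simply mirrors the example; whether the API requires that particular numbering is an assumption.

```r
# Hypothetical helper: builds the "metricContainer" part of the request body
# as a nested R list, with one metric column per segment.
# The numbering (columnId suffix 0, 2, 4, ...; filter ID 1, 3, 5, ...) is
# copied from the example request, not taken from API documentation.
build_segment_metric_container <- function(metric, segment_ids) {
  idx <- seq_along(segment_ids)
  filter_ids <- paste0("STATIC_ROW_COMPONENT_", 2 * idx - 1)
  metric_id <- paste0("metrics/", metric)

  # One metric column per segment, each referencing its own filter ID.
  metrics <- lapply(idx, function(i) {
    list(
      columnId = paste0(metric_id, ":::", 2 * (i - 1)),
      id = metric_id,
      filters = list(filter_ids[i])
    )
  })

  # One segment filter per segment, keyed by the same filter ID.
  metric_filters <- lapply(idx, function(i) {
    list(
      id = filter_ids[i],
      type = "segment",
      segmentId = segment_ids[i]
    )
  })

  list(metrics = metrics, metricFilters = metric_filters)
}

container <- build_segment_metric_container(
  metric = "orders",
  segment_ids = c(
    "s300006681_5fb436e8c89a963fe7b144fc",
    "s300006681_5fb436e88b03436b0b75d7d1"
  )
)
str(container)
```

Serializing this list, for example with jsonlite::toJSON(container, auto_unbox = TRUE), should give JSON shaped like the metricContainer in the request.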

charlie-gallagher commented 2 years ago

Attractive idea. We would need to post-process the data the way Workspace does: produce one column with 4 rows instead of 4 columns with one row. Other considerations:

Between #23 and this, I would choose this
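
The reshaping mentioned in this comment, turning one row with a column per segment into one row per segment, could look roughly like this in base R. The column names and values are invented for illustration and are not real API output.

```r
# Illustrative stand-in for a parsed response: one row, one column per segment.
# Column names and numbers are made up for this sketch.
wide <- data.frame(
  seg_a = 120,
  seg_b = 75,
  seg_c = 30,
  seg_d = 5
)

# Reshape to one row per segment with a single metric column.
long <- data.frame(
  segment = names(wide),
  orders = unlist(wide[1, ], use.names = FALSE),
  stringsAsFactors = FALSE
)

print(long)
```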

charlie-gallagher commented 2 years ago

I've expanded the metric container function to support segment IDs on a per-metric basis (it's a branch of my fork called "feature/segrows"), but I think we should consider moving this issue's feature to its own function rather than including it in aw_freeform_table.

If you have 10 segments, they can all be queried in 1 request, which is a big win. But if you have 10 segments and break each one down by a dimension, you still have to make 10 requests, and the post-processing is much more difficult on our end.

If we pulled this into its own function, we could have two cases:

  1. Just segments and metrics. Call a specialized function to quickly get all segments.
  2. Segments with a dimension breakdown. Call aw_freeform_table and do some post-processing to make the output look consistent with whatever output we have for case (1).

benrwoodard commented 2 years ago

From my experience, for what it's worth, I've only used segments as a dimension when comparing segments, not when I want to break down, search, or use any of the other advanced features in the aw_freeform_table() function. I'd be all for having a separate function that handles just this use case, similar to what we have with the anomaly function.

charlie-gallagher commented 2 years ago

Thanks for your thoughts on this. I added my work to the branch feature/segrows so you can review it. Frankly, it's a great feature and I'm excited about it. Should it go in the next release? Maybe, though we've already changed a lot.

But it's a feature we don't have right now -- it allows you to request metrics with no dimensions, only segments. This almost closes #93, as well.

segs <- c(
  "s1383_5b1fe23aa90b463a44b06e4f",
  "s1383_5b1fe24be395c23c99a48c4f",
  "s1383_617ff79c770dbc37e1e6c9d8"
)

aw_segment_table(
  segmentIds = segs,
  metrics = c("visits", "visitors", "bounces", "pageviews")
)

## # A tibble: 3 × 6
##   name                          id                              visits visitors bounces pageviews
##   <chr>                         <chr>                            <dbl>    <dbl>   <dbl>     <dbl>
## 1 _Tier 1 (Hit)                 s1383_5b1fe23aa90b463a44b06e4f 2470577  1807045  456489   8586561
## 2 _Tier 2 (Hit)                 s1383_5b1fe24be395c23c99a48c4f  760982   667184  368836   2191573
## 3 _CG Homepage Hero Click (Hit) s1383_617ff79c770dbc37e1e6c9d8   64691    61481       3         0

benrwoodard commented 2 years ago

After working with this in a few different scenarios, I'm going back on my statement. A breakdown would be well worth the effort to integrate.

charlie-gallagher commented 2 years ago

I agree to a certain extent, but we'll have to be clear about exactly what types of tables can be made.

The easiest, and maybe the most common, is a list of segments with each segment broken down by the same dimension(s). That's as easy as calling aw_freeform_table once with each segment, adding a column with the segment name, and row binding them together. I might have time to work on this today.
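
A sketch of that approach in base R: one call per segment, a segment column added to each result, then a row bind. Here fake_freeform_table is a hypothetical stand-in that returns made-up data so the example runs without API credentials; in practice the call would go to aw_freeform_table with the segment passed as segmentId. The segment IDs and the devicetype column are placeholders as well.

```r
# Hypothetical stand-in for aw_freeform_table(): returns a small data frame
# with invented values so the sketch runs offline.
fake_freeform_table <- function(segmentId) {
  data.frame(
    devicetype = c("Mobile Phone", "Tablet"),
    visits = c(10, 4),
    stringsAsFactors = FALSE
  )
}

# Placeholder segment IDs, named so the names can label the output rows.
my_segs <- c(control = "seg_id_control", test = "seg_id_test")

per_segment <- lapply(names(my_segs), function(seg_name) {
  res <- fake_freeform_table(segmentId = my_segs[[seg_name]])
  # Put the segment name first so rows stay identifiable after binding.
  cbind(segment = seg_name, res, stringsAsFactors = FALSE)
})

# Stack the per-segment tables into one data frame.
combined <- do.call(rbind, per_segment)
print(combined)
```

Because every call uses the same dimension(s), the per-segment tables share column names and rbind can stack them directly.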

But, other types of tables (segments broken down by other segments, dimension values broken down by a list of segments, etc.) require more thought and effort, and building the output data frame is less obvious, especially when it comes to naming columns.

benrwoodard commented 2 years ago

All really good thoughts to consider. My initial thought on the scope is maybe based on a single use case where I want to see 2 Target (A4T) segments, control and test, and the breakdown by Device Type or New vs Returning visitors. In other words, we lay out the intended result very clearly and set it as a 'limitation', much like that of the anomaly report. It has a specific purpose but can be used as a building block. What made me rethink this was the fact that a Marketing Channel breakdown of the 2 or 3 segments could take as many as 10-12 different function calls.

charlie-gallagher commented 2 years ago

Sorry, I'm not sure I understand why 2 or 3 segments would generate 10-12 function calls? I was thinking we would just call aw_freeform_table once for each segment via map.

lapply(my_segs,
  function(seg) aw_freeform_table(segmentId = seg, ...)
)

We might be able to squeeze more efficiency out of it with some more optimized queries. The technique I used in aw_segment_table might be a start. But we would be talking about a potentially large effort, and since most people usually aren't comparing more than 10 segments, I don't think the benefits outweigh the costs.

But maybe I've misunderstood your point? Lemme know

benrwoodard commented 2 years ago

I think you are right on this one. Thanks for walking through that with me.