Closed robinsones closed 5 years ago
Sometimes actually none of them work 😕
two_weeks_ago <- today() - days(14)
landed <- tbl_views_snowplow_user_page_views() %>%
filter(page_url_path %in% c('/courses', '/courses/'),
page_view_start >= two_weeks_ago,
page_url_host == 'www.datacamp.com',
is.na(signed_in_user_id)) %>%
select(tstamp = page_view_start, domain_userid)
registered <- tbl_snowplow_register_1() %>%
filter(root_tstamp >= two_weeks_ago) %>%
inner_join(tbl_redshift('snowplow_atomic.events') %>%
filter(collector_tstamp >= two_weeks_ago) %>%
select(root_id = event_id, event,
event_tstamp = collector_tstamp, domain_userid,
user_id),
by = c("root_id", "root_tstamp" = "event_tstamp")) %>%
select(tstamp = root_tstamp, domain_userid, user_id)
registration_funnel <- landed %>%
after_join(registered,
by_time = "tstamp",
by_user = "domain_userid",
mode = "left",
type = "first-first")
exercises <- tbl_main_user_exercises() %>%
filter(started_at >= two_weeks_ago,
completed) %>%
select(timestamp = started_at, user_id)
This fails whatever type you use
registration_funnel %>%
filter(!is.na(user_id)) %>%
rename(timestamp = tstamp.y) %>%
select(-tstamp.x) %>%
after_join(exercises,
by_time = "timestamp",
by_user = "user_id",
mode = "left",
type = "first-any")
but it works when you collect then run.
@dgrtwo, anthony and I have pinpointed the problem at least for the first-firstafter join. It's happening here: https://github.com/datacamp/funneljoin/blob/master/R/after_join.R#L156
We figure it's some sort of tidyeval issue, but we can't figure it out. It's especially confusing as this code up here works: https://github.com/datacamp/funneljoin/blob/master/R/after_join.R#L138
The error is:
Error in result_create(conn@ptr, statement) :
Failed to prepare query: ERROR: column "tstamp" does not exist in nfmnrlxsup
fixed by this commit
Something is wrong with "first-firstafter" with remote tables. This works:
as does collecting the two tables, converting both user_ids to integers, and then running the join.
But this does not:
It gets the error "Error in result_create(conn@ptr, statement) : Failed to prepare query: ERROR: column "tstamp" does not exist in tvxfchveli"