Closed baudje closed 2 years ago
Hey--I think at the very least the double counting will be resolved by #312, which should have one row per aggregation (if I understand the language; e.g. with 2 columns, tap_on and tap_off). @mjumbewu is planning to review the littlepay docs, so we can get the table right.
Here's a list of pieces we're planning in the next couple weeks. It will be a mix of technical onboarding, research into high-level questions, and hopefully producing the table for #312...
One aggregation can have more than 2 transactions as it will likely contain all taps of a single day (microtransactions). So best to have a table with aggregations and a table with microtransactions so that you can join them at will
Ah, okay--is there a term for a single tap on -> tap off "session"? I'm also happy to look around in the littlepay docs for it..
I think that’s what they call micro transactions that each constitute of taps. Tap on only for most, tap on tap off for mst.
Op 13 sep. 2021 om 17:08 heeft Michael Chow @.***> het volgende geschreven:
Ah, okay--is there a term for a single tap on -> tap off "session"? I'm also happy to look around in the littlepay docs for it..
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/cal-itp/data-infra/issues/362#issuecomment-918290199, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ABKGIZR22JTEI3IKF772JQTUBYHX5ANCNFSM5DX4RKBA. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
sent to your email the long thread, but to group tap ons / taps offs, here's the method
To confirm the methodology for counting riders by line, I am joining the micropayments table to the device transaction table via the micropayment_device_transaction table.
To count riders, I then count the number of district transactions / micropayment where the charge_type is either complete_variable_fare or flat_fare, and remove the pending_charge_fare rows, as those represent tap ons?
On Mon, Sep 13, 2021 at 9:19 AM baudje @.***> wrote:
I think that’s what they call micro transactions that each constitute of taps. Tap on only for most, tap on tap off for mst.
Op 13 sep. 2021 om 17:08 heeft Michael Chow @.***> het volgende geschreven:
Ah, okay--is there a term for a single tap on -> tap off "session"? I'm also happy to look around in the littlepay docs for it..
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub< https://github.com/cal-itp/data-infra/issues/362#issuecomment-918290199>, or unsubscribe< https://github.com/notifications/unsubscribe-auth/ABKGIZR22JTEI3IKF772JQTUBYHX5ANCNFSM5DX4RKBA>.
Triage notifications on the go with GitHub Mobile for iOS< https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android< https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/cal-itp/data-infra/issues/362#issuecomment-918357090, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANHXYRHYYP7R5KXMF5TAZLUBYQBJANCNFSM5DX4RKBA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Hmm, I cannot find the charge_type in Littlepays API reference - are you not connecting to the new API? Not sure if it is correct. To count unlinked trips per period, the method may work, but it's not how i would construct it (at least not with the customer api). To count unique pax per period, i would focus on unique funding source id's in a period.
@baudje here's the doc / definitions I have from the Data Feed Spec - it's in the micropayments
table
note we don't have access to to the customer API, but rather, the Data Feed API (aka static files dumped to S3 that we mirror over)
Ah, I haven’t seen that spec. That explains
Op 15 sep. 2021 om 18:22 heeft Hunter Owens @.***> het volgende geschreven:
note we don't have access to to the customer API, but rather, the Data Feed API (aka static files dumped to S3 that we mirror over)
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/cal-itp/data-infra/issues/362#issuecomment-920168410, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ABKGIZWBH4JZAXE42QBY2GTUCDB5JANCNFSM5DX4RKBA. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
This can be addressed after #704. For now I'll mark as "blocked".
@baudje There are now a few fields available in the payments_rides
view that come in from the customer_funding_source
dataset:
principal_customer_id
-- The best proxy for identifying a unique riderfunding_source_vault_id
stg_cleaned_funding_source_vaults
tablestg_cleaned_funding_source_vaults
contains deduplicated data organized by vault IDcustomer_id
customer_id
available on the transaction data, but the one from the micropayment data seems to consistently be most up-to-date)stg_cleaned_customers
tablestg_cleaned_customers
contains the latest customer_id
/primary_customer_id
associationsbin
/masked_pan
/card_scheme
/issuer
/form_factor
is there a table for the payments that includes:
BIN or truncated_pan / Scheme / funding_source_id / micropayment.adjustments
to create anwer questions like:
is the double counting already resolved? in the table it looks like still each transaction id is reported twice.
Maybe we should have one table that reports all unique LP transactions (aggregations), and one for all taps/micropayments that we can join at will.