gratipay / inside.gratipay.com

Here lieth a pioneer in open source sustainability. RIP
https://gratipay.news/the-end-cbfba8f50981

Respond to IRS letter #1164

Closed by chadwhitacre 6 years ago

chadwhitacre commented 6 years ago

First time for everything, friends. :)

The income and payment information we have on file from sources such as employers or financial institutions doesn't match the information you reported on your tax return. If our information is correct, you will owe $9,220 (including interest), which you need to pay by October 5, 2017.

If you agree with the changes we made, [pay us].

If you don't agree with the changes, complete the Response form on Page 9, and send it to us along with a signed statement and any documentation that supports your claim so we receive it by October 5, 2017.

chadwhitacre commented 6 years ago

When I manually truncate the transaction_search.csv to the one record and the gratipay query also to the one record and step through the code carefully, I observe the desired behavior that it spits out one match.

chadwhitacre commented 6 years ago

But when I run over the full csv and query ...

FOUND IT IN BRAINTREE fghrjgg 7872 2015-11-26 12.67
number unmatched: 744
[gratipay] $

No FOUND IT IN GRATIPAY.

chadwhitacre commented 6 years ago

How about with two entries?

chadwhitacre commented 6 years ago

Hrm ... two fails, and now one also fails if I run it straight through instead of stepping through with set_trace.

chadwhitacre commented 6 years ago

I'm not forcing it in the query this time. Is that what's different?

chadwhitacre commented 6 years ago

Yes.

chadwhitacre commented 6 years ago

Ooh! Ooh!

(Pdb) gratipay
{None: Record(exchange_id=79004, ref=None, date=datetime.date(2015, 11, 26), amount=Decimal('12.67'), participant_id=7872L)}
(Pdb)

I'm clobbering all null refs down to a single key/value in the gratipay dict! 😅
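A minimal sketch of that clobbering, using made-up records rather than the real match-2015.py code: keying a dict on `ref` collapses every ref-less exchange onto the single key `None`, while a composite key of (participant_id, date, amount) keeps them apart. The `Record` namedtuple, the `key_for` helper, and the second record are assumptions for illustration only.

```python
from collections import namedtuple
from datetime import date
from decimal import Decimal

# Hypothetical stand-in for the rows returned by the gratipay query.
Record = namedtuple('Record', 'exchange_id ref date amount participant_id')

records = [
    Record(79004, None, date(2015, 11, 26), Decimal('12.67'), 7872),
    Record(79005, None, date(2015, 11, 27), Decimal('5.00'), 1234),  # invented
]

# Buggy: every record with ref=None lands on the same key; the last one wins.
by_ref = {rec.ref: rec for rec in records}


def key_for(rec):
    """Composite key to fall back on when there is no ref to match against."""
    return (rec.participant_id, rec.date, rec.amount)


by_composite = {key_for(rec): rec for rec in records}

print(len(by_ref), len(by_composite))
```

A composite key can still collide if the same participant is charged the same amount twice on one day, so a real matcher would probably want a list per key rather than a single record.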

chadwhitacre commented 6 years ago

👊

FOUND IT IN BRAINTREE fghrjgg 7872 2015-11-26 12.67
FOUND IT IN GRATIPAY (7872, datetime.date(2015, 11, 26), Decimal('12.67'))
[gratipay] $

chadwhitacre commented 6 years ago

Eeeeee, now getting number unmatched: 2164.

Let's go with a gist for versioning.

chadwhitacre commented 6 years ago

Oops! Still had the query constrained. 😬

chadwhitacre commented 6 years ago

Okay! Phew. Satisfied. Deep cleansing breath. They all match! 😇

chadwhitacre commented 6 years ago

Do the payment method tokens also match up?

chadwhitacre commented 6 years ago

(That's exchange_routes.address.)

chadwhitacre commented 6 years ago

Not entirely. Okay! We're not trying to solve all of https://github.com/gratipay/gratipay.com/issues/4442 here. Do we have enough to get what we need yet?

What do we need?

chadwhitacre commented 6 years ago

We need to produce a csv of customer,amount where the amounts sum to $36,428.04 and Gratipay's amount is $18,758.
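A rough sketch of the shape of that deliverable, with invented rows: write customer,amount pairs using the csv module and refuse to finish unless the Decimal total matches the target. `write_report` and the sample data are hypothetical; only the target figure mentioned in the comment comes from this thread.

```python
import csv
import io
from decimal import Decimal


def write_report(rows, expected_total, out):
    """Write customer,amount rows to `out`; raise if the total is off."""
    total = sum((amount for _, amount in rows), Decimal('0'))
    if total != expected_total:
        raise ValueError('total %s != expected %s' % (total, expected_total))
    writer = csv.writer(out)
    writer.writerow(['customer', 'amount'])
    for customer, amount in rows:
        writer.writerow([customer, str(amount)])
    return total


# Invented rows; the real run would need to total Decimal('36428.04').
rows = [('alice', Decimal('12.67')), ('bob', Decimal('5.00'))]
buf = io.StringIO()
write_report(rows, Decimal('17.67'), buf)
print(buf.getvalue())
```

Using Decimal throughout avoids float rounding, which matters when the total has to match to the cent.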

chadwhitacre commented 6 years ago

I guess that means tracing the inputs through paydays and correlating them with outputs. We need to list the outputs that sum to the input.

chadwhitacre commented 6 years ago

I think we should go ahead and fill in the refs. You okay with that @kaguillera? I will test it out and spot-check it on a backup.

chadwhitacre commented 6 years ago

go ahead and fill in the refs

... because that will be the best way to identify the input set.

chadwhitacre commented 6 years ago

As far as tracing the output set, I guess we will have to use transfers and payments and look at timestamps?

chadwhitacre commented 6 years ago

Okay!

chadwhitacre commented 6 years ago

https://stackoverflow.com/questions/18797608/update-multiple-rows-in-same-query-using-postgresql

update exchanges e
   set ref = tmp.ref
  from (values ('399g55w', 78775)
            -- ...
             , ('nkv6gqw', 78568)
              ) as tmp(ref, exchange_id)
 where tmp.exchange_id = e.id
      ;
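For a long list of pairs it may be easier to generate that statement than to type it. A sketch, assuming the matched (ref, exchange_id) pairs are already in a Python list; `build_backfill` is a hypothetical helper, and the `%s` placeholders assume a driver with psycopg2-style parameter substitution.

```python
def build_backfill(pairs):
    """Return (sql, params) for a single UPDATE ... FROM (VALUES ...)."""
    if not pairs:
        raise ValueError('nothing to backfill')
    # One "(%s, %s)" group per (ref, exchange_id) pair.
    placeholders = ', '.join(['(%s, %s)'] * len(pairs))
    sql = (
        'UPDATE exchanges e '
        'SET ref = tmp.ref '
        'FROM (VALUES ' + placeholders + ') AS tmp(ref, exchange_id) '
        'WHERE tmp.exchange_id = e.id'
    )
    # Flatten [(ref, id), ...] into [ref, id, ref, id, ...] to match the placeholders.
    params = [value for pair in pairs for value in pair]
    return sql, params


sql, params = build_backfill([('399g55w', 78775), ('nkv6gqw', 78568)])
print(sql)
print(params)
```

With psycopg2 the result could be passed along as `cursor.execute(sql, params)`; keeping the values as parameters rather than pasting them into the string sidesteps quoting problems.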

chadwhitacre commented 6 years ago

👍

gratipay-bak=# \i backfill-braintree-2015.sql 
UPDATE 744
gratipay-bak=#

chadwhitacre commented 6 years ago

Rerunning match-2015.py is now a no-op. 💃

chadwhitacre commented 6 years ago

Okay! Refs backfilled locally! Now to trace payments ...

chadwhitacre commented 6 years ago

Working up a query to select the inputs to trace.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import absolute_import, division, print_function, unicode_literals

from gratipay import wireup

db = wireup.db(wireup.env())

inputs = db.all('''

    SELECT count(*), sum(amount + fee)
      FROM exchanges e
      JOIN exchange_routes er
        ON e.route = er.id
     WHERE network = 'braintree-cc'
       AND "timestamp"::text >= '2015-01-01'
       AND "timestamp"::text < '2016-01-01'
       AND amount > 0

''')

print(inputs)

[Record(count=7996L, sum=Decimal('136429.97'))]

chadwhitacre commented 6 years ago

Need to match ... 2,165 and $36,428.

chadwhitacre commented 6 years ago

Constraining to status='succeeded':

(7704, Decimal('133386.20'))

🤔

chadwhitacre commented 6 years ago

Well, I guess we have the refs of all 2,165 transactions that Braintree is counting. What's the pattern difference between those and the extra 5,539 that my query is counting?

P.S. If this rabbit hole is too deep we could punt and just query on ref in the known set from Braintree. Would feel more confident understanding this, though ...

chadwhitacre commented 6 years ago

I think I'm going to add a from_braintree column to my local exchanges table and go from there.

chadwhitacre commented 6 years ago

gratipay-bak=# \i refs.sql 
ALTER TABLE
UPDATE 2165
gratipay-bak=# 

alter table exchanges add column from_braintree boolean not null default false;
update exchanges
   set from_braintree = true
 where ref in ( 'jyb8j8g'
             -- ...
              , '7jkhp32'
               )
      ;

chadwhitacre commented 6 years ago

Golly. My query is only finding 1,687 items from_braintree!

chadwhitacre commented 6 years ago

Null routes?

chadwhitacre commented 6 years ago

No. Phew. 😓

gratipay-bak=# select count(*) from exchanges where from_braintree and route is null;
┌───────┐
│ count │
├───────┤
│     0 │
└───────┘
(1 row)

gratipay-bak=#

chadwhitacre commented 6 years ago

Double- and triple-phew. 😅

gratipay-bak=# select count(*) from exchanges where from_braintree;
┌───────┐
│ count │
├───────┤
│  2165 │
└───────┘
(1 row)

gratipay-bak=# select count(*) from exchanges where ref is not null and route is null;
┌───────┐
│ count │
├───────┤
│     0 │
└───────┘
(1 row)

gratipay-bak=#

chadwhitacre commented 6 years ago

Oh! !m us :D

gratipay-bak=# select count(*) from exchanges where route is null;
┌───────┐
│ count │
├───────┤
│     0 │
└───────┘
(1 row)

gratipay-bak=#

chadwhitacre commented 6 years ago

💃

gratipay-bak=# \d exchanges
                                       Table "public.exchanges"
┌────────────────┬──────────────────────────┬────────────────────────────────────────────────────────┐
│     Column     │           Type           │                       Modifiers                        │
├────────────────┼──────────────────────────┼────────────────────────────────────────────────────────┤
  ...
│ route          │ bigint                   │ not null                                               │
  ...

chadwhitacre commented 6 years ago

Convenience view ...

gratipay-bak=# create view ewr as (select e.*, er.network, er.address from exchanges e join exchange_routes er on e.route = er.id);

chadwhitacre commented 6 years ago

Yeah, thought so.

gratipay-bak=# select count(*) from ewr where from_braintree and network != 'braintree-cc';
┌───────┐
│ count │
├───────┤
│   478 │
└───────┘
(1 row)

chadwhitacre commented 6 years ago

🤔

gratipay-bak=# select count(*) from ewr where from_braintree and network = 'unknown';
┌───────┐
│ count │
├───────┤
│   416 │
└───────┘
(1 row)

chadwhitacre commented 6 years ago

Oh sweet mercy. 🙈

gratipay-bak=# select network, count(network) from ewr where from_braintree group by network order by count desc;
┌──────────────┬───────┐
│   network    │ count │
├──────────────┼───────┤
│ braintree-cc │  1687 │
│ unknown      │   416 │
│ paypal       │    31 │
│ balanced-cc  │    31 │
└──────────────┴───────┘
(4 rows)

chadwhitacre commented 6 years ago

Something in the matcher script?

chadwhitacre commented 6 years ago

Or did we link the wrong routes at some point under gratipay/gratipay.com#3912?

chadwhitacre commented 6 years ago

Looks like 11 routes are implicated:

gratipay-bak=# select route, count(route) from ewr where from_braintree and not network in ('braintree-cc', 'unknown') group by route order by count desc;
┌───────┬───────┐
│ route │ count │
├───────┼───────┤
│  2554 │    27 │
│ 12629 │    15 │
│ 12216 │     8 │
│ 11403 │     3 │
│  5144 │     3 │
│ 10402 │     1 │
│ 12440 │     1 │
│  2715 │     1 │
│ 11426 │     1 │
│ 11584 │     1 │
│ 12565 │     1 │
└───────┴───────┘
(11 rows)

gratipay-bak=#

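A quick arithmetic cross-check of the counts above, with the numbers copied by hand from the query output; nothing here touches the database.

```python
# Counts copied from the psql output in this thread.
by_network = {'braintree-cc': 1687, 'unknown': 416, 'paypal': 31, 'balanced-cc': 31}
by_route = [27, 15, 8, 3, 3, 1, 1, 1, 1, 1, 1]

total = sum(by_network.values())
not_braintree = total - by_network['braintree-cc']
known_but_wrong = by_network['paypal'] + by_network['balanced-cc']

print(total)            # 2165, the number of from_braintree exchanges
print(not_braintree)    # 478, matches the network != 'braintree-cc' count
print(known_but_wrong)  # 62, the paypal and balanced-cc exchanges
print(sum(by_route))    # 62, the same exchanges grouped by route
```

So the breakdown is internally consistent: the 478 mismatches are 416 on unknown routes plus 62 spread across the 11 implicated routes.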
chadwhitacre commented 6 years ago

Alright, how do we avoid caring about this?

chadwhitacre commented 6 years ago

I guess we can just start with the from_braintree in the query. I don't actually have to run anything against production here; I can do it all from this here backup here.

chadwhitacre commented 6 years ago

gratipay-bak=# select sum(amount+fee), count(*) from inputs;
┌──────────┬───────┐
│   sum    │ count │
├──────────┼───────┤
│ 36428.04 │  2165 │
└──────────┴───────┘
(1 row)

DROP VIEW ewr;
CREATE VIEW inputs AS (SELECT * FROM exchanges WHERE from_braintree ORDER BY "timestamp" ASC)

chadwhitacre commented 6 years ago

There are only a handful of transfers after we started charging on Braintree (https://github.com/gratipay/inside.gratipay.com/issues/1164#issuecomment-330670984). I am going to try ignoring them and only look at payments.

gratipay-bak=# select "timestamp"::date, amount from transfers where "timestamp" >= '05-28-2015'::date order by "timestamp" desc;
┌────────────┬────────┐
│ timestamp  │ amount │
├────────────┼────────┤
│ 2016-10-13 │   0.56 │
│ 2016-08-24 │   3.00 │
│ 2015-11-23 │ 147.48 │
│ 2015-09-13 │   0.43 │
│ 2015-07-30 │   0.34 │
│ 2015-07-30 │   0.87 │
└────────────┴────────┘
(6 rows)

gratipay-bak=#

chadwhitacre commented 6 years ago

Alright, let's trace one exchange through payments. I guess we want to trace it iteratively until it goes out again in another exchange or we hit 2016-01-01. Do we have any regiving to speak of tho?

chadwhitacre commented 6 years ago

payments start with Gratipay 2.0 on 2015-05-15.

chadwhitacre commented 6 years ago

payments are linked to a payday, but exchanges are not.
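Since exchanges carry no payday id, one way to line them up with payments is by timestamp. A sketch, assuming each payday can be represented by its start time and that an exchange belongs to the latest payday starting at or before it; `payday_for` and the sample dates are hypothetical, and the real paydays table may call for different boundaries.

```python
from bisect import bisect_right
from datetime import datetime


def payday_for(exchange_ts, payday_starts):
    """Return the index of the latest payday starting at or before exchange_ts.

    payday_starts must be sorted ascending. Returns None when the exchange
    predates every payday.
    """
    i = bisect_right(payday_starts, exchange_ts)
    return i - 1 if i else None


# Made-up weekly paydays.
payday_starts = [datetime(2015, 11, 19), datetime(2015, 11, 26), datetime(2015, 12, 3)]

print(payday_for(datetime(2015, 11, 26, 14, 30), payday_starts))
print(payday_for(datetime(2015, 1, 1), payday_starts))
```

A binary search keeps this cheap even when thousands of exchanges are checked against the full list of paydays.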