Open robindebois opened 3 years ago
@robindebois I just came across the same issue when working with Mambu GL Journal Entries endpoint and it seems to me that creation_date
can't be used as a replication key because value of that replication key is eventually passed to API request parameters which are:
from | Start date - The first booking date you want to retrieve journal entries from. Required
-- | --
to | End date - The last booking date you want to retrieve journal entries from. Required
I'm not entirely sure, but I guess that booking date
== entry_date
which forces us to use this column as a replication key, otherwise we'd be comparing apples to oranges (i.e. passing creation_date
to API request which is then translated into booking date
by mambu API).
I came across this because I realized that every day, a couple of journal entries were not loaded into my database when using INCREMENTAL replication, exactly because the difference between their creation_date
and entry_date
is bigger than the width of my incremental replication strategy.
@robindebois I can see that this tap uses neither creation_date
or entry_date
but booking_date
: https://github.com/singer-io/tap-mambu/blob/master/tap_mambu/schema.py#L95-L99 which seems to be most relevant. Seems to me that the issue can be closed.
I tried replicating gl_journal_entries
data incrementally with the up-to-date mambu tap but it misses all journal entries which have booking_date
more than 1 day earlier than creation_date
. So the ledger is still incomplete.
The only solution I can think of for now is swapping replication method from INCREMENTAL to FULL.
Hi Tom,
The tap uses the v1 POST /gljournalentries:search
endpoint (not the GET one, which you seem to suggest).. which in its body has the following filter criterium (see https://github.com/singer-io/tap-mambu/blob/master/tap_mambu/sync.py#L680):
{
"filterConstraints": [
{
"filterSelection": "CREATION_DATE",
"filterElement": "BETWEEN",
"value": '{gl_journal_entries_from_dt_str}',
"secondValue": "{now_date_str}"
}
]
}
As far as my ticket was concerned -- let's step over the intense confusion in which the Mambu documentation leaves me on value/booking and entry dates -- I think both booking_date
and entry_date
or value_date
might not be the fields to replicate on incrementally.
Basically every search iteration, it should always get all the entries made between the most recently captured creation_date and now; and the reason is that all the other dates can be in the past, so occurring before the bookmark in the state file.
I am not 100% sure if the behaviour displayed by the tap is intentional or accidental; please feel free to discard issue is this was by design.
Expected behaviour
When loading journal entries with the tap I expected to (eventually) end up with all journal entries that are available in Mambu.
Actual behaviour
The tap does an incremental load with the
booking_date
as a replication key. However, there are multiple reasons why the choice of key leads to an incomplete ledger:This also means for accounting purposes this data becomes useless, as the sum of all general ledger accounts doesn't sum to 0 anymore.
Steps to reproduce
How to fix:
As far as I can tell changing the replication key to
creation_date
will resolve it.