Closed akmiller01 closed 10 months ago
Think this is the culprit: https://github.com/IATI/refresher/blob/develop/src/library/solrize.py#L311
doc['id'] = utils.get_hash_for_identifier(json.dumps(doc))
The document in the lake still has both transactions (id 83d4df71ad40ab2b40e08fcfa7d96c9c86c00c2d).
However, when we come to run the final solrize stage, that line generates the Solr id by hashing only the document's data, so 2 data elements which are exactly the same get the same Solr id and only one of them ends up in the index.
Note that the very same activity has 2 transactions for 30,000, but because they are slightly different (they have different receiver-org details) you can see 30,000 twice in the screenshot above.
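To illustrate the collision described above, here is a minimal sketch. It is not the refresher code: `hash_for_identifier` is a hypothetical stand-in for `utils.get_hash_for_identifier` (assumed here to be a SHA-1 hex digest), the transaction dict is illustrative, and a plain Python dict keyed by id is used to mimic an index that keeps one document per id. `solr_id_with_position` shows one possible way to keep identical transactions apart and is only an assumption, not necessarily the fix that was applied.

```python
import hashlib
import json


def hash_for_identifier(text):
    # Hypothetical stand-in for utils.get_hash_for_identifier (assumed SHA-1).
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def solr_id_from_content(doc):
    # Mirrors the line under discussion: the id depends only on the data.
    return hash_for_identifier(json.dumps(doc))


def solr_id_with_position(doc, position):
    # One possible alternative (an assumption): mix the transaction's
    # position within the activity into the hashed payload.
    return hash_for_identifier(json.dumps({"position": position, "doc": doc}))


# Illustrative data: two identical transactions from the same activity.
transaction = {
    "iati_identifier": "XI-IATI-EC_ECHO-ECHO/-AF/BUD/2018/92048",
    "transaction_value": 60000,
}
transactions = [dict(transaction), dict(transaction)]

# A dict keyed by id keeps one entry per id, like an index with a unique key.
index_by_content = {solr_id_from_content(t): t for t in transactions}
index_by_position = {
    solr_id_with_position(t, i): t for i, t in enumerate(transactions)
}

print(len(index_by_content))   # 1: the identical transactions collide
print(len(index_by_position))  # 2: both transactions are kept
```

With content-only hashing the second transaction replaces the first, which matches the single row seen in the transaction query.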
The fix is now on develop for testing.
The example in this bug report is now correct in the production data store.
Brief Description The Unified Platform transaction records for a particular activity are missing one transaction. The IATI identifier is XI-IATI-EC_ECHO-ECHO/-AF/BUD/2018/92048. Examination of the underlying XML shows there should be two identical transactions of 60,000 EUR. But an API query to
/datastore/transaction/select?q=iati_identifier:"XI-IATI-EC_ECHO-ECHO/-AF/BUD/2018/92048"&fl=transaction_transaction_date_iso_date,transaction_value
shows only one. The data is correct at the activity level:
/datastore/activity/select?q=iati_identifier:"XI-IATI-EC_ECHO-ECHO/-AF/BUD/2018/92048"&fl=transaction_transaction_date_iso_date,transaction_value
Severity High
Issue Location
/datastore/transaction/select?q=iati_identifier:"XI-IATI-EC_ECHO-ECHO/-AF/BUD/2018/92048"&fl=transaction_transaction_date_iso_date,transaction_value
Expected Results/Behaviour Two identical transaction rows for 60,000 EUR each.
Actual Results/Behaviour One transaction row for 60,000 EUR.
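The two queries compared in this report can be built programmatically, which avoids quoting mistakes around the identifier. This is a minimal sketch: `build_select_path` is a hypothetical helper, and it produces only the path and query string, since the API host is not given in this report.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

IATI_IDENTIFIER = "XI-IATI-EC_ECHO-ECHO/-AF/BUD/2018/92048"
FIELDS = "transaction_transaction_date_iso_date,transaction_value"


def build_select_path(core, iati_identifier, fields):
    # Hypothetical helper: URL-encodes the q and fl parameters for a core.
    params = {"q": f'iati_identifier:"{iati_identifier}"', "fl": fields}
    return f"/datastore/{core}/select?{urlencode(params)}"


transaction_path = build_select_path("transaction", IATI_IDENTIFIER, FIELDS)
activity_path = build_select_path("activity", IATI_IDENTIFIER, FIELDS)

# Decode the query string again to confirm the identifier survived encoding.
decoded_q = parse_qs(urlsplit(transaction_path).query)["q"][0]
print(decoded_q)
```

Running the same query against both cores and comparing the number of transaction values returned is the check this report describes.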