cfpb / hmda-platform

The HMDA Submission backend applications.
Creative Commons Zero v1.0 Universal

(WIP) add admin delete API endpoint to clear test bank data #4721

Open rkovalik-raft opened 10 months ago

rkovalik-raft commented 10 months ago

Curl command to delete 2020 submission data for Bank0:

curl --location --request DELETE 'http://localhost:8081/delete/B90YWS6AFX2LGWOXJ1LD/year/2020'
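
For repeated test cleanup, the same call could be scripted. The following is a minimal sketch, not part of this PR: the helper names build_delete_request and send are hypothetical, and the host and port are taken from the curl command above. The response body is not shown in this thread, so the sketch only reports the HTTP status.

```python
import urllib.request

# Assumption: the platform's admin API listens here, as in the curl command.
BASE_URL = "http://localhost:8081"


def build_delete_request(lei: str, year: int, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for one institution and year."""
    url = f"{base_url}/delete/{lei}/year/{year}"
    # No request body or Content-Type: the LEI and year travel in the path.
    return urllib.request.Request(url, method="DELETE")


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code.

    Requires a locally running platform; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(request) as response:
        return response.status


# Example (needs the local platform to be up):
#   send(build_delete_request("B90YWS6AFX2LGWOXJ1LD", 2020))
```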

TODO: Test in Postman.

Note: Even after setting up the hmda2_journal and hmda2_snapshot keyspaces and their respective tables, I still sometimes see this error:

Caused by: com.datastax.oss.driver.api.core.servererrors.InvalidQueryException: Keyspace hmda2_snapshot does not exist

rkovalik-raft commented 10 months ago

Updated to delete all data rows for the "Submission", "EditDetail", "HmdaRawData", and "HmdaValidationError" persistence ID prefixes.
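
As an illustration of what gets deleted, the persistence ID bound in the logged queries in this comment decodes to HmdaValidationError-B90YWS6AFX2LGWOXJ1LD-2020-9, which suggests a prefix-LEI-year-number layout. The sketch below builds the four IDs under that assumption. The function name persistence_ids is hypothetical, and reading the trailing number as a submission sequence number is an inference from the logs, not something this PR states.

```python
# The four persistence ID prefixes named in this comment.
PREFIXES = ("Submission", "EditDetail", "HmdaRawData", "HmdaValidationError")


def persistence_ids(lei: str, year: int, sequence_number: int) -> list[str]:
    """Return one persistence ID per prefix for a single submission.

    Assumption: IDs follow "<prefix>-<lei>-<year>-<sequence_number>", as
    seen in the logged query values.
    """
    submission_id = f"{lei}-{year}-{sequence_number}"
    return [f"{prefix}-{submission_id}" for prefix in PREFIXES]


for pid in persistence_ids("B90YWS6AFX2LGWOXJ1LD", 2020, 9):
    print(pid)
```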

Tested with a local Cassandra 4 instance and a local platform. Resulting delete queries in Cassandra:

Type: single-query
Query start time: 1699569131753
Protocol version: 5
Generated timestamp:-9223372036854775808
Generated nowInSeconds:1699569131
Query: 
      INSERT INTO hmda2_journal.metadata (persistence_id, deleted_to)
      VALUES ( ?, ? )

Values: 
00000000 48 6D 64 61 56 61 6C 69  64 61 74 69 6F 6E 45 72 HmdaVali dationEr
00000010 72 6F 72 2D 42 39 30 59  57 53 36 41 46 58 32 4C ror-B90Y WS6AFX2L
00000020 47 57 4F 58 4A 31 4C 44  2D 32 30 32 30 2D 39    GWOXJ1LD -2020-9 
-----
00000000 00 00 00 00 00 00 00 0D                          ········         
-----

Type: single-query
Query start time: 1699569131757
Protocol version: 5
Generated timestamp:-9223372036854775808
Generated nowInSeconds:1699569131
Query: 
      DELETE FROM hmda2_journal.journal WHERE
        persistence_id = ? AND
        partition_nr = ? AND
        sequence_nr >= 0 AND
        sequence_nr <= ?

Values: 
00000000 48 6D 64 61 56 61 6C 69  64 61 74 69 6F 6E 45 72 HmdaVali dationEr
00000010 72 6F 72 2D 42 39 30 59  57 53 36 41 46 58 32 4C ror-B90Y WS6AFX2L
00000020 47 57 4F 58 4A 31 4C 44  2D 32 30 32 30 2D 39    GWOXJ1LD -2020-9 
-----
00000000 00 00 00 00 00 00 00 01                          ········         
-----
00000000 00 00 00 00 00 00 00 0D                          ········         
-----

Type: single-query
Query start time: 1699569131757
Protocol version: 5
Generated timestamp:-9223372036854775808
Generated nowInSeconds:1699569131
Query: 
      DELETE FROM hmda2_journal.journal WHERE
        persistence_id = ? AND
        partition_nr = ? AND
        sequence_nr >= 0 AND
        sequence_nr <= ?

Values: 
00000000 48 6D 64 61 56 61 6C 69  64 61 74 69 6F 6E 45 72 HmdaVali dationEr
00000010 72 6F 72 2D 42 39 30 59  57 53 36 41 46 58 32 4C ror-B90Y WS6AFX2L
00000020 47 57 4F 58 4A 31 4C 44  2D 32 30 32 30 2D 39    GWOXJ1LD -2020-9 
-----
00000000 00 00 00 00 00 00 00 00                          ········         
-----
00000000 00 00 00 00 00 00 00 0D                          ········         
-----

Type: single-query
Query start time: 1699569131761
Protocol version: 5
Generated timestamp:-9223372036854775808
Generated nowInSeconds:1699569131
Query: 
    DELETE FROM hmda2_snapshot.snapshot
    WHERE persistence_id = ?
    AND sequence_nr >= ?
    AND sequence_nr <= ?

Values: 
00000000 48 6D 64 61 56 61 6C 69  64 61 74 69 6F 6E 45 72 HmdaVali dationEr
00000010 72 6F 72 2D 42 39 30 59  57 53 36 41 46 58 32 4C ror-B90Y WS6AFX2L
00000020 47 57 4F 58 4A 31 4C 44  2D 32 30 32 30 2D 39    GWOXJ1LD -2020-9 
-----
00000000 00 00 00 00 00 00 00 00                          ········         
-----
00000000 7F FF FF FF FF FF FF FF                          ········         
-----

Type: single-query
Query start time: 1699569131765
Protocol version: 5
Generated timestamp:-9223372036854775808
Generated nowInSeconds:1699569131
Query: 
       SELECT * from hmda2_journal.tag_write_progress WHERE
       persistence_id = ?

Values: 
00000000 48 6D 64 61 56 61 6C 69  64 61 74 69 6F 6E 45 72 HmdaVali dationEr
00000010 72 6F 72 2D 42 39 30 59  57 53 36 41 46 58 32 4C ror-B90Y WS6AFX2L
00000020 47 57 4F 58 4A 31 4C 44  2D 32 30 32 30 2D 39    GWOXJ1LD -2020-9 
-----
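
The bound values in the traces above are raw bytes. A small sketch for reading them, assuming the first value of each query is UTF-8 text and the 8-byte values are CQL bigint (64-bit signed, big-endian). The helper names decode_text and decode_bigint are hypothetical.

```python
import struct


def decode_text(hex_bytes: str) -> str:
    """Decode a space-separated hex dump as UTF-8 text."""
    return bytes.fromhex(hex_bytes).decode("utf-8")


def decode_bigint(hex_bytes: str) -> int:
    """Decode 8 hex bytes as a big-endian signed 64-bit integer."""
    (value,) = struct.unpack(">q", bytes.fromhex(hex_bytes))
    return value


# The persistence_id bound in every query above.
persistence_id = decode_text(
    "48 6D 64 61 56 61 6C 69 64 61 74 69 6F 6E 45 72 "
    "72 6F 72 2D 42 39 30 59 57 53 36 41 46 58 32 4C "
    "47 57 4F 58 4A 31 4C 44 2D 32 30 32 30 2D 39"
)
print(persistence_id)

# deleted_to in the metadata insert and the upper sequence_nr in the journal deletes.
print(decode_bigint("00 00 00 00 00 00 00 0D"))

# Upper sequence_nr bound in the snapshot delete: the largest signed 64-bit value.
print(decode_bigint("7F FF FF FF FF FF FF FF"))
```

Read this way, the journal deletes cover sequence numbers up to 13 in partitions 1 and 0, and the snapshot delete covers the full sequence number range.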

rkovalik-raft commented 10 months ago

With the neverUsePersistenceIdAgain parameter set to true for cleanup.deleteAll(), calling the deletion endpoint a second time finds no persistence IDs left to delete.