Closed lyricnz closed 1 year ago
I think postgres is fairly efficient - the storage of a text field is only the number of characters+1. So we'd save 1 byte per row by changing gnaf_pid, and 3 bytes per row changing postcode to a shortint.
At 14.7m rows, this would save ~57MB on data
Postgres doesn't have a one-byte integer (to use for a State enum) so changing the state to an int would only save 1-2 bytes. Probably not worth the hassle and extra code. Changing it to a CHAR(3) would do the same.
Edit: could use a CHAR type (a single character)
Updating gnaf_pid to CHAR(14) and postcode to SMALLINT:
❯ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 86ed372521d9 39 minutes ago 2.97GB
lukeprior/nbn-upgrade-map-db latest 8a2b859be4ba 4 days ago 3.01GB
Or, more accurately 2972585117 vs 3014560925 (1.4% saving)
FWIW it seems (experimentally) that using CHAR(n) columns types actually consumes n+1 bytes of storage, which is the same as TEXT, so no point in changing gnaf_pid to CHAR(14). Ditto for state as CHAR(3).
We could save a couple of bytes per row using a CHAR for state
and mapping it to/from client side, but this doesn't seem worth the bother.
Following on from the work in https://github.com/LukePrior/nbn-upgrade-map/issues/255 and related to (but maybe obsoleted by?) the work in https://github.com/LukePrior/nbn-upgrade-map/issues/268 we might be able to save a little space by using shorter columns, and/or a numeric.
Current schema is
However: