Closed by bochecha 4 years ago
So making a generic command like this is a bit of an ugly beast…
Ideally I'd like Azafea plugins to be able to add their own subcommands, so in this case this would become:
$ azafea -c config.toml activation normalize-vendors
Since that command would be implemented by the activation event processor, it would know which model/column to normalize.
However that's hard to implement well and it might require rethinking the way event processor plugins register into Azafea.
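To make the idea a bit more concrete, here is a rough sketch of how plugin-provided subcommands could be wired up with `argparse`. This is not how Azafea registers event processors today: the `PLUGIN_HOOKS` list, the `register_activation_commands` hook and the `do_normalize_vendors` handler are all hypothetical names made up for illustration, and the handler only returns a message instead of touching the database.

```python
import argparse


def do_normalize_vendors(args):
    # Placeholder handler (hypothetical): a real one would load the config
    # and renormalize the vendor column of the activation model.
    return f'would normalize activation vendors using {args.config}'


def register_activation_commands(subparsers):
    # Hypothetical hook: the activation plugin adds its own command group.
    parser = subparsers.add_parser('activation')
    plugin_subs = parser.add_subparsers(dest='subcommand', required=True)
    normalize = plugin_subs.add_parser('normalize-vendors')
    normalize.set_defaults(handler=do_normalize_vendors)


# Hypothetical registry: each plugin would contribute one hook.
PLUGIN_HOOKS = [register_activation_commands]


def build_parser():
    parser = argparse.ArgumentParser(prog='azafea')
    parser.add_argument('-c', dest='config', required=True)
    subparsers = parser.add_subparsers(dest='command', required=True)
    for hook in PLUGIN_HOOKS:
        hook(subparsers)
    return parser


def main(argv):
    # Parse the command line and dispatch to whichever handler the plugin set.
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

With something like this, `main(['-c', 'config.toml', 'activation', 'normalize-vendors'])` would dispatch to the activation plugin's handler, which is the part that knows which model and column to normalize.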
We did need to normalize the existing vendors though, so for now we went with a quick ad hoc script, for lack of time to do the above.
As that script would become the basis for a dedicated command, I'll paste it below:
import sys

from azafea.config import Config
from azafea.model import Db
from azafea.vendors import normalize_vendor
from azafea.event_processors.activation.v1 import Activation


def progress(current, total):
    # Draw a one-line text progress bar; an empty table counts as fully done
    bar_length = 60
    done = int(bar_length * current / total) if total else bar_length
    remaining = bar_length - done
    print(f'\r|{"#" * done}{" " * remaining}| {current} / {total}', end='')


def renormalize_chunk(start, stop):
    # Each chunk gets its own session; ordering by id keeps the slices stable
    with db as dbsession:
        for activation in dbsession.query(Activation).order_by(Activation.id).slice(start, stop):
            activation.vendor = normalize_vendor(activation.vendor)
            dbsession.add(activation)


CHUNK_SIZE = 5000

# The only argument is the path to the Azafea configuration file
config_file = sys.argv[1]
config = Config.from_file(config_file)
db = Db(config.postgresql.host, config.postgresql.port, config.postgresql.user,
        config.postgresql.password, config.postgresql.database)

with db as dbsession:
    num_activations = dbsession.query(Activation).count()

# Walk the table in fixed-size chunks so that no single session grows too large
for i in range(0, num_activations, CHUNK_SIZE):
    stop = min(i + CHUNK_SIZE, num_activations)
    renormalize_chunk(i, stop)
    progress(stop, num_activations)

progress(num_activations, num_activations)
print('\nAll done!')
Something like:
This will be necessary to really fix #22.
And since the current vendor mapping is something I threw down quickly, it's going to go through improvements over time, and so we will need such a command when that happens.
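To illustrate why existing rows need another pass whenever the mapping improves, here is a small self-contained sketch. The two mapping dictionaries and the `normalize` helper are hypothetical stand-ins, not the actual contents of `azafea.vendors`: values stored under the old mapping stay stale until they are run through the new one.

```python
# Hypothetical first version of the mapping, thrown together quickly.
MAPPING_V1 = {
    'asus': 'ASUS',
    'asustek computer inc.': 'ASUS',
}

# Hypothetical improved version, covering more spellings.
MAPPING_V2 = {
    **MAPPING_V1,
    'asustek': 'ASUS',
    'hewlett-packard': 'HP',
}


def normalize(vendor, mapping):
    # Collapse whitespace and ignore case before looking up the vendor;
    # unknown vendors are returned unchanged.
    key = ' '.join(vendor.split()).lower()
    return mapping.get(key, vendor)


raw = ['ASUSTeK Computer Inc.', 'Hewlett-Packard', 'asustek']

# What ended up in the database while the first mapping was in use.
stored = [normalize(v, MAPPING_V1) for v in raw]

# What a renormalization pass yields once the mapping has been improved.
renormalized = [normalize(v, MAPPING_V2) for v in stored]
```

In this sketch `stored` still contains `'Hewlett-Packard'` and `'asustek'`, because the first mapping did not know about them, and only the second pass turns them into `'HP'` and `'ASUS'`.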