Closed liZe closed 4 years ago
I didn't do any testing, but the
shell_app_is_open
has close to 200M records and in this case we may want to move the population of theduration
column to a separate task that we can perform while Azafea is running.
I’m currently importing 4 months of real data (it was already long to export…), I’ll have a rough estimation of the time needed for this small number of lines.
@adarnimrod If it’s too long (and it probably is), what’s the best way for you to perform this as a separate task? Should I put a default value in the deployment and let you launch SQL on the database to calculate the real values?
Here's how it was done previously e3fd3df855d82c3629a1a90fd435d915bf25dddf
@liZe previously we added a new command to Azafea (like the commit @wjt mentioned). The workflow was doing the deployment like always (running migrations, replacing old containers with the new containers, then running the command in a separate container).
@liZe previously we added a new command to Azafea (like the commit @wjt mentioned). The workflow was doing the deployment like always (running migrations, replacing old containers with the new containers, then running the command in a separate container).
OK. I’m currently working on this new command and its tests.
Now that I have imported real data (it was incredibly long), I’m able to test it on a large number of lines.
I’ve added a separate command to set the duration time. I’m not really happy with the default value set to 0, but I can’t find a better solution (I don’t think that allowing NULL values is a good idea).
Chunks are used, as they are for other commands. I’m not sure that it’s as useful as it is for other commands, it may be really slower, I’ve not tested it on real data yet (but I can if you want, as soon as my shell_app_is_open
is ready :unamused:).
@wjt @adarnimrod Let me know if I can improve this PR before merging!
@liZe looks good. The reason behind using chunks was to avoid running out of memory. We can set the default value to null
and later on have another migration to disallow null
values.
Deployed to the test and dev environments. Worked without an issue and the migrations worked correctly. Merging.
This PR brings two improvements:
duration
column allowing Metabase users to query duration with no SQL (0c2f39c).duration
is a float representing difference betweenstopped_at
andstarted_at
in seconds. It’s implemented as a SQLAlchemy default value, as the lines are not supposed to be changed.https://phabricator.endlessm.com/T29220