endlessm / azafea

Service to track device activations and usage metrics
Mozilla Public License 2.0
10 stars 2 forks source link

Improve shell_app_is_open to allow easier and faster reports #109

Closed liZe closed 4 years ago

liZe commented 4 years ago

This PR brings two improvements:

duration is a float representing difference between stopped_at and started_at in seconds. It’s implemented as a SQLAlchemy default value, as the lines are not supposed to be changed.

https://phabricator.endlessm.com/T29220

liZe commented 4 years ago

I didn't do any testing, but the shell_app_is_open has close to 200M records and in this case we may want to move the population of the duration column to a separate task that we can perform while Azafea is running.

I’m currently importing 4 months of real data (it was already long to export…), I’ll have a rough estimation of the time needed for this small number of lines.

@adarnimrod If it’s too long (and it probably is), what’s the best way for you to perform this as a separate task? Should I put a default value in the deployment and let you launch SQL on the database to calculate the real values?

wjt commented 4 years ago

Here's how it was done previously e3fd3df855d82c3629a1a90fd435d915bf25dddf

adarnimrod commented 4 years ago

@liZe previously we added a new command to Azafea (like the commit @wjt mentioned). The workflow was doing the deployment like always (running migrations, replacing old containers with the new containers, then running the command in a separate container).

liZe commented 4 years ago

@liZe previously we added a new command to Azafea (like the commit @wjt mentioned). The workflow was doing the deployment like always (running migrations, replacing old containers with the new containers, then running the command in a separate container).

OK. I’m currently working on this new command and its tests.

Now that I have imported real data (it was incredibly long), I’m able to test it on a large number of lines.

liZe commented 4 years ago

I’ve added a separate command to set the duration time. I’m not really happy with the default value set to 0, but I can’t find a better solution (I don’t think that allowing NULL values is a good idea).

Chunks are used, as they are for other commands. I’m not sure that it’s as useful as it is for other commands, it may be really slower, I’ve not tested it on real data yet (but I can if you want, as soon as my shell_app_is_open is ready :unamused:).

@wjt @adarnimrod Let me know if I can improve this PR before merging!

adarnimrod commented 4 years ago

@liZe looks good. The reason behind using chunks was to avoid running out of memory. We can set the default value to null and later on have another migration to disallow null values.

adarnimrod commented 4 years ago

Deployed to the test and dev environments. Worked without an issue and the migrations worked correctly. Merging.