endlessm / azafea

Service to track device activations and usage metrics
Mozilla Public License 2.0

metrics: Deduplicate image version events #81

Closed bochecha closed 4 years ago

bochecha commented 4 years ago

Looking at the database, many requests contain a large number of image version events.

The worst case I found in production is a single request with 139 identical image version events.

This PR does two things:

  1. it filters incoming requests so that at most one image version event is stored per request;
  2. it adds a new CLI command to deduplicate the existing image version records, so that the DB contains only one per request.
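
As an illustration of the first point, here is a minimal sketch of per-request filtering. It is not Azafea's actual code: the `Event` class, the `"image_version"` event name, the payload values, and the `keep_single_image_version` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a metrics event; not Azafea's real model."""
    name: str
    payload: str


def keep_single_image_version(events: List[Event]) -> List[Event]:
    """Return the events of one request, keeping at most one image version event.

    All other events are preserved in their original order.
    """
    kept = []
    seen_image_version = False
    for event in events:
        if event.name == "image_version":
            if seen_image_version:
                # A later image version event in the same request: drop it.
                continue
            seen_image_version = True
        kept.append(event)
    return kept


# A request shaped like the worst case seen in production:
# 139 identical image version events, plus one unrelated event.
request = [Event("image_version", "image-a")] * 139 + [Event("uptime", "42")]
filtered = keep_single_image_version(request)
print(len(request), len(filtered))
```

Keeping the first occurrence is an arbitrary choice in this sketch; since the duplicates are identical, keeping any one of them gives the same result.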

bochecha commented 4 years ago

@adarnimrod once this is deployed, can you run the following Azafea command in production?

$ azafea metrics-2 dedupe-image-versions

When I tested it here with data similar to prod, it took about 30 minutes to complete, and peak memory usage was a bit less than 1 GB.

After the run, the image_version table was about 1.6 times smaller.
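
For readers who want a feel for what such a deduplication pass amounts to, here is a rough sketch using an in-memory SQLite database. It is not the implementation behind `dedupe-image-versions`: the column names (`id`, `request_id`, `image_id`), the sample data, and the `dedupe_image_versions` function are assumptions made for this example, and the real schema and database engine may differ.

```python
import sqlite3

# Hypothetical, simplified schema: one row per stored image version event.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_version ("
    "  id INTEGER PRIMARY KEY,"
    "  request_id INTEGER NOT NULL,"
    "  image_id TEXT NOT NULL"
    ")"
)

# Made-up sample data: request 1 has 139 duplicates, request 2 has 3,
# request 3 is already clean.
rows = [(1, "image-a")] * 139 + [(2, "image-b")] * 3 + [(3, "image-a")]
conn.executemany(
    "INSERT INTO image_version (request_id, image_id) VALUES (?, ?)", rows
)


def dedupe_image_versions(conn: sqlite3.Connection) -> int:
    """Delete all but the lowest-id row for each request; return rows deleted."""
    cursor = conn.execute(
        "DELETE FROM image_version "
        "WHERE id NOT IN ("
        "  SELECT MIN(id) FROM image_version GROUP BY request_id"
        ")"
    )
    conn.commit()
    return cursor.rowcount


deleted = dedupe_image_versions(conn)
remaining = conn.execute("SELECT COUNT(*) FROM image_version").fetchone()[0]
print(deleted, remaining)
```

A single `DELETE` like this is the simplest form; on a large production table, processing in batches would be one way to keep memory usage and lock time bounded.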