mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

Add a memory logger #821

Closed gregtatum closed 1 month ago

gregtatum commented 1 month ago

I found this really useful for managing OOM issues in the merge mono task, since it measures memory usage while the task is running.

Example live log.

This PR is blocking submitting the HPLT importer, as it's using the memory logger as well.
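For readers who have not seen the diff, here is a minimal sketch of what a periodic memory logger can look like. It is not the implementation from this PR: the names `MemoryLogger`, `peak_rss_mib`, and `interval_sec` are hypothetical, and it assumes a Unix-like system, since it uses only the standard-library `resource` module. Note that `ru_maxrss` reports the peak resident set size, not the current one.

```python
import logging
import resource
import sys
import threading
import time


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


class MemoryLogger:
    """Hypothetical helper: logs peak memory from a background thread."""

    def __init__(self, interval_sec=5.0, logger=None):
        self.interval_sec = interval_sec
        self.logger = logger or logging.getLogger("memory")
        self.samples = []  # every logged value, kept for later inspection
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _sample(self):
        mib = peak_rss_mib()
        self.samples.append(mib)
        self.logger.info("[memory] peak RSS: %.1f MiB", mib)

    def _run(self):
        # Event.wait returns False on timeout and True once stop is requested,
        # so this loop samples every interval_sec until __exit__ is called.
        while not self._stop.wait(self.interval_sec):
            self._sample()

    def __enter__(self):
        self._sample()
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()
        self._sample()  # final reading after the wrapped work is done
        return False


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    with MemoryLogger(interval_sec=0.1):
        data = [bytes(1024) for _ in range(10_000)]  # stand-in for real work
        time.sleep(0.3)
```

Wrapping the memory-heavy part of a script in the `with` block puts the readings in the same log as the task output, both locally and in CI. If current rather than peak usage is needed, a third-party library such as `psutil` could be swapped in for `peak_rss_mib`.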

eu9ene commented 1 month ago

Nice idea! I agree it can be more convenient to look at memory allocation in the logs, but we also have the GCP dashboard where you can filter things by the machine ID. For example: translations-1-b-linux-large-gcp-300gb-gd4dp9-ht0mwubpmall1yq for your task.

[taskcluster 2024-08-30 00:14:09.606Z] Hostname: translations-1-b-linux-large-gcp-300gb-gd4dp9-ht0mwubpmall1yq

gregtatum commented 1 month ago

> Nice idea! I agree it can be more convenient to look at memory allocation in the logs.

I also like it because I can see it locally while I'm developing a script, which lets me check that my mental model of what's being loaded into memory is right. It was quite handy for getting some of my scripts working on production levels of data.