NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0
11.48k stars, 2.4k forks

RAM memory leaks for EncDecCTCModelBPE at inference #9428

Closed vikcost closed 3 weeks ago

vikcost commented 3 months ago

The numbers below suggest that EncDecCTCModelBPE leaks RAM during inference. I'm running inference with batch_size=1 on a directory of 10k short audio files with a total duration of around 9.5 hours.

Does anybody know where this excessive RAM usage comes from?

Here is a sample script:

import time
from pathlib import Path

import psutil
import torch
from nemo.collections.asr.models import EncDecCTCModelBPE

# Restrict PyTorch to a single intra-op CPU thread.
torch.set_num_threads(1)

# Handle to the current process, used to read its resident set size (RSS).
proc_usage = psutil.Process()
cur_usage_rss = proc_usage.memory_info().rss / 1024 / 1024  # bytes -> MiB
print(f"Initial rss usage:{cur_usage_rss}")

dir_name = "/mnt/data/subset_10k"
audio_paths = Path(dir_name)

def find_leaks():
    model = EncDecCTCModelBPE.from_pretrained(model_name="stt_en_conformer_ctc_medium")
    res_file_path = f"memory_usage3_start_{time.time()}.txt"

    # The context manager closes the log file even if transcription raises.
    with open(res_file_path, "w") as fd:
        files = audio_paths.glob("*.wav")
        for idx, a_path in enumerate(files):
            # One file per call, passed as a single-element list.
            model.transcribe([str(a_path)], batch_size=1, verbose=False)

            # Log RSS (in MiB) every 100 files.
            if idx % 100 == 0:
                cur_usage_rss = proc_usage.memory_info().rss / 1024 / 1024
                msg = f"{idx: <10} rss usage:{cur_usage_rss}"
                print(msg)
                print(msg, file=fd)

if __name__ == "__main__":
    find_leaks()

And a sample output (file index, then RSS in MiB, logged every 100 files):

0          rss usage:1433.66796875
100        rss usage:1561.16796875
200        rss usage:1584.16796875
300        rss usage:1596.66796875
400        rss usage:1607.16796875
500        rss usage:1629.9375
600        rss usage:1636.4375
700        rss usage:1660.2890625
800        rss usage:1660.2890625
900        rss usage:1661.2890625
1000       rss usage:1672.7890625
1100       rss usage:1680.7890625
1200       rss usage:1688.7890625
1300       rss usage:1695.7890625
1400       rss usage:1697.2890625
1500       rss usage:1706.2890625
1600       rss usage:1718.09765625
1700       rss usage:1718.09765625
1800       rss usage:1721.09765625
1900       rss usage:1725.09765625
2000       rss usage:1738.59765625
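
To put a number on the growth, the log above can be reduced to an average rate. The following is a minimal sketch, not part of the original report: it parses a few lines copied from the sample output and computes the mean RSS increase per file. The helper names `parse_log` and `growth_per_file` are hypothetical, and the first sample (index 0) is skipped on the assumption that it still includes one-off warm-up allocations.

```python
# A few lines copied from the sample output above (file index, RSS in MiB).
LOG = """\
0          rss usage:1433.66796875
100        rss usage:1561.16796875
1000       rss usage:1672.7890625
2000       rss usage:1738.59765625
"""


def parse_log(text):
    """Return a list of (file_index, rss_mib) tuples parsed from the log text."""
    samples = []
    for line in text.splitlines():
        if not line.strip():
            continue
        idx_part, rss_part = line.split("rss usage:")
        samples.append((int(idx_part), float(rss_part)))
    return samples


def growth_per_file(samples, skip_first=True):
    """Average RSS growth (MiB per file) between the first and last kept sample."""
    kept = samples[1:] if skip_first else samples
    (i0, r0), (i1, r1) = kept[0], kept[-1]
    return (r1 - r0) / (i1 - i0)


samples = parse_log(LOG)
rate = growth_per_file(samples)  # assumption: index 0 is treated as warm-up
print(f"{rate:.4f} MiB per file")
```

Between files 100 and 2000 this works out to roughly 0.09 MiB of additional RSS per transcribed file.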

Expected behavior

Constant memory consumption over time.
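
One way to check this expectation, and to narrow down where growth comes from, is to measure how much Python-level memory is retained across repeated calls using the standard `tracemalloc` module. The following is a minimal sketch, not part of the original report: `python_heap_growth` is a hypothetical helper, and `leaky` and `steady` are toy stand-ins for a real workload such as a `model.transcribe(...)` call. Note that `tracemalloc` only sees allocations made through Python's memory allocators, so if RSS grows while this number stays flat, the growth is more likely on the native side.

```python
import gc
import tracemalloc


def python_heap_growth(work, iterations=200, warmup=20):
    """Return the net bytes of Python-level allocations retained over `iterations` calls."""
    # Warm up first so one-off caches are not counted as growth.
    for _ in range(warmup):
        work()
    gc.collect()

    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        work()
    gc.collect()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


# Toy stand-ins for the real workload (hypothetical, for illustration only).
_cache = []


def leaky():
    # Keeps a reference to every buffer, so memory is never released.
    _cache.append(bytearray(10_000))


def steady():
    # The buffer is dropped immediately, so nothing accumulates.
    bytearray(10_000)


leaky_growth = python_heap_growth(leaky)
steady_growth = python_heap_growth(steady)
print(f"leaky: {leaky_growth} bytes, steady: {steady_growth} bytes")
```

With a real model, `work` could be a small closure that transcribes one fixed audio file, so every iteration does the same amount of work.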

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 4 weeks ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 3 weeks ago

This issue was closed because it has been inactive for 7 days since being marked as stale.