nlpodyssey / spago

Self-contained Machine Learning and Natural Language Processing library in Go
BSD 2-Clause "Simplified" License

Nearly 4 times the memory usage when compared to python for the same model #104

Status: Closed (abishekmuthian closed 10 months ago)

abishekmuthian commented 3 years ago

I ran memory profiling on the code from https://github.com/nlpodyssey/spago/issues/103. The spago version uses 3.9 GB, compared to 1.2 GB for the Python version. The model sizes are similar: valhalla/distilbart-mnli-12-3 is 2.5 GB after conversion to spago (with hf-importer), whereas the upstream Python version is 2.1 GB.

Memory profiling in spago

(image: memory_prof)

Memory profiling in Python

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
     7    217.3 MiB    217.3 MiB           1   @profile
     8                                         def classify():
     9   1227.3 MiB   1010.0 MiB           1       classifier = pipeline("zero-shot-classification", model="models/distilbart-mnli-12-3")
    10                                         
    11   1227.3 MiB      0.0 MiB           1       sequence = "PalmOS on Raspberry Pi"
    12   1227.3 MiB      0.0 MiB           1       candidate_labels = ["startup", "business", "legal", "tech"]
    13                                         
    14                                         
    15   1235.1 MiB      7.8 MiB           1       res = classifier(sequence, candidate_labels, multi_label=True, truncation=False)
    16                                         
    17   1235.1 MiB      0.0 MiB           5       for i, label in enumerate(candidate_labels):
    18   1235.1 MiB      0.0 MiB           4           print("%d. %s [%.2f]\n" % (i, res['labels'][i], res['scores'][i]))

Is this expected? Spago could be very useful in low-memory environments such as ARM SBCs for CPU-bound inference, but its memory usage needs to be optimized.

The Python version also seems faster overall, because loading the configuration and model weights takes a variable amount of time in spago.

matteo-grella commented 2 years ago

Hey @abishekmuthian

After this change, we measured the memory usage at 2 times that of Python for the same model.

We've probably found a way to reduce it further. I'll keep you posted!

abishekmuthian commented 2 years ago

That's good news! Bringing the memory consumption on par with Python would definitely motivate more people to adopt spago for their ML workflows.

matteo-grella commented 10 months ago

With the new release, the memory usage is close to that of the PyTorch version.