Load model files using the ijson streaming parser, if available, which avoids the need to hold multiple copies of large arrays in memory during the loading process. For a model like the spoken Italian one, this reduces the peak memory usage during model loading from over 3GB to around 1.5GB - little more than the model requires long-term once fully loaded.
Note: this algorithm requires that dict iteration order match insertion order, which is only guaranteed starting from Python 3.7 per spec and CPython 3.6 per implementation. On older Python versions, or if ijson is not available, fall back to the previous loading algorithm.
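The load-with-fallback described above might be sketched as follows. This is an illustrative outline, not the actual loader: the `load_model` function, the file layout, and the use of top-level key/value pairs are assumptions; only the ijson-with-json-fallback pattern and the Python version gate come from the description.

```python
import json
import sys


def load_model(path):
    """Load a JSON model file, streaming with ijson when possible.

    Streaming parses the file incrementally, so large arrays are built
    once, in place, instead of existing simultaneously as parser
    intermediates and as the final structure.
    (Hypothetical sketch; the real loader's structure may differ.)
    """
    # Dict insertion order is guaranteed by the spec from 3.7 and by
    # CPython from 3.6; older interpreters must use the fallback path.
    use_streaming = sys.version_info >= (3, 6)
    if use_streaming:
        try:
            import ijson
        except ImportError:
            use_streaming = False
    if use_streaming:
        with open(path, "rb") as f:
            # ijson.kvitems yields top-level (key, value) pairs one at
            # a time; inserting them as they arrive preserves file order.
            return {key: value for key, value in ijson.kvitems(f, "")}
    # Fallback: plain json materializes the whole document at once,
    # which is correct but has a higher peak memory footprint.
    with open(path) as f:
        return json.load(f)
```

Either path yields the same dict, so callers need not know which parser was used.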
Closes #9