Hello everyone,
Our team has noticed an unexpected change in mT5's memory usage during inference, and we would like some advice.
Previously, loading the model and predicting a label used about 10 GB of RAM, but now prediction uses only about 3 GB.
Could this be related to improvements in the architecture or size of T5 in recent months?
Has anyone else seen a similar drop in resource usage or model size, or should we be looking for another cause?