Kamikadashi opened 11 months ago
Am I doing something wrong, or is the pipeline reloading all the models every time an inference is made? This makes inference slow. If there is a way to keep them loaded in memory, I would be grateful to know it.
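For context, this is roughly the pattern I was hoping the pipeline already supported: load the models once and reuse the same object across calls. The names here (`get_pipeline`, `load_count`) are purely illustrative, not from the actual library.

```python
from functools import lru_cache

# Counts how many times the (simulated) expensive load runs,
# so we can verify it happens only once.
load_count = 0

@lru_cache(maxsize=1)
def get_pipeline():
    """Simulate loading the models once; later calls return the cached object."""
    global load_count
    load_count += 1
    return {"models": "loaded"}  # placeholder for the real pipeline object

def infer(prompt):
    pipeline = get_pipeline()  # cached after the first call
    return f"result for {prompt!r}"

infer("first request")
infer("second request")
print(load_count)
```

If the pipeline's loading were wrapped like this, every inference after the first would skip the model-loading step entirely.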