Open hyunsube opened 4 months ago
Do you mean keeping AI models resident in memory (or GPU memory)? Uploading a model to GPU memory causes some delay, and as you said, for some apps this delay may be negligible.
How about having apps declare their expected AI usage frequency when they register with MCA?
No, I mean launching the AI model binary automatically.
I've implemented D-Bus auto-launching via systemd, triggered when a D-Bus request is received from MCA.
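A minimal sketch of that kind of setup, using D-Bus service activation handed off to systemd. The bus name `com.example.AIModel`, the unit name `ai-model.service`, and the binary path are illustrative assumptions, not the actual names used here:

```ini
# /usr/share/dbus-1/system-services/com.example.AIModel.service
# D-Bus activation file: tells dbus-daemon to start the systemd unit
# the first time a message addressed to this bus name arrives.
[D-BUS Service]
Name=com.example.AIModel
Exec=/bin/false
SystemdService=ai-model.service

# /etc/systemd/system/ai-model.service
# systemd unit started on demand by the activation request above.
[Unit]
Description=On-demand AI model service

[Service]
Type=dbus
BusName=com.example.AIModel
ExecStart=/usr/bin/ai-model-service
```

With `Type=dbus`, systemd considers the service started once it has acquired the bus name, so the first D-Bus call from MCA both launches the process and is delivered to it.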
Keeping the AI model running at all times seems inefficient, because AI inference may not be requested very frequently. So the AI model should be launched on demand and terminated when idle.
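The terminate-when-idle half of that lifecycle could be handled by an idle watchdog inside the model service itself: reset a timer on every inference request and shut down once no request has arrived for a while. A minimal sketch in Python; the timeout value and the shutdown hook are illustrative assumptions (a real service would exit its main loop, letting D-Bus activation relaunch it on the next request):

```python
import threading

class IdleWatchdog:
    """Calls `on_idle` once no activity has been reported for `timeout` seconds."""

    def __init__(self, timeout, on_idle):
        self.timeout = timeout
        self.on_idle = on_idle
        self._timer = None
        self._lock = threading.Lock()

    def kick(self):
        # Reset the countdown; call this on every inference request.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.timeout, self.on_idle)
            self._timer.daemon = True
            self._timer.start()

# Demo: "shut down" after 0.2 s of inactivity (short for illustration).
stopped = threading.Event()
dog = IdleWatchdog(0.2, stopped.set)
dog.kick()            # a request arrives; the model stays loaded
stopped.wait(1.0)     # no further requests, so the watchdog fires
print(stopped.is_set())
```

Each `kick()` cancels the pending timer and starts a fresh one, so the shutdown callback only runs after a full quiet period.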