Open Thomasbehan opened 1 month ago
This will be fixed later this week when the M3.1s model is released. M3.1s is identical to M3.1 but quantized to run more efficiently.
🏷️ This will happen under the 3.1.1 tag.
Will be fixed in 3.2, alongside some model architecture changes.
Currently, the live demo runs out of memory when trying to run the latest model (M3.1).
The model works locally on most devices, but it is currently too large for the live demo, which only has 500 MB of memory.
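
To illustrate why a quantized variant could fit where the full-precision model does not, here is a rough back-of-the-envelope sketch. The parameter count is a made-up placeholder (the real size of M3.1 is not stated in this issue), the 500 MB limit is read as decimal megabytes, and only the weights are counted, so real memory use would be higher once activations and the runtime are included.

```python
# Rough estimate of weight memory at different precisions.
# PARAM_COUNT is a hypothetical value for illustration only;
# the actual parameter count of M3.1 is not given in this issue.
PARAM_COUNT = 150_000_000

# The live demo's 500 MB limit, interpreted here as decimal megabytes.
DEMO_MEMORY_BYTES = 500 * 10**6

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_bytes(param_count: int, dtype: str) -> int:
    """Return the bytes needed to hold the weights alone."""
    return param_count * BYTES_PER_PARAM[dtype]


def fits_in_demo(param_count: int, dtype: str, budget: int = DEMO_MEMORY_BYTES) -> bool:
    """Return True if the weights alone fit within the memory budget."""
    return weight_bytes(param_count, dtype) <= budget


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        size_mb = weight_bytes(PARAM_COUNT, dtype) / 10**6
        print(f"{dtype}: {size_mb:.0f} MB, fits: {fits_in_demo(PARAM_COUNT, dtype)}")
```

With the placeholder count above, the float32 weights alone would exceed the budget while an int8 version would fit with room to spare. Whether that holds for M3.1s depends on the real model size and the quantization scheme used.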