logancyang / obsidian-copilot

[Question] what machine should I use to run this with local LLMs? #439

Closed DaisukeMiyazaki closed 1 month ago

DaisukeMiyazaki commented 4 months ago

*Short answer could be: get a maxed-out M3 Max with 128GB RAM ...

This might be a silly question, but what would be the best or most reasonable machine for running this plugin with local LLMs? Especially after seeing the local beta QA feature, I'd like to get a rough sense of how much performance to expect from each machine, since these machines aren't cheap.

I've been testing on my M1 MacBook Air with 16GB RAM using LM Studio; however, streamed responses still take about 10s or more with n_gpu_layers set to 24 when answering questions. Indexing roughly 2000 files with local embeddings took a few minutes, which seemed reasonable to me.
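
For comparing machines with concrete numbers, one option is to time the stream directly against LM Studio's OpenAI-compatible local server. A minimal sketch, assuming the server is running on its default port 1234; the model name and API key below are placeholders (LM Studio serves whichever model is loaded and ignores the key):

```python
# Rough latency/throughput check against LM Studio's OpenAI-compatible local server.
# Assumes the server is enabled on the default http://localhost:1234; the SDK
# requires a non-empty API key string even though LM Studio ignores it.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

start = time.time()
first_token_at = None
chunk_count = 0

stream = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio uses whichever model is loaded
    messages=[{"role": "user", "content": "Summarize this note in three sentences: ..."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        if first_token_at is None:
            first_token_at = time.time()
        chunk_count += 1

total = time.time() - start
if first_token_at is not None:
    print(f"time to first token: {first_token_at - start:.1f}s")
    gen_time = max(total - (first_token_at - start), 1e-9)
    # Each streamed chunk is roughly one token, so this approximates tokens/sec.
    print(f"~{chunk_count / gen_time:.1f} tokens/sec after the first token")
```

Time to first token roughly matches the delay before a response starts appearing in the plugin, and tokens/sec gives a comparable number across machines and quantizations.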

Given the pace at which better 7B-and-larger models keep appearing, the quantization options available, and vendors like Apple releasing newer machines so frequently, it's not clear which machine is a good fit for long-term use. A rough memory estimate is sketched below.
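
For what it's worth, the back-of-the-envelope rule I've seen used to guess which model sizes fit in a given amount of unified memory (only an approximation; real usage also depends on the quantization format, context length, and KV cache):

```python
# Rough weight-memory estimate for a quantized model; ~4.5 bits/weight is a
# common ballpark for 4-bit quantizations. Actual usage also includes the KV
# cache and grows with context length, so treat these as lower bounds.
def approx_weight_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    # GB ~= params (in billions) * bits per weight / 8
    return params_billion * bits_per_weight / 8

for size_b in (7, 13, 70):
    print(f"{size_b}B @ ~4-bit: ~{approx_weight_gb(size_b):.0f} GB of weights")

# 7B  -> ~4 GB  (fits in 16 GB with room for the OS and Obsidian)
# 13B -> ~7 GB
# 70B -> ~39 GB (wants something like 64-128 GB of unified memory)
```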

logancyang commented 1 month ago

I can run Llama 3 70B on my MacBook Pro M3 Max with 96GB smoothly, though it gets a bit hot after a while. However, I predict we will have 13B or smaller models that are close to GPT-4 grade, at least for some domains, before the end of 2024. So on-device local LLMs are definitely going mainstream, IMO.