SilasMarvin / lsp-ai

LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.
MIT License
1.82k stars 55 forks

How to start model? #13

Closed PrimeTimeTran closed 2 weeks ago

PrimeTimeTran commented 2 weeks ago

I've installed LSP-AI on my local machine.

$ which lsp-ai

/Users/future/.cargo/bin/lsp-ai

But I don't know how to start it. When I run

$ /Users/future/.cargo/bin/lsp-ai             

The console hangs. If I press a key it exits immediately. Not sure what to do next. Please advise.

Thanks for your work!

SilasMarvin commented 2 weeks ago

Thanks for checking the project out.

You should never have to start lsp-ai yourself; you should configure your text editor to do it for you. If you are using VS Code, we have an official VS Code plugin. If you are using Helix or Neovim, we have some example configurations available for those editors. If you are using a different editor, you will have to check out your editor's documentation around adding a new language server.

PrimeTimeTran commented 2 weeks ago

Thanks for getting back so quickly!

I've already installed the plugin and tried both llama and OpenAI, but neither worked. I wrote a comment, then moved my cursor down and invoked the LSP-AI generate command, but nothing happens.

SilasMarvin commented 2 weeks ago

Of course. What editor are you using? Can you share your config?

PrimeTimeTran commented 2 weeks ago


Screenshot 2024-06-09 at 6 05 37 PM
"lsp-ai.serverConfiguration": {
    "memory": {
      "file_store": {}
    },
    "models": {
      "model1": {
        "type": "llama_cpp",
        "repository": "stabilityai/stable-code-3b",
        "name": "stable-code-3b-Q5_K_M.gguf",
        "n_ctx": 2048
      }
    }
  },
  "lsp-ai.generationConfiguration": {
    "model": "model1",
    "parameters": {
      "fim": {
        "start": "<fim_prefix>",
        "middle": "<fim_suffix>",
        "end": "<fim_middle>"
      },
      "max_context": 2000,
      "max_new_tokens": 32
    }
  },
  "lsp-ai.inlineCompletionConfiguration": {
    "maxCompletionsPerSecond": 1
  },

Hope this helps. I think it's working because I see a line added, but there's no other indicator.

Might it be because I'm on macOS and I used Metal with my install? There isn't any example config for llama with Metal, so I'm not sure what to put where. Thanks again.

SilasMarvin commented 2 weeks ago

Thank you! In your VS Code terminal in the Output tab, can you change the dropdown from Window to lsp-ai and send me what it prints out?

Are you on Mac? Did you install with llama_cpp and Metal support?

PrimeTimeTran commented 2 weeks ago

Yes, I did install with Metal per the suggestions in the README. Sorry, I was trying to illustrate the shortcut press. Here's the terminal output.

llama_kv_cache_init:      Metal KV buffer size =   640.00 MiB
llama_new_context_with_model: KV self size  =  640.00 MiB, K (f16):  320.00 MiB, V (f16):  320.00 MiB
llama_new_context_with_model:        CPU  output buffer size =     0.19 MiB
llama_new_context_with_model:      Metal compute buffer size =   152.00 MiB
llama_new_context_with_model:        CPU compute buffer size =     9.01 MiB
llama_new_context_with_model: graph nodes  = 1095
llama_new_context_with_model: graph splits = 2
llama_new_context_with_model: n_ctx      = 2048
llama_new_context_with_model: n_batch    = 2048
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init:      Metal KV buffer size =   640.00 MiB
llama_new_context_with_model: KV self size  =  640.00 MiB, K (f16):  320.00 MiB, V (f16):  320.00 MiB
llama_new_context_with_model:        CPU  output buffer size =     0.19 MiB
llama_new_context_with_model:      Metal compute buffer size =   152.00 MiB
llama_new_context_with_model:        CPU compute buffer size =     9.01 MiB
llama_new_context_with_model: graph nodes  = 1095
llama_new_context_with_model: graph splits = 2

SilasMarvin commented 2 weeks ago

Got it, that looks like it's working then. I think it might be a setting in your VS Code configuration.

Can you make sure Editor > Inline Suggest: Enabled is checked?

Also make sure the value of Editor > Quick Suggestions: Other is set to inline.
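
If it's easier to edit settings.json directly, these should be the equivalent entries (standard VS Code settings, not anything LSP-AI-specific):

"editor.inlineSuggest.enabled": true,
"editor.quickSuggestions": {
  "other": "inline"
}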

PrimeTimeTran commented 2 weeks ago

I've been wanting to find an open-source project to work on for some time now but hadn't had any luck. Felt like all the other apps have been handled. I hope to be able to contribute to this one since it's so early and I'm sure you could use help and have other ideas. I haven't spun up an Electron app in years though, lol. Also I know little about ML and even less about Rust, so yeah... appreciate your work, looking forward to this becoming a "have to have" app on VS Code.

SilasMarvin commented 2 weeks ago

I've been wanting to find an open-source project to work on for some time now but hadn't had any luck. Felt like all the other apps have been handled. I hope to be able to contribute to this one since it's so early and I'm sure you could use help and have other ideas. I haven't spun up an Electron app in years though, lol. Also I know little about ML and even less about Rust, so yeah... appreciate your work, looking forward to this becoming a "have to have" app on VS Code.

I am absolutely looking for help. I have a bunch of ideas and would love to talk about what areas of development interest you! There is a ton of really exciting things to do here.

If you are interested in the VS Code side of things, there is a lot still to do with the plugin we have written: https://github.com/SilasMarvin/lsp-ai/blob/main/editors/vscode/src/index.ts Copilot's plugin has a lot more built out features and while I don't necessarily think we need to match its features, I think there should be some discussion around what features we do want.

If you want to get more into the Rust / language server side of things, there is also a bunch to do there. I want to explore agent based systems and have some discussion around what integrating something like that directly into the backend would look like. I'm currently working on directory crawling and RAG.

PrimeTimeTran commented 2 weeks ago

I just tried restarting VS Code and I see this now.

ERROR lsp_ai::transformer_worker: generating response: environment variable not found

Screenshot 2024-06-09 at 6 20 47 PM

If I'm not mistaken, I don't need one if I'm using local llama?

SilasMarvin commented 2 weeks ago

Can you close VS Code, go to your terminal in a directory you want to open with VS Code, run export LSP_AI_LOG=DEBUG, and then run code .?

This will open VS Code with the environment variable LSP_AI_LOG=DEBUG, which enables debug logging for LSP-AI.
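
In other words, from that directory:

$ export LSP_AI_LOG=DEBUG
$ code .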

PrimeTimeTran commented 2 weeks ago

Regardless, I have the OpenAI one set, actually.

Screenshot 2024-06-09 at 6 22 18 PM
ERROR dispatch_request{request=Generation(GenerationRequest { id: RequestId(I32(1)), params: GenerationParams { text_document_position: TextDocumentPositionParams { text_document: TextDocumentIdentifier { uri: Url { scheme: "file", cannot_be_a_base: false, username: "", password: None, host: None, port: None, path: "/Users/future/Documents/Work/_Main/.Projects/experiment/main.py", query: None, fragment: None } }, position: Position { line: 4, character: 0 } }, model: "model1", parameters: Object {"max_context": Number(1024), "max_tokens": Number(128), "messages": Array [Object {"content": String("SOME CUSTOM SYSTEM MESSAGE"), "role": String("system")}, Object {"content": String("SOME CUSTOM USER MESSAGE WITH THE {CODE}"), "role": String("user")}]} } })}: lsp_ai::transformer_worker: generating response: environment variable not found
DEBUG lsp_server::msg: > {"jsonrpc":"2.0","id":1,"error":{"code":-32603,"message":"environment variable not found"}}    

SilasMarvin commented 2 weeks ago

It looks like you meant to set auth_token, not auth_token_env_var_name: https://github.com/SilasMarvin/lsp-ai/wiki/Configuration#openai-compatible-apis
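
For reference, the OpenAI model entry would look roughly like this (a sketch based on that wiki page; double-check the field names there):

"models": {
  "model1": {
    "type": "open_ai",
    "chat_endpoint": "https://api.openai.com/v1/chat/completions",
    "model": "gpt-4o",
    "auth_token": "YOUR_API_KEY"
  }
}

If you would rather read the key from an environment variable, use auth_token_env_var_name (for example "OPENAI_API_KEY") instead of auth_token, and make sure that variable is exported in the environment VS Code is launched from.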

PrimeTimeTran commented 2 weeks ago

If you are interested in the VS Code side of things, there is a lot still to do with the plugin we have written: https://github.com/SilasMarvin/lsp-ai/blob/main/editors/vscode/src/index.ts Copilot's plugin has a lot more built out features and while I don't necessarily think we need to match its features, I think there should be some discussion around what features we do want.

If you want to get more into the Rust / language server side of things, there is also a bunch to do there. I want to explore agent based systems and have some discussion around what integrating something like that directly into the backend would look like. I'm currently working on directory crawling and RAG.

Very interested in building something for progeny. Lemme know what your ideas are and I'll have a look at it? It's your baby so I feel you should have that kinda executive product power right now.

PrimeTimeTran commented 2 weeks ago

auth_token: that was it. Thanks~!

SilasMarvin commented 2 weeks ago

Glad I could help! Just a note: the responses from OpenAI right now are not going to be very good with the filler messages. You will want to replace those. I have a section on the wiki for prompting that may be worth looking at.

SilasMarvin commented 2 weeks ago

If you are interested in the VS Code side of things, there is a lot still to do with the plugin we have written: https://github.com/SilasMarvin/lsp-ai/blob/main/editors/vscode/src/index.ts Copilot's plugin has a lot more built out features and while I don't necessarily think we need to match its features, I think there should be some discussion around what features we do want. If you want to get more into the Rust / language server side of things, there is also a bunch to do there. I want to explore agent based systems and have some discussion around what integrating something like that directly into the backend would look like. I'm currently working on directory crawling and RAG.

Very interested in building something for progeny. Lemme know what your ideas are and I'll have a look at it? It's your baby so I feel you should have that kinda executive product power right now.

I appreciate your thoughts. If you want to participate, I want you to build out something you are excited about too! Check out the following capabilities of Copilot: https://code.visualstudio.com/docs/copilot/overview and let me know what you think about these capabilities and whether we should try to match them.

I'm going to mark this as closed but we can continue talking here, or you can create another issue for discussing a good feature to add.

PrimeTimeTran commented 2 weeks ago
Screenshot 2024-06-09 at 9 53 07 PM Screenshot 2024-06-09 at 9 53 09 PM

I really am interested. I even tried to figure out how to fork the wiki and update it earlier but went in circles and gave up. lol.

I've noticed the results are what I expected: far from perfect. For example, here the code blocks aren't complete.

I also noticed that when languages are "changed" the extension doesn't "catch up" quickly enough. For example, imagine I'm building an app in Flutter/Dart but I have Python scripts to do certain things. If I switch from Python to Dart, it still generates Python despite being in a .dart file.

Anyway, not trying to nitpick, just saying. In any case, I ended up poking around and saw the extension isn't even using Electron as I initially thought, but some other packages, so I need to spend some time getting familiar with those. Concerning the Copilot extension, undoubtedly I'd love to be able to add those features, so gimme some time to do some research on these libs.

SilasMarvin commented 2 weeks ago

What prompt are you using? The prompt will have a massive impact on how well the LLM performs.

The Python and Dart mixup is interesting. We don't actually specify the language the LLM is supposed to complete; we just pass it the code / comments around the cursor (this also depends on which prompt system you are using). So if you have Python or Dart code around the cursor it should complete correctly, but not always, as LLMs are not deterministic and do have some strange quirks.
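
For illustration, with a llama_cpp model and the FIM tokens from your earlier config, the prompt is assembled roughly like this (a sketch of the general FIM pattern, not the exact string LSP-AI builds):

<fim_prefix>code before the cursor<fim_suffix>code after the cursor<fim_middle>

The model fills in the middle, so the only language signal it gets is whatever code happens to sit around the cursor.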

That sounds great! Let me know what you think, and feel free to make a pull request anytime!

PrimeTimeTran commented 2 weeks ago

Isn't it on line 85, the comment?

My understanding is that we prompt the AI in comments. However, it's been a while since I used one, so correct me if I'm wrong.

SilasMarvin commented 2 weeks ago

Take a look at the prompting guide on the wiki: https://github.com/SilasMarvin/lsp-ai/wiki/Prompting

There is a lot more that goes into it.

PrimeTimeTran commented 2 weeks ago

and feel free to make a pull request anytime!

I'm looking forward to it. I think a killer feature which is missing in these types of tools (in my limited experimentation) is that we should be able to tag/label the file. The label/tag is then joined with the inline comments/prompts.

I expect models could produce better results with this additional feature/dimension. Like adding a comment on line 1 that's always sent with the subsequent prompt requests.

SilasMarvin commented 2 weeks ago

That is a really cool idea! We could add some kind of keyword like LSP-AI Tag and then search for it in the current file.

A user working on a Python file could use it like so:

# LSP-AI Tag: This file is implementing a fibonacci sequence

def fib

Then it could become a variable they can use when building the prompt.

E.G.

{
  "messages": [
    {
      "role":"system",
      "content":"You are a programming completion tool. Replace <CURSOR> with the correct code."
    },
    {
      "role": "user",
      "content": "{LSP-AI_TAG}\n{CODE}"
    }
  ]
}

PrimeTimeTran commented 2 weeks ago

Sure, I think the implementation could work like that. My main idea though is that by labeling a bunch of files, collectively they can be joined to deliver much more specific/fine-tuned results.

Imagine the README labels the app as a CMS, plus its framework & dependencies, and lsp-ai picks it up. We could then, like you said earlier, use RAG for better results on the subsequent methods/classes/files it generates code for.

I would hope that, with that in place, a dev could then create a new file named User.model.js and the model would be able to, without a prompt, generate a basic dependency definition for a user model (Sequelize vs Mongoose, for example).

I'm looking into the prompting dance per your suggestion and doing research on all the other plumbing like vscode-languageclient so gimme some time =).

SilasMarvin commented 2 weeks ago

This is a really interesting idea. Let me think on this some more and get back to you on it. I did just implement crawling of the code base in https://github.com/SilasMarvin/lsp-ai/pull/14, which will give us some good groundwork to build off of.

There definitely needs to be some thought put into how we build that kind of context for the model. Ideally it wouldn't have to be something the user does manually. It could be part of an agent system where we have an LLM whose job it is to tag files and build the context for the LLM that does completion?

I'll think more on this. Really good suggestions, thank you. I'm excited to work on this together!

PrimeTimeTran commented 2 weeks ago
Screenshot 2024-06-09 at 10 55 59 PM

I read the prompting docs you provided but still don't quite get it. Those looked like generation data to me, multiply & is_event for example.

I thought the prompts should be similar to what I've done on line 16? Tell the AI what I want (fix the error), and it picks up the context by itself.

Screenshot 2024-06-09 at 11 06 30 PM

Here's another example of what I think the expected behavior should be: that the AI can pick up comments and generate from there. Ideally it'd then stop doing what it does here, suggesting the same code despite other "prompts" not being delivered on.

Me too. Gimme some time to catch up on what you've already done & then I'll try to figure out how to "switch" the prompt/context. Unsure what else the explanation is for this kind of generation currently...

SilasMarvin commented 2 weeks ago

The prompts that I have written for OpenAI are designed to make it perform code completion. You can alter the prompts however you see best! If you want to tell it what you want and have it generate the code that way, you totally can. Just alter the messages you send.

I recommend testing your prompts in the OpenAI playground. I find that makes prompt building easier than constant trial and error editing the config.

PrimeTimeTran commented 2 weeks ago

Here's my generation prompt config.

"lsp-ai.generationConfiguration": {
    "model": "model1",
    "parameters": {
      "max_tokens": 2000,
      "max_context": 3000,
      "messages": [
        {
          "role": "system",
          "content": "You are a Generative AI fine tuned for software development. You've mastered mobile, web, and AI/ML concepts, languages, frameworks, & conventions. You help the user complete the code you're provided."
        },
        {
          "role": "user",
          "content": "{CODE}"
        }
      ]
    }
  },

The takeaway for me is that despite defining a much higher token allocation, it failed to provide the "complete" code. Not sure what the issue is: my prompt in the config, or the editor...?

Slow and steady, lol =)

SilasMarvin commented 2 weeks ago

That is strange that it is not providing the complete code. I would test in their playground and see if it works there. Maybe they have some hidden API limits? I don't think our request is malformed.

Slow and steady wins the race!