jupyterlab / jupyter-ai

A generative AI extension for JupyterLab
https://jupyter-ai.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Ollama starcoder2:15b inline completions model does not work #896

Open pedrogutobjj opened 1 month ago

pedrogutobjj commented 1 month ago

Hi everyone, I previously posted about this "error"; could anyone help me? When I try to use the inline completion model, nothing happens: GPU usage goes to practically 100%, yet no suggestion appears on the code line. Below are my inline completion configuration screens and my JupyterLab screens.

[screenshots: inline completion settings and JupyterLab]

krassowski commented 1 month ago
  1. Does chat work with the same model?
  2. Does completion work with a different, non-local model?
  3. Does setting completion streaming to "always" help?

My first guess would be that the model simply takes too long on your machine. Until you can confirm that it works in the chat but not with the completion, this is a reasonable assumption.
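One way to separate "model too slow" from an extension problem is to time the model outside JupyterLab entirely. A minimal sketch, assuming a local Ollama server on its default port (11434); the model name and prompt are placeholders:

```python
import json
import time
import urllib.request

# Time one non-streaming generation against a local Ollama server.
# The endpoint and payload follow Ollama's /api/generate REST API.
payload = json.dumps({
    "model": "starcoder2:15b",      # model under test
    "prompt": "def fibonacci(n):",  # any short code prompt
    "stream": False,
}).encode()

request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

start = time.perf_counter()
with urllib.request.urlopen(request) as response:
    result = json.load(response)

print(f"took {time.perf_counter() - start:.1f}s")
print(result["response"][:200])
```

If this request takes tens of seconds to return, the inline completer appearing to do nothing (while the GPU sits at 100%) is consistent with the model simply being too slow on this hardware.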

krassowski commented 1 month ago

Also, are you using the latest JupyterLab version?

pedrogutobjj commented 1 month ago
  1. For the chat model I'm using gemma2, and it works fine.

[screenshot]

  2. I tested some Hugging Face models a few weeks ago and they worked normally.

  3. Nothing happens when I select the option; the code suggestions still do not appear.

[screenshot]

krassowski commented 1 month ago

From your pictures it looks like you are expecting the completion to appear on a new line (based on the position of your cursor). Currently it works by completing the text you start typing on an existing line: you need to type 2 or 3 characters and wait. What exact version of JupyterLab are you using?

krassowski commented 1 month ago

My question is whether chat works with the exact same model, not with a different one. Conversely, if you try using Gemma for inline completion, does it work for you?

pedrogutobjj commented 1 month ago

I actually wait a bit to see if any code suggestions appear, but nothing happens.

My JupyterLab version:

[screenshot]

If I use the same model, like gemma2, for both the completion model and the inline completion model, it works fine!

[screenshot]

pedrogutobjj commented 1 month ago

[screenshot]

Testing starcoder2 as both models (completion model and inline completion model), nothing happens....

:(

pedrogutobjj commented 1 month ago

However, starcoder is much better than gemma2 for code suggestions.

pedrogutobjj commented 1 month ago

Another situation: when I wait for some code to "auto complete", it "resets" the code from the beginning, starting from the import, and a ```python marker appears. Is this normal? Could this be made more fluid?

[screenshot]
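The stray ```python marker suggests the model wraps its completion in a Markdown code fence. A minimal post-processing sketch that strips such a fence from a raw completion string; this is illustrative only, not jupyter-ai's actual code:

```python
import re

FENCE = "`" * 3  # the three-backtick Markdown fence marker

def strip_code_fence(completion: str) -> str:
    """Remove a wrapping Markdown code fence, with or without a language tag."""
    # Leading fence, optionally carrying a language tag such as "python".
    completion = re.sub(rf"^\s*{FENCE}[\w-]*\n?", "", completion)
    # Trailing fence.
    completion = re.sub(rf"\n?{FENCE}\s*$", "", completion)
    return completion

sample = FENCE + "python\nimport pandas as pd\n" + FENCE
assert strip_code_fence(sample) == "import pandas as pd"
```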

richlysakowski commented 1 month ago

How do I unsubscribe from this GitHub repository thread? I want to monitor activity more passively, because I get a message any time anyone posts anything, and it is maxing out my Gmail account every day.

I keep trying, but apparently I have only unsubscribed from a single thread?

Thank you.


krassowski commented 1 month ago

If other models work but starcoder does not, it is likely a problem with the model, or possibly your GPU having less memory than required to run it fast enough to be useful (though I see you managed to run deepseek-coder-v2:16b, so unless deepseek-coder is quantized and starcoder is not, that would not fit the hardware theory).

@richlysakowski on the main page of the repository (https://github.com/jupyterlab/jupyter-ai) you will see an "Unwatch" button, with options to watch only releases.

pedrogutobjj commented 1 month ago

@krassowski

When I type part of the code, the model returns several explanations plus the complete code, with explanations and so on, when I want it to complete only the missing part. For example, if I insert "import pandas as", I expect the model to complete it with "pd", but instead it repeats "import pandas as pd" and adds some random explanations. I just want it to complete what I'm writing, not rewrite everything. I don't know if my question was clear, or whether this is configurable.
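What is described here is the model echoing the already-typed prefix instead of returning only the continuation. A minimal sketch of trimming such an echo; `trim_echoed_prefix` is a hypothetical helper (assuming access to the text before the cursor), not part of jupyter-ai:

```python
def trim_echoed_prefix(prefix: str, suggestion: str) -> str:
    """If the model repeats the text before the cursor, keep only the new part."""
    # Full echo: the suggestion starts with everything already typed.
    if suggestion.startswith(prefix):
        return suggestion[len(prefix):]
    # Partial echo: e.g. only the current line was repeated.
    for size in range(len(prefix) - 1, 0, -1):
        if suggestion.startswith(prefix[-size:]):
            return suggestion[size:]
    return suggestion

# "import pandas as" typed, model answers with the whole line again:
print(trim_echoed_prefix("import pandas as", "import pandas as pd"))  # -> " pd"
```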

krassowski commented 1 month ago

If you use streaming mode, this should have been fixed in https://github.com/jupyterlab/jupyter-ai/pull/879 (which will be included in the 2.19 release).

pedrogutobjj commented 1 month ago

Thanks very much!

Is this release launching today?

pedrogutobjj commented 1 month ago

I'm using the new release, 2.19, and I have the same problem with completions: ```python still appears. Has the "bug" not been fixed?

[screenshot]

pedrogutobjj commented 1 month ago

[screenshot]

Using codegemma...

Is there any Ollama model that has been tested and confirmed to complete lines correctly?

krassowski commented 1 month ago

Thanks for testing, the Ollama provider is experimental so there may be issues to iron out. Things I would suspect:

  • a) the models may be not very good at respecting instructions and not generating the expected output (especially given that some of the models you listed above are rather small, 9b or 15b), or
  • b) there is some issue with new-line endings, Windows-style vs Unix-style

Is there any ollama model that has been tested and passed the tests to complete the lines?

You already know the answer, as it was provided in https://github.com/jupyterlab/jupyter-ai/pull/646#issuecomment-2226135337. Otherwise, there are no systematic tests for individual models.

krassowski commented 1 month ago

I can reproduce the prefix trimming issue with all providers in 2.19.0, whether streaming or not.

krassowski commented 1 month ago

For some reason, in 2.19.0 the suggestion includes an extra space. This is a log from GPT-4 without streaming (logic which should not have changed since 2.18):

[screenshot]

krassowski commented 1 month ago

Ah no, this was still ollama with phi, not GPT-4. So it looks like ollama output parsing may be off by a spurious whitespace at the beginning.
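A minimal sketch of the kind of normalization that would address both the spurious leading whitespace and the Windows-vs-Unix line endings suspected earlier; where the actual fix belongs in the Ollama output parsing is an assumption:

```python
def normalize_completion(raw: str) -> str:
    """Normalize raw model output before presenting it as a suggestion."""
    # b) unify Windows-style line endings with the Unix-style ones the editor expects
    text = raw.replace("\r\n", "\n")
    # drop the single spurious leading space observed in the Ollama output
    return text.removeprefix(" ")
```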

pedrogutobjj commented 1 month ago

Thanks for testing, the Ollama provider is experimental so there may be issues to iron out. Things I would suspect:

  • a) the models may be not very good at respecting instructions and not generating the expected output (especially given that some of the models you listed above are rather small, 9b or 15b), or
  • b) there is some issue with new-line endings, Windows-style vs Unix-style

Is there any ollama model that has been tested and passed the tests to complete the lines?

You already know the answer, as it was provided in #646 (comment). Otherwise, there are no systematic tests for individual models.

I liked some of the results that the gemma2:9b model gave me, so we could go more in-depth on that model. I test some of the most popular models daily and analyze their responses; as I test them I will report here or in a separate, specific topic.

krassowski commented 1 month ago

@pedrogutobjj did you have a chance to test the latest release, v2.19.1, which includes #900? Is it any better?

pedrogutobjj commented 1 month ago

Hey @krassowski, morning!

I tested some lines of code this morning; here are the results.

[screenshot]

krassowski commented 1 month ago

Is this what you would expect or not? To me it looks like a syntactically valid response. Of course it is a bit useless, but that is down to the ability of the model you use.

pedrogutobjj commented 1 month ago

The logic is correct; I mean the overlaps, I don't know if I was clear about this. For example: I inserted "def sum_matrizes(matrix1, matrix2):", then comes the autocomplete part, and it repeats "def sum_matrizes" again as a suggestion, even though I had already typed it at the beginning of the code. I don't know if I managed to make my point clear.

krassowski commented 1 month ago

I see that, but it looks like the model is at fault here. It first inserted "import numpy as" and only then started the "def sum_matrizes(matrix1, matrix2):" part again.

Previously you were testing with deepseek-coder and codegemma, but now you posted a result from llama3.1. If we want to see whether the changes helped with the issue you reported back then, can you test with the same models?

pedrogutobjj commented 1 month ago

With deepseek-coder:

[screenshot]

With codegemma:7b:

[screenshot]

krassowski commented 1 month ago

Thanks! Just to help me reproduce: where was your cursor when you invoked the inline completer?

pedrogutobjj commented 1 month ago

At the top of the cell.

krassowski commented 1 month ago

Do you mean that your cursor was at the end of the first line here:

def soma_matrices(matriz1, matriz2):|

or in the new line:

def soma_matrices(matriz1, matriz2):
|

or in the new line after tab:

def soma_matrices(matriz1, matriz2):
    |
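For reference, these three cursor positions mean three different prefixes reach the completion model. An illustration, treating the prefix simply as the cell text before the cursor (an assumption for illustration, not a statement of jupyter-ai internals):

```python
# Cursor at the end of the first line:
prefix_end_of_line = "def soma_matrices(matriz1, matriz2):"

# Cursor on a fresh new line below:
prefix_new_line = "def soma_matrices(matriz1, matriz2):\n"

# Cursor on a new line, indented after Tab:
prefix_indented = "def soma_matrices(matriz1, matriz2):\n    "
```
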
pedrogutobjj commented 1 month ago

[screenshot]

pedrogutobjj commented 1 month ago

[screenshot]