TabbyML / tabby

Self-hosted AI coding assistant
https://tabby.tabbyml.com/

Indexing so slow? #2494

Closed brian316 closed 16 hours ago

brian316 commented 4 days ago

Describe the bug

Indexing takes over an hour. I am using OpenAI's text-embedding-3-small for my embeddings. The logs are also so sparse that I don't get any useful information. Is there a way to turn on more detailed logging? I don't see anything in the docs.

Information about your version

Version 0.12.0

Information about your GPU

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.99                 Driver Version: 555.99         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2070 ...  WDDM  |   00000000:2D:00.0  On |                  N/A |
| 15%   55C    P0             46W /  215W |    1490MiB /   8192MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

Additional context

My system:

My docker-compose file

version: '3.5'

services:
  tabby:
    restart: always
    image: tabbyml/tabby
    command: serve --device cuda
    environment:
      - TABBY_DISABLE_USAGE_COLLECTION=1
    volumes:
      - "C:/Users/brian/.tabby:/data"
    ports:
      - 8080:8080
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

The only logs I get are not very useful (two screenshots attached).

The OpenAI dashboard shows some usage but lots of idle time (screenshot attached).

I am running on Windows 11, which shows low to no usage (screenshot attached).

wsxiaoys commented 4 days ago

In 0.12 we used very conservative parallelism for the HTTP embedding backend, so the slowness is somewhat expected. Could you share the scale of your GitHub repositories (e.g., number of lines and number of files)?
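To report the repository scale asked about above, a small script can count files and lines in a local checkout. This is only a minimal sketch: the function name `repo_scale` and the list of skipped directories are assumptions made for illustration, and the script counts every file it can read, which is not necessarily the set of files Tabby indexes.

```python
import os


def repo_scale(root, skip_dirs=(".git", "node_modules", "target")):
    """Return (file_count, line_count) for all readable files under root.

    skip_dirs is a hypothetical default; adjust it to match your repository.
    """
    files = 0
    lines = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Binary mode avoids decoding errors; lines are split on b"\n".
                with open(path, "rb") as fh:
                    count = sum(1 for _ in fh)
            except OSError:
                continue  # unreadable file: leave it out of both totals
            files += 1
            lines += count
    return files, lines
```

Calling `repo_scale("path/to/checkout")` returns a `(files, lines)` tuple that can be pasted into the issue.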

Filing https://github.com/TabbyML/tabby/issues/2495 to track the logging improvement.

One thing you might try is setting RUST_LOG=debug inside the container (for example, by adding it to the environment section of your docker-compose file), which should give you more information on the progress.

wsxiaoys commented 1 day ago

Just FYI, we have also worked on some concurrency improvements to make indexing faster. If you're interested, consider giving 0.13.0-rc4 a try to see whether it improves the situation for you.
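As a rough illustration of why request concurrency matters for an HTTP embedding backend, the sketch below compares one request in flight with several. It is not Tabby's implementation: `fake_embed` is a hypothetical stand-in that sleeps instead of calling a real API, and the 50 ms latency is an assumed figure.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_embed(chunk, latency=0.05):
    """Hypothetical stand-in for one HTTP embedding request (no network)."""
    time.sleep(latency)  # simulate the network round trip
    return [float(len(chunk))]  # dummy one-dimensional "embedding"


def embed_all(chunks, max_workers):
    """Embed every chunk with at most max_workers requests in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(fake_embed, chunks))


def timed(chunks, max_workers):
    """Return (embeddings, elapsed_seconds) for one run."""
    start = time.perf_counter()
    result = embed_all(chunks, max_workers)
    return result, time.perf_counter() - start
```

With latency-bound requests like these, `timed(chunks, 1)` takes roughly the sum of all latencies, while a larger `max_workers` overlaps the waits. Real gains depend on the provider's rate limits.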

wsxiaoys commented 16 hours ago

0.13.0 is now released, and this situation should be significantly improved. Feel free to reopen the issue if you still encounter slow indexing.

brian316 commented 15 hours ago

I'll have to check it out, thanks!