spotify / annoy

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
Apache License 2.0

Slow nearest-neighbor search in docker-compose #547

Closed: fortunto2 closed this issue 2 years ago

fortunto2 commented 3 years ago

Hi. Inside Docker, search_index.get_nns_by_vector takes more than 30 seconds per call; outside Docker, in a notebook on the same machine, it takes under 1 second. Why?

Google Cloud VM: 16 GB RAM, 4 cores.

1 million images, 512-dimensional vectors; the search index is 2.5 GB on disk.
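A minimal timing sketch for reproducing the comparison (assumes the 512-dimensional index at models/search_index.ann described above; run it both inside and outside the container):

import random
import time

from annoy import AnnoyIndex

f = 512  # vector dimensionality
index = AnnoyIndex(f, 'angular')
index.load('models/search_index.ann')

vec = [random.gauss(0, 1) for _ in range(f)]
for i in range(3):
    t0 = time.perf_counter()
    index.get_nns_by_vector(vec, 20, search_k=-1, include_distances=True)
    print(f'query {i}: {time.perf_counter() - t0:.3f}s')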

docker-compose

  api: &backend
    image: '${DOCKER_IMAGE_API?Variable not set}:${TAG-latest}'
    depends_on:
      - db
    env_file:
      - .env
    volumes:
      - ./backend:/app
      - ./models:/app/models
    ports:
      - 80:8000
    build:
      context: backend
      dockerfile: Dockerfile
      args:
        INSTALL_DEV: ${INSTALL_DEV-false}
    networks:
      - default
    command: uvicorn main:app --reload --host 0.0.0.0 --port 8000

Dockerfile

FROM tiangolo/uvicorn-gunicorn:python3.8

RUN mkdir /fastapi

RUN python -m pip install poetry
COPY poetry.lock pyproject.toml /app/
WORKDIR /app

# Project initialization:
RUN poetry config virtualenvs.create false && poetry install --no-interaction --no-ansi

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]

docker info

Client:
 Debug Mode: false

Server:
 Containers: 9
  Running: 4
  Paused: 0
  Stopped: 5
 Images: 99
 Server Version: 20.10.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-1042-gcp
 Operating System: Ubuntu 20.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 14.65GiB
 Name: predictor
 ID: BTJ4:ORKM:4NXL:4YOH:UWVA:MI7J:2IEV:ZW6K:VXM5:DN4E:B34L:R3CB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

pyproject.toml

[tool.poetry]
name = "backend posred"
version = "0.1.0"
description = ""
authors = ["fortunto2 <info@3dstr.ru>"]

[tool.poetry.dependencies]
python = "^3.8"
fastapi = "^0.63.0"
uvicorn = "^0.13.4"
odmantic = "^0.3.4"
requests = "^2.25.1"
typer = "^0.3.2"
httpx = "^0.17.1"
aiofiles = "^0.6.0"
iopath = "^0.1.8"
python-multipart = "^0.0.5"
celery = "^5.0.5"
redis = "^3.5.3"
raven = "^6.10.0"
orjson = "^3.5.2"
Jinja2 = "^2.11.3"
pydantic = {extras = ["dotenv", "email"], version = "^1.8.1"}
phonenumbers = "^8.12.21"
vkbottle = "^2.7.12"
#pip install -U https://github.com/timoniq/vkbottle/archive/master.zip
Pillow = "^8.2.0"
imageio = "^2.9.0"
pandas = "^1.2.4"
annoy = "^1.17.0"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

....
f = 512  # vector dimensionality of the image embeddings
search_index = AnnoyIndex(f, 'angular')
search_index.load('models/search_index.ann')

predict_server_url = settings.NEURAL_API

@celery_app.task
def search_items(file):
    def search_similar_images(image_vector, n=20):
        print('--- start search...')
        similar_img_ids, scores = search_index.get_nns_by_vector(
            image_vector,
            n,
            search_k=-1,
            include_distances=True
        )
        print(similar_img_ids, scores)

        print('--- end search')
        return similar_img_ids, scores

    assert file

    if not isinstance(file, (bytes, bytearray)):
        try:
            # treat the argument as a URL first
            response = urllib.request.urlopen(file)
            b = response.read()
            del response
        except Exception:
            # fall back to treating it as a local file path
            with open(file, "rb") as image:
                b = bytearray(image.read())
    else:
        b = file

    # send to the neural predictor and get a vector for the image
    response = requests.post(url=predict_server_url, files={"file": b})

    del b

    response_model = ResponseSingleLabel(**response.json())

    # search similar
    similar_img_ids, scores = search_similar_images(response_model.vector[0])

    # select from db by similar ID (materialize the cursor so the Celery result is serializable)
    photos = list(db['photos'].find({'search_index': {"$in": similar_img_ids}}, {'_id': 1, 'image_id': 1, 'post_id': 1, 'search_index': 1}))

    return photos

if __name__ == "__main__":
    file_path = 'data/test_image/1.jpg'

    x = search_items(file_path)
    print(x)

def read_mongo(collection, query=None, limit=1000):
    """ Read from Mongo and store the result in a DataFrame """
    query = query if query is not None else {}  # avoid a shared mutable default

    # Make a query to the specific DB and Collection
    cursor = db[collection].find(query).limit(limit)

    # Expand the cursor and construct the DataFrame
    df = pd.DataFrame(list(cursor))

    return df

def create_search_index(ntree=100, limit=1000, vector_size=512, index_path='models/search_index.ann'):
    from annoy import AnnoyIndex
    from tqdm import tqdm

    photos_df = read_mongo('photos', query={"status": "predicted"}, limit=limit)

    search_index = AnnoyIndex(vector_size, metric='angular')

    for i, row in tqdm(photos_df.iterrows()):
        search_index.add_item(i, row['imagenet'])

    search_index.build(ntree, n_jobs=-1)
    search_index.save(index_path)
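
Side note: for an index of this size, Annoy can also build directly on disk instead of in RAM via on_disk_build (part of Annoy's documented API). A sketch of the same build using it, assuming the photos_df produced above:

from annoy import AnnoyIndex

def create_search_index_on_disk(photos_df, ntree=100, vector_size=512, index_path='models/search_index.ann'):
    index = AnnoyIndex(vector_size, 'angular')
    # stream the index to index_path while building instead of holding it in RAM;
    # must be called before any add_item
    index.on_disk_build(index_path)
    for i, row in photos_df.iterrows():
        index.add_item(i, row['imagenet'])
    index.build(ntree)  # no save() needed; the file is already on disk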

erikbern commented 3 years ago

No idea. Is it possible the index doesn't fit in memory? That can cause a dramatic slowdown.
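
A quick way to check (a sketch; the index path is the one used above, and the container limit path assumes cgroup v1):

import os

index_bytes = os.path.getsize('models/search_index.ann')
print(f'index size: {index_bytes / 2**30:.2f} GiB')

# memory visible to the kernel (inside a container this is usually the host's)
with open('/proc/meminfo') as fh:
    for line in fh:
        if line.startswith('MemAvailable'):
            print('MemAvailable:', line.split(':', 1)[1].strip())

# memory limit imposed on the container, if any (cgroup v1 path)
limit_path = '/sys/fs/cgroup/memory/memory.limit_in_bytes'
if os.path.exists(limit_path):
    with open(limit_path) as fh:
        print(f'cgroup limit: {int(fh.read()) / 2**30:.2f} GiB')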

mmaktify commented 2 years ago

I am seeing the same problem. It appears that inside Docker, Annoy can't take advantage of multiple cores even when the container has access to them. I'm looking into how to fix it.

erikbern commented 2 years ago

Annoy doesn't use multiple cores (although you can use multiple threads yourself)

I think the slowdown is probably caused by swapping. Try running the get_nns_ call multiple times. Does it get faster and faster? It should.
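
Two things follow from this (a sketch based on Annoy's documented API, not on anything specific to this setup): load() accepts prefault=True to touch every page of the index up front, and since Annoy releases the GIL during queries in recent versions, you can parallelize lookups yourself with plain threads. vectors_to_search below is a placeholder for whatever batch you want to look up.

from concurrent.futures import ThreadPoolExecutor

from annoy import AnnoyIndex

f = 512
index = AnnoyIndex(f, 'angular')
# prefault=True reads every page of the mmap'd file at load time, so the
# first queries don't stall on page faults (only helps if the index fits in RAM)
index.load('models/search_index.ann', prefault=True)

def query(vec):
    return index.get_nns_by_vector(vec, 20, include_distances=True)

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(query, vectors_to_search))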

barrycarey commented 2 years ago

I'll just add that it works fine in Docker, assuming your index fits in memory. I've been running it in Docker for almost 2 years. My index holds over 300 million items and searches generally take ~200 ms. If my index exceeds available memory, searches take 30+ seconds.