arkohut / pensieve

A passive recording project that gives you complete control over your data. It automatically takes screenshots of all your screens, indexes them, and saves them locally.
Apache License 2.0

unusually high amount of memory #11

Open xyb opened 1 week ago

xyb commented 1 week ago

After running the serve process for an extended period, I noticed it was consuming an unusually high amount of memory, approximately 5.5GB on my laptop. This suggests a potential memory leak that we should investigate and resolve.

[Image: Screenshot 2024-11-19 at 14 25 15]

❯ ps aux|grep python3
xyb              16496  37.4  0.9 421123904 145216   ??  R     4:25PM  41:36.85 /Users/xyb/.virtualenvs/pensieve/bin/python3 -m memos.commands serve
xyb              16627  10.5  0.4 412157328  64816   ??  S     4:25PM   2:26.19 /Users/xyb/.virtualenvs/pensieve/bin/python3 -m memos.commands watch
xyb              17026   0.0  0.0 410825856    192   ??  S     4:26PM   0:00.03 /Users/xyb/.virtualenvs/pensieve/bin/python3 -c from multiprocessing.resource_tracker import main;main(15)
xyb              16956   0.0  0.0 410966144    192   ??  S     4:26PM   0:00.02 /Users/xyb/.virtualenvs/pensieve/bin/python3 -c from multiprocessing.resource_tracker import main;main(13)
xyb              16495   0.0  0.2 412606000  28592   ??  S     4:25PM   7:37.52 /Users/xyb/.virtualenvs/pensieve/bin/python3 -m memos.commands record
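
For comparison with the figure shown in the screenshot, the resident set size (RSS) in the `ps aux` output above can be converted to MiB. This is a minimal sketch, assuming the usual `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...) with RSS reported in KiB; the helper name `rss_mib` is hypothetical, and the sample row is the `serve` line above with its whitespace collapsed.

```python
def rss_mib(ps_line: str) -> float:
    """Return the RSS column of one `ps aux` row, converted from KiB to MiB."""
    # Assumed columns: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
    fields = ps_line.split(None, 10)
    return int(fields[5]) / 1024


# The `serve` row from the output above (whitespace collapsed).
serve_row = (
    "xyb 16496 37.4 0.9 421123904 145216 ?? R 4:25PM 41:36.85 "
    "/Users/xyb/.virtualenvs/pensieve/bin/python3 -m memos.commands serve"
)

print(f"{rss_mib(serve_row):.1f} MiB")  # prints "141.8 MiB"
```

RSS at the moment `ps` ran and the number a graphical monitor reports are not necessarily the same metric, so the two values may legitimately differ.
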
arkohut commented 1 week ago

Thanks for sharing that. Could you please share the current config on your computer? And do you use the VLM feature?

As far as I know, some embedding models may have this kind of issue. For example, I ran into it when using https://huggingface.co/jinaai/jina-embeddings-v3

xyb commented 1 week ago

Here's my configuration:

base_dir: ~/.memos
database_path: database.db
default_library: screenshots
screenshots_dir: screenshots

server_host: 0.0.0.0
server_port: 8839

# Enable authentication by uncommenting the following lines
# auth_username: admin
# auth_password: changeme

default_plugins:
- builtin_ocr
- builtin_vlm

# using ollama as the vlm server
vlm:
  concurrency: 8
  endpoint: http://localhost:11434
  force_jpeg: true
  modelname: minicpm-v
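  # prompt in English: "Please describe the content of this image, including the layout of the scene and the visual elements that appear"
  prompt: 请帮描述这个图片中的内容,包括画面格局、出现的视觉元素等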
  token: ''

# using local ocr
ocr:
  concurrency: 8
  # this is not used if use_local is true
  endpoint: http://localhost:5555/predict
  force_jpeg: false
  token: ''
  use_local: true

# using local embedding
embedding:
  # this is not used if use_local is true
  endpoint: http://localhost:11434/api/embed
  model: arkohut/jina-embeddings-v2-base-zh
  num_dim: 768
  use_local: true
  use_modelscope: false
arkohut commented 6 days ago

[Image: Screenshot 2024-11-21 14 26 04]

After tracking memory usage in embedding.py, I found that the memory usage reaches a peak and then quickly falls back down. However, the monitor keeps showing the peak value. Clicking the Details button shows the total usage, which looks like this:

[Image: Screenshot 2024-11-21 14 30 58]
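
The peak-then-fall pattern described above can be illustrated with Python's standard `tracemalloc` module, which reports both the current and the peak traced size. This is only a sketch: the thread does not say how memory was actually tracked in embedding.py, and `measure` and `transient_allocation` are hypothetical stand-ins, not code from the project. Note also that `tracemalloc` only sees allocations made through Python's allocators, so memory allocated natively by a model library would not appear here.

```python
import tracemalloc


def measure(fn):
    """Run fn() and return (current_bytes, peak_bytes) as traced by tracemalloc."""
    tracemalloc.start()
    try:
        fn()
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current, peak


def transient_allocation():
    # Stand-in for an embedding call: allocate a large temporary
    # buffer (50 MiB), then release it before returning.
    buffer = bytearray(50 * 1024 * 1024)
    del buffer


current, peak = measure(transient_allocation)
print(f"current={current / 2**20:.1f} MiB, peak={peak / 2**20:.1f} MiB")
```

Here the peak stays at or above 50 MiB while the current value drops back close to zero once the buffer is released, which is the same shape as a short-lived spike that a monitor continues to display.
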