Open ssa38 opened 10 months ago
By default `--pre_load_embedding_model=True`, so every collection should share a single embedding model instance as long as the embedding model name is the same. Let me check on it.
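This is not h2ogpt's actual internals, but a minimal sketch of the idea behind `--pre_load_embedding_model=True`: cache one embedding model per model name so every collection reuses the same instance instead of loading its own ~1.5GB copy onto the GPU. The `load_embedding_model` and `Collection` names here are hypothetical stand-ins, and the model name is only illustrative.

```python
from functools import lru_cache

LOAD_COUNT = 0  # track how many real model loads happen


@lru_cache(maxsize=None)
def load_embedding_model(model_name: str):
    """Hypothetical loader: the expensive call that would allocate GPU memory."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return object()  # stand-in for the real embedding model


class Collection:
    """Each collection holds a reference to the shared model, not its own copy."""

    def __init__(self, name: str, model_name: str):
        self.name = name
        self.model = load_embedding_model(model_name)


# Seven collections, one embedding model name -> one load, flat GPU usage.
collections = [Collection(f"col{i}", "hkunlp/instructor-large") for i in range(7)]
print(LOAD_COUNT)  # 1
```

With this pattern, adding more collections costs almost nothing on the GPU; only a second distinct model name would trigger another load.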
I created 7 collections total, adding a PDF or an image to each. I only saw a GPU memory increase when adding an image, since that uses the image model. But after all 7 collections I'm only seeing up to 4.9GB, with no increase for each new collection.
```
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1548      G   /usr/lib/xorg/Xorg                          156MiB |
|    0   N/A  N/A      2236      G   /usr/lib/xorg/Xorg                          914MiB |
|    0   N/A  N/A      2375      G   /usr/bin/gnome-shell                        127MiB |
|    0   N/A  N/A      7204      G   /usr/bin/nvidia-settings                      0MiB |
|    0   N/A  N/A      7567      G   gnome-control-center                          4MiB |
|    0   N/A  N/A      8532      G   ...1815146,10427758287291803364,262144      159MiB |
|    0   N/A  N/A    289037      G   ...ures=SpareRendererForSitePerProcess       35MiB |
|    0   N/A  N/A    557082      G   obs                                          37MiB |
|    0   N/A  N/A    622485      C   ...niconda3/envs/h2ollm/bin/python3.10     4910MiB |
|    1   N/A  N/A      1548      G   /usr/lib/xorg/Xorg                            4MiB |
|    1   N/A  N/A      2236      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+
```
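Almost all of that usage comes from the single python process (4910MiB). As a quick sanity check when comparing runs, the per-process memory column of an `nvidia-smi` dump like the one above can be summed with a few lines of Python (a parsing sketch only, not an h2ogpt utility):

```python
import re


def total_gpu_mib(nvidia_smi_text: str, gpu_id: int = 0) -> int:
    """Sum the per-process GPU memory (MiB) reported for one GPU."""
    total = 0
    for line in nvidia_smi_text.splitlines():
        # Process rows look like:
        # "|    0   N/A  N/A    622485      C   ...python3.10     4910MiB |"
        m = re.match(r"\|\s+(\d+)\s+.*?(\d+)MiB\s+\|", line)
        if m and int(m.group(1)) == gpu_id:
            total += int(m.group(2))
    return total


sample = """\
|    0   N/A  N/A      1548      G   /usr/lib/xorg/Xorg          156MiB |
|    0   N/A  N/A    622485      C   ...h2ollm/bin/python3.10   4910MiB |
|    1   N/A  N/A      1548      G   /usr/lib/xorg/Xorg            4MiB |
"""
print(total_gpu_mib(sample))  # 5066
```

Running it once per added collection makes a per-collection memory delta easy to spot.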
Maybe it's when making the db via `make_db` that there's some issue...
After I got the out-of-memory error, I started adding collections one by one and checking the memory usage. Below are the screenshots after adding the 4th, 5th, and 6th collections. Each new collection added 1.3-1.7GB of memory usage.
I'm trying to run `generate.py` with a few collections I created with `make_db`. I can't add more than 6 collections with my current configuration (Tesla T4 with 16GB of VRAM) because I run out of memory. It seems each collection uses up to 1.5GB of GPU memory even if it takes only 100KB on disk. Is there a way to reduce the memory allocation per collection?
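The observed limit is consistent with a simple back-of-the-envelope calculation. The numbers below are assumptions taken from this thread (roughly 1.5GB per collection plus a fixed baseline for the LLM and base process), not measurements:

```python
def max_collections(vram_gb: float, baseline_gb: float, per_collection_gb: float) -> int:
    """Rough estimate of how many collections fit before GPU memory runs out."""
    return int((vram_gb - baseline_gb) / per_collection_gb)


# Tesla T4: 16GB total; assume ~7GB for the model/base process and
# ~1.5GB per extra collection (illustrative figures, not measured).
print(max_collections(16, 7, 1.5))  # 6
```

If collections shared one embedding model as intended, `per_collection_gb` would be near zero and the limit would disappear, which is why per-collection growth points at duplicated model loads.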
My CLI:
I make shared collections the following way: