PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
Apache License 2.0

running out of resources? #25

Open vipervs opened 1 year ago

vipervs commented 1 year ago

I have the following problem. I'm on a MacBook Air M2 with 16GB RAM.

➜ localGPT git:(main) ✗ python run_localGPT.py --device_type cpu
Running on: cpu
load INSTRUCTOR_Transformer
max_seq_length  512
Using embedded DuckDB with persistence: data will be stored in: /Users/andi/localGPT/DB
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
[1]    78030 killed     python run_localGPT.py --device_type cpu
/Users/andi/miniconda3/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

freakynit commented 1 year ago

Same... process getting killed even with 16GB RAM instance (using cpu device_type)

xingmolu commented 1 year ago

Same...

wiseflat commented 1 year ago

Hi guys! Same thing here. I have a basic Mac mini M1 with 8GB RAM.

(env) me@mini localGPT % python ingest.py --device_type cpu
Loading documents from /Users/me/Projects/GPTLike/localGPT/localGPT/SOURCE_DOCUMENTS
Loaded 1 documents from /Users/me/Projects/GPTLike/localGPT/localGPT/SOURCE_DOCUMENTS
Split into 72 chunks of text
load INSTRUCTOR_Transformer
max_seq_length  512
Using embedded DuckDB with persistence: data will be stored in: /Users/me/Projects/GPTLike/localGPT/localGPT/DB

(env) me@mini localGPT % python run_localGPT.py
Running on: cuda
load INSTRUCTOR_Transformer
max_seq_length  512
Using embedded DuckDB with persistence: data will be stored in: /Users/me/Projects/GPTLike/localGPT/localGPT/DB
Downloading tokenizer.model: 100%|█████████████████████████████████████████████████████████████████████| 500k/500k [00:00<00:00, 15.2MB/s]
Downloading (…)cial_tokens_map.json: 100%|███████████████████████████████████████████████████████████████| 411/411 [00:00<00:00, 1.56MB/s]
Downloading (…)okenizer_config.json: 100%|███████████████████████████████████████████████████████████████| 715/715 [00:00<00:00, 11.0MB/s]
Downloading (…)lve/main/config.json: 100%|███████████████████████████████████████████████████████████████| 582/582 [00:00<00:00, 2.01MB/s]
Downloading (…)model.bin.index.json: 100%|███████████████████████████████████████████████████████████| 26.8k/26.8k [00:00<00:00, 40.3MB/s]
Downloading (…)l-00001-of-00002.bin: 100%|███████████████████████████████████████████████████████████| 9.98G/9.98G [02:06<00:00, 79.1MB/s]
Downloading (…)l-00002-of-00002.bin: 100%|███████████████████████████████████████████████████████████| 3.50G/3.50G [00:43<00:00, 80.6MB/s]
Downloading shards: 100%|███████████████████████████████████████████████████████████████████████████████████| 2/2 [02:50<00:00, 85.02s/it]
[1]    42799 killed     python run_localGPT.py
(env) me@mini localGPT % /opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

(env) me@mini localGPT % python run_localGPT.py
Running on: cuda
load INSTRUCTOR_Transformer
max_seq_length  512
Using embedded DuckDB with persistence: data will be stored in: /Users/me/Projects/GPTLike/localGPT/localGPT/DB
[1]    43034 killed     python run_localGPT.py
(env) mga@mini localGPT % /opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

johnm-starling commented 1 year ago

I had the same thing happen last night. There are tips in another issue that point out the primary cause, at least in my case: the memory required to load the shards exceeded my RAM and needed around 34GB of swap space, but my boot drive only had around 18GB free. Because the drive was low on storage, macOS couldn't allocate the swap and threw the error you are showing. After I freed up space so that around 40GB was available, it runs.

I had also seen anecdotal references to running 'conda update conda' and 'conda update --all' for the same error message, but neither of those was the actual cause in my case.

If you have Homebrew installed, try installing and running htop in a second terminal; you can watch the swap being consumed, and when the swap bar hits the right side you will see the error message.

[Screenshot: htop memory and swap meters, 2023-06-06]

Again, there can be several causes for this error, but this is a good place to start. For reference, I followed the steps in the README for the M1/M2 and ran inference with the device type set to mps.
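
A quick way to run the same check without keeping htop open is a small pre-flight script. This is only a sketch: it assumes psutil is installed, and the ~34GB figure is just the peak observed in this thread, not an official requirement.

# Pre-flight check before launching run_localGPT.py (sketch only, not part of the repo).
# Requires: pip install psutil. The 34 GiB target is the peak RAM+swap reported above.
import shutil
import psutil

GiB = 1024 ** 3
needed_gib = 34  # approximate peak while loading the checkpoint shards, per this thread

ram_gib = psutil.virtual_memory().total / GiB
swap_gib = psutil.swap_memory().total / GiB
disk_free_gib = shutil.disk_usage("/").free / GiB  # macOS grows swap files on the boot volume

print(f"RAM: {ram_gib:.1f} GiB, swap allocated: {swap_gib:.1f} GiB, free disk: {disk_free_gib:.1f} GiB")
if ram_gib + disk_free_gib < needed_gib:
    print("Probably not enough RAM plus room for swap; free up disk space before running.")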

jxmai commented 1 year ago

A config option was recently added that could help with this problem: https://github.com/PromtEngineer/localGPT/commit/6be35c95d932d448f523e3c095269e3f52f6678e

max_memory={0: "15GB"} # Uncomment this line if you encounter CUDA out of memory errors
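
For context, that line sits in the model-loading call in run_localGPT.py. Roughly, it is passed through transformers' from_pretrained as in the sketch below; the model id and the surrounding arguments are illustrative and may not match your checkout exactly.

# Sketch of where max_memory fits (arguments other than max_memory may differ
# from the current run_localGPT.py; the model id below is only an example).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/vicuna-7B-1.1-HF"  # example; use whatever run_localGPT.py loads

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    max_memory={0: "15GB"},  # cap device 0 at 15GB; accelerate offloads the rest
)

Capping max_memory makes accelerate offload the remaining layers, which is slower but can keep the process from being killed for exhausting memory.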