Closed ziptron closed 11 months ago
Hi,
I never tried to run it on Google Colab, but 15GB should be enough for this model - I can run it locally on a 10GB VRAM card (with half of the layers offloaded to CPU). If you are still stuck, do you mind posting the model section of your config.yaml and I will try to reproduce it?
Thanks for responding. I do think this may be a Colab issue, so I'll keep trying today and post results later.
By the way, stupid question, how do you know how many "layers" there are? I've been fiddling with the n_gpu_layers parameter, but I cannot quite understand what that means. Does 50 mean 50% (half), or is that a unit of layers? If you could point me towards some info on that I'd much appreciate it.
Thanks!
It is the absolute number of layers and depends on the actual model architecture. When the model is loaded (with llama.cpp in this case), the number appears in the log (see the attached screenshot).
So in the example below, the model consists of 43 layers, and 15 were offloaded to the GPU. You can then check VRAM usage and adjust n_gpu_layers accordingly. You may need more memory than currently stated, depending on the context length and the embedding model used (which in most cases also runs on the GPU).
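As a rough starting point, you can estimate n_gpu_layers by dividing the VRAM you are willing to spend by the approximate per-layer size. This is a hypothetical back-of-the-envelope sketch, not part of the project: the function name, sizes, and the overhead figure are all assumptions.

```python
# Back-of-the-envelope starting point for n_gpu_layers.
# All numbers here are assumptions: actual per-layer size varies with
# quantization, and extra VRAM is consumed by the KV cache (grows with
# context length) and by the embedding model.
def estimate_n_gpu_layers(vram_gb: float, n_layers: int,
                          model_size_gb: float,
                          overhead_gb: float = 1.5) -> int:
    per_layer_gb = model_size_gb / n_layers      # rough per-layer cost
    usable_gb = max(vram_gb - overhead_gb, 0.0)  # reserve for KV cache etc.
    return min(n_layers, int(usable_gb / per_layer_gb))

# e.g. a ~9 GB, 43-layer GGML file on a 10 GB card:
print(estimate_n_gpu_layers(10, 43, 9))  # → 40
```

Start from an estimate like this, watch actual VRAM usage on the first load, and adjust downward if you hit out-of-memory errors.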
This screenshot made me realize that I am not offloading anything to the GPU. See mine below.
I had some errors while installing (see below). Should I try to resolve these errors you think? Or is there a different way to diagnose why I'm not offloading to the GPU?
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests==2.27.1, but you have requests 2.29.0 which is incompatible.
tensorflow 2.12.0 requires protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3, but you have protobuf 3.20.2 which is incompatible.
tensorflow-metadata 1.13.1 requires protobuf<5,>=3.20.3, but you have protobuf 3.20.2 which is incompatible.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.0.0 which is incompatible.
torchdata 0.6.1 requires torch==2.0.1, but you have torch 2.0.0 which is incompatible.
torchtext 0.15.2 requires torch==2.0.1, but you have torch 2.0.0 which is incompatible.
Successfully installed InstructorEmbedding-1.0.1 XlsxWriter-3.1.2 accelerate-0.19.0 argilla-1.13.3 auto-gptq-0.3.0 backoff-2.2.1 bitsandbytes-0.41.0 chromadb-0.3.26 clickhouse-connect-0.6.8 coloredlogs-15.0.1 cryptography-41.0.2 dataclasses-json-0.5.14 datasets-2.14.2 deprecated-1.2.14 dill-0.3.7 diskcache-5.6.1 einops-0.6.1 fastapi-0.95.1 filetype-1.2.0 gitdb-4.0.10 gitpython-3.1.32 h11-0.14.0 hnswlib-0.7.0 httpcore-0.16.3 httptools-0.6.0 httpx-0.23.3 huggingface-hub-0.16.4 humanfriendly-10.0 langchain-0.0.219 langchainplus-sdk-0.0.20 llama-cpp-python-0.1.77 llama-index-0.6.9 llmsearch-0.1.dev74+g7207a16.d20230801 loguru-0.7.0 lz4-4.3.2 marshmallow-3.20.1 monotonic-1.6 msg-parser-1.2.0 multiprocess-0.70.15 mypy-extensions-1.0.0 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 olefile-0.46 onnxruntime-1.15.1 openai-0.27.8 openapi-schema-pydantic-1.2.4 overrides-7.3.1 pdf2image-1.16.3 pdfminer.six-20221105 peft-0.4.0 posthog-3.0.1 protobuf-3.20.2 pulsar-client-3.2.0 pydeck-0.8.1b0 pympler-1.0.1 pymupdf-1.22.5 pypandoc-1.11 pypdf2-3.0.1 python-docx-0.8.11 python-dotenv-1.0.0 python-magic-0.4.27 python-pptx-0.6.21 pytz-deprecation-shim-0.1.0.post0 requests-2.29.0 rfc3986-1.5.0 rouge-1.0.1 safetensors-0.3.1 sentence-transformers-2.2.2 sentencepiece-0.1.99 smmap-5.0.0 sqlalchemy-1.4.48 starlette-0.26.1 streamlit-1.24.1 threadpoolctl-3.1.0 tiktoken-0.3.3 tokenizers-0.13.3 torch-2.0.0 torchvision-0.15.1 transformers-4.29.2 typer-0.7.0 typing-inspect-0.9.0 tzdata-2023.3 tzlocal-4.3.1 unstructured-0.7.8 uvicorn-0.23.2 uvloop-0.17.0 validators-0.20.0 watchdog-3.0.0 watchfiles-0.19.0 websockets-11.0.3 xxhash-3.3.0 zstandard-0.21.0
WARNING: The following packages were previously imported in this runtime:
[google]
You must restart the runtime in order to use newly installed versions.
Sorry that you are facing problems.
It looks like llama.cpp was built without GPU support during installation, which is why you don't see it in the output. I will need to investigate how to enable it in the Colab environment.
On a local GPU-enabled computer, assuming all the prerequisites are installed, llamacpp needs the flags described in https://github.com/ggerganov/llama.cpp#cublas in order to build with GPU support.
In this repository, these flags are set using setvars.sh before installation (this is also described in the README).
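In a notebook, the same build flags can also be set from Python before invoking pip, as an alternative to the %env magic. This is only a sketch: the flags are the cuBLAS options from the llama.cpp README linked above, and the pip command is shown as a comment because it has to run in its own notebook cell.

```python
import os

# Ask the llama-cpp-python build to enable cuBLAS (GPU) support.
os.environ["CMAKE_ARGS"] = "-DLLAMA_CUBLAS=on"
os.environ["FORCE_CMAKE"] = "1"

# Then rebuild from source so the flags take effect, e.g. in a notebook cell:
# !pip install --force-reinstall --no-cache-dir llama-cpp-python
```

If the flags were picked up, the llama.cpp load log should mention the GPU backend and report layers being offloaded.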
I've created a demo notebook on how to run it on Google Colab (free tier) - https://github.com/snexus/llm-search/blob/main/notebooks/llmsearch_google_colab_demo.ipynb
Wow thanks so much! I tried this out this morning and it works well! I may not have been setting the variables (below) correctly, or at all to be honest.
%env CMAKE_ARGS="-DLLAMA_CUBLAS=on"
%env FORCE_CMAKE=1
Thanks for making this project and for your help.
I am running this in Colab with their free tier GPU (15GB), using WizardLM-13B-1.0.ggmlv3.q5_K_S.bin.
I have been testing this out by generating some random PDFs from Wikipedia articles. I can parse about 50 pdfs and create an index in less than a minute. I then run the 'Interact' part and it quickly loads up the "Enter Question >>" prompt. I can then ask a question, and it seems to start compiling the chain. However, afterwards nothing happens.
The prompt below successfully finds the PDF of (https://en.wikipedia.org/wiki/Olive_Edis) in my docs folder and starts putting the prompt together, but then nothing happens.
My GPU usage remains low (2GB/15GB) and I can wait 30 minutes or longer and nothing else happens.
Any hints on how to diagnose this? What should I expect to happen next?
---- Edit ----
This may be a resource issue with Google Colab. I'm now trying to run different code altogether, and it's also getting stuck at 2GB of GPU usage without actually outputting a result. I will try this again tomorrow.
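One way to confirm whether any layers actually reached the GPU is to search the captured llama.cpp load log for its offload summary. A minimal sketch, with the caveat that the helper name is made up and the exact log wording varies between llama.cpp versions, so the pattern may need adjusting for your build:

```python
import re

# Hypothetical helper: scan captured llama.cpp load output for the
# layer-offload summary. Assumes a line of the form
#   "llama_model_load_internal: offloading 15 layers to GPU"
# (older builds) or "... offloaded 15/43 layers ..." (newer builds).
OFFLOAD_RE = re.compile(r"offload(?:ing|ed)\s+(\d+)")

def offloaded_layers(log_text: str) -> int:
    """Return the number of layers reported as offloaded, or 0 if none."""
    m = OFFLOAD_RE.search(log_text)
    return int(m.group(1)) if m else 0

sample = "llama_model_load_internal: offloading 15 layers to GPU"
print(offloaded_layers(sample))  # → 15 for this sample line
```

If this returns 0 on your real log, the build most likely has no GPU support and needs to be reinstalled with the cuBLAS flags set.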