explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset #873

Closed: JPonsa closed this issue 1 month ago

JPonsa commented 2 months ago

I got the error below while running generator.generate_with_langchain_docs. How do I set it up to use GPUs?

generator.generate_with_langchain_docs(docs, test_size=10, distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25})
embedding nodes:  31%|███       | 8/26 [01:04<01:03,  3.55s/it]Traceback (most recent call last):
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/logging/__init__.py", line 1110, in emit
    msg = self.format(record)
          ^^^^^^^^^^^^^^^^^^^
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/logging/__init__.py", line 953, in format
    return fmt.format(record)
           ^^^^^^^^^^^^^^^^^^
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/logging/__init__.py", line 687, in format
    record.message = record.getMessage()
                     ^^^^^^^^^^^^^^^^^^^
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/logging/__init__.py", line 377, in getMessage
    msg = msg % self.args
          ~~~~^~~~~~~~~~~
TypeError: not all arguments converted during string formatting
Call stack:
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/threading.py", line 995, in _bootstrap
    self._bootstrap_inner()
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/concurrent/futures/thread.py", line 83, in _worker
    work_item.run()
  File "/shared/ucl/apps/python/3.11.3/gnu-4.9.2/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/langchain_community/llms/huggingface_pipeline.py", line 267, in _generate
    responses = self.pipeline(
  File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/transformers/pipelines/text_generation.py", line 240, in __call__
    return super().__call__(text_inputs, **kwargs)
  File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/transformers/pipelines/base.py", line 1167, in __call__
    logger.warning_once(
  File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/transformers/utils/logging.py", line 329, in warning_once
    self.warning(*args, **kwargs)
Message: 'You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset'
Arguments: (<class 'UserWarning'>,)
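
For what it's worth, the logging failure itself is separate from the GPU question: per the "Message:"/"Arguments:" lines above, transformers' warning_once passes a stray UserWarning argument alongside a message with no %-placeholders. That TypeError can be reproduced with the standard library alone:

```python
import logging

# Reproduce the TypeError from the traceback: the warning message contains
# no %-style placeholders, yet a stray UserWarning argument is attached,
# so record.getMessage() fails inside logging's formatting step.
record = logging.LogRecord(
    name="transformers",
    level=logging.WARNING,
    pathname="pipeline.py",
    lineno=1,
    msg=(
        "You seem to be using the pipelines sequentially on GPU. "
        "In order to maximize efficiency please use a dataset"
    ),
    args=(UserWarning,),  # the extra argument seen in the traceback
    exc_info=None,
)

try:
    record.getMessage()
    failed = False
    message = ""
except TypeError as err:
    failed = True
    message = str(err)

print(failed, message)
```

So the "--- Logging error ---" block is only a broken warning, not the real failure; the underlying message is transformers telling you the pipeline is being called on one prompt at a time instead of a batched dataset.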
jjmachan commented 2 months ago

Hey @JPonsa, sorry about this.

What models are you using? By default we use OpenAI LLMs and embeddings - it seems like you have some transformers models instead.
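
For reference, one way to wire GPU-backed Hugging Face models into the generator might look like the sketch below. This is only a sketch under assumptions: it assumes ragas v0.1's TestsetGenerator.from_langchain constructor and langchain_community's HuggingFacePipeline/HuggingFaceEmbeddings wrappers, and the model names and docs variable are placeholders - check the ragas docs for the API of your installed version.

```python
# Hypothetical sketch: replace the default OpenAI models with GPU-backed
# Hugging Face models via LangChain wrappers. Model names are placeholders.
from langchain_community.llms import HuggingFacePipeline
from langchain_community.embeddings import HuggingFaceEmbeddings
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

llm = HuggingFacePipeline.from_model_id(
    model_id="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder model
    task="text-generation",
    device=0,  # put the pipeline on GPU 0
    pipeline_kwargs={"batch_size": 8},  # batch prompts instead of one-by-one
)
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cuda"},
)

generator = TestsetGenerator.from_langchain(
    generator_llm=llm,
    critic_llm=llm,
    embeddings=embeddings,
)
testset = generator.generate_with_langchain_docs(
    docs,  # placeholder: your list of LangChain Documents
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
```

Setting a batch_size on the pipeline is also what the transformers warning in the traceback is nudging toward, since it avoids feeding prompts to the GPU one at a time.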

JPonsa commented 2 months ago

Hi @jjmachan, I'm using vLLM. I don't remember which LLM exactly - either Mistral-7B or Llama-2-7B.

jjmachan commented 2 months ago

Did you get this fixed, @JPonsa?

Either way, I'll test it with vLLM shortly and write a test case for that too.

JPonsa commented 2 months ago

@jjmachan, not sure - probably, but I cannot test it due to https://github.com/explodinggradients/ragas/issues/871

jjmachan commented 2 months ago

Ohh, understood - I'll prioritize that one first then and get that sorted for you 🙂