Creating a Test Dataset fails with IndexError: list index out of range

Issue

I'm trying to generate a simple test dataset with Ragas and keep getting IndexError: list index out of range. I'd like to understand whether I'm making a fundamental mistake. I usually use my own custom workflow to generate these datasets, but recently decided to explore Ragas's test set generation in addition to its evaluation features.

Ragas version: 0.2.3
Python version: 3.10

Code

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings

generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o")) 
generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())

from ragas.testset import TestsetGenerator

generator = TestsetGenerator(llm=generator_llm, 
                             embedding_model=generator_embeddings)
eval_dataset = generator.generate_with_langchain_docs(processed_docs[:3], 
                                                      testset_size=3)

Here processed_docs[:3] contains three short documents:

[Document(metadata={'title': 'Machine Learning', 'id': 1}, page_content='Machine learning is a field of artificial intelligence focused on enabling systems to learn patterns from data. Algorithms analyze past data to make predictions or classify information. Popular applications include recommendation systems and image recognition.'),
 Document(metadata={'title': 'Deep Learning', 'id': 2}, page_content='Deep learning is a subset of machine learning utilizing neural networks with many layers. It excels in complex tasks like image and speech recognition. Convolutional and recurrent neural networks are among the common architectures used.'),
 Document(metadata={'title': 'Natural Language Processing (NLP)', 'id': 3}, page_content='NLP is a branch of AI that enables computers to understand, interpret, and generate human language. Techniques include tokenization, stemming, and sentiment analysis. Applications range from chatbots to language translation services.')]
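
For reference, here is a rough sketch of how the size of each document could be summarised before passing it to the generator. The Doc dataclass and summarize_docs helper are hypothetical stand-ins I made up so the snippet runs without LangChain; they are not part of Ragas or LangChain. Whether document length is related to the error is only a guess on my part.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in exposing the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_docs(docs):
    """Return a (title, word_count) pair for each document."""
    return [(d.metadata.get("title"), len(d.page_content.split())) for d in docs]


# Illustrative inputs only: one short document and one empty document.
docs = [
    Doc("Machine learning is a field of artificial intelligence.",
        {"title": "Machine Learning", "id": 1}),
    Doc("", {"title": "Empty", "id": 2}),
]

summary = summarize_docs(docs)
print(summary)
```

The same helper should work on real Document objects, since it only reads page_content and metadata.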

Traceback

Generating Scenarios:   0%
 0/3 [00:00<?, ?it/s]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-34-5896c6d0aa07> in <cell line: 5>()
      3 generator = TestsetGenerator(llm=generator_llm, 
      4                              embedding_model=generator_embeddings)
----> 5 eval_dataset = generator.generate_with_langchain_docs(processed_docs[:3], 
      6                                                       testset_size=3)

18 frames
/usr/lib/python3.10/random.py in <listcomp>(.0)
    517                 floor = _floor
    518                 n += 0.0    # convert to float for a small speed improvement
--> 519                 return [population[floor(random() * n)] for i in _repeat(None, k)]
    520             try:
    521                 cum_weights = list(_accumulate(weights))

IndexError: list index out of range
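
The last frame above is the unweighted branch of random.choices in the standard library. As a minimal sketch, the same message can be reproduced by sampling from an empty list. This makes me suspect that something inside the generator is sampling from an empty collection, but that is an assumption and I have not confirmed it in the Ragas source. The sample_or_explain helper is hypothetical and exists only for this illustration.

```python
import random


def sample_or_explain(population, k):
    """Sample k items with replacement, or return the error text if sampling fails."""
    try:
        return random.choices(population, k=k)
    except IndexError as exc:
        # random.choices indexes population[0] even when the population is empty.
        return f"IndexError: {exc}"


empty_result = sample_or_explain([], k=3)          # empty population fails
ok_result = sample_or_explain(["a", "b"], k=3)     # non-empty population works

print(empty_result)
print(ok_result)
```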

I'd appreciate any insight into whether I'm using this feature incorrectly or whether there is a deeper issue here.
