run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
33.11k stars 4.61k forks

[Bug]: 'await' used outside function #14136

Open erlebach opened 2 weeks ago

erlebach commented 2 weeks ago

Bug Description

I am trying to use one of the llama_datasets found at https://llamahub.ai/l/llama_datasets/Evaluating%20LLM%20Survey%20Paper%20Dataset?from=llama_datasets. I tried running the code provided:

from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex

# download and install dependencies for benchmark dataset
rag_dataset, documents = download_llama_dataset(
    "EvaluatingLlmSurveyPaperDataset", "./data"
)

# build basic RAG system
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

# evaluate using the RagEvaluatorPack
RagEvaluatorPack = download_llama_pack(
    "RagEvaluatorPack", "./rag_evaluator_pack"
)
rag_evaluator_pack = RagEvaluatorPack(
    rag_dataset=rag_dataset, query_engine=query_engine
)

benchmark_df = await rag_evaluator_pack.arun(
    batch_size=20,  # batches the number of openai api calls to make
    sleep_time_in_seconds=1,  # seconds to sleep before making an api call
)

and get the error:

 File "...../Building-Data-Driven-Applications-with-LlamaIndex/page_by_page_summaries/datasets/evaluating_llms/hub_evaluator.py", line 29
    benchmark_df = await rag_evaluator_pack.arun(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: 'await' outside function

Do you expect this provided code to run as is? If not, perhaps more explanation is necessary? Thanks.

(I also posted this message on the LlamaIndex github.)

Version

10.x

Steps to Reproduce

Run the script.

Relevant Logs/Tracebacks

None.
dosubot[bot] commented 2 weeks ago

The error you're encountering, 'await' used outside function, occurs because the await keyword can only be used inside an asynchronous function. To fix this, you need to wrap the asynchronous call in an async function. Here is the corrected code:

from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex

async def main():
    # download and install dependencies for benchmark dataset
    rag_dataset, documents = download_llama_dataset(
        "EvaluatingLlmSurveyPaperDataset", "./data"
    )

    # build basic RAG system
    index = VectorStoreIndex.from_documents(documents=documents)
    query_engine = index.as_query_engine()

    # evaluate using the RagEvaluatorPack
    RagEvaluatorPack = download_llama_pack(
        "RagEvaluatorPack", "./rag_evaluator_pack"
    )
    rag_evaluator_pack = RagEvaluatorPack(
        rag_dataset=rag_dataset, query_engine=query_engine
    )

    benchmark_df = await rag_evaluator_pack.arun(
        batch_size=20,  # batches the number of openai api calls to make
        sleep_time_in_seconds=1,  # seconds to sleep before making an api call
    )
    print(benchmark_df)

# Run the main function
import asyncio
asyncio.run(main())

This code defines an async function main() where the await keyword is used, and then runs this function using asyncio.run(main()). This should resolve the SyntaxError you encountered.
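The rule Dosu describes is a language-level one, not anything specific to LlamaIndex, and it can be reproduced with nothing but the standard library. A minimal sketch (the `arun` name here is just a placeholder, not the real pack method):

```python
# 'await' is only legal inside an async function, so compiling a
# top-level 'await' raises SyntaxError before anything even runs.
try:
    compile("df = await arun()", "<string>", "exec")
    raised = False
except SyntaxError:
    raised = True

# The same statement inside an 'async def' compiles without error.
compile("async def main():\n    df = await arun()", "<string>", "exec")

print(raised)  # → True
```

This is why the error appears immediately as a SyntaxError rather than at the point the coroutine would execute.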

To continue talking to Dosu, mention @dosu.

logan-markewich commented 2 weeks ago

In Python, `await` only works inside another async function or in a notebook cell

This specific example probably assumed one of those

Dosu gave a great answer above
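A side note on the notebook case (my framing, not something stated in the thread): in a notebook the event loop is already running, which is exactly why top-level `await` is allowed there, and also why you cannot simply call `asyncio.run()` from inside already-running async code. A small stdlib sketch:

```python
import asyncio

async def inside_loop():
    # Here the event loop is already running (as it is in a notebook),
    # so a nested asyncio.run() raises RuntimeError.
    coro = asyncio.sleep(0)
    try:
        asyncio.run(coro)
        return None
    except RuntimeError as e:
        coro.close()  # silence the "never awaited" warning
        return str(e)

msg = asyncio.run(inside_loop())
print(msg)  # the error message mentions the running event loop
```

So in a plain script, wrap the calls in `async def main()` and run it once with `asyncio.run(main())`; in a notebook, just `await` the coroutine directly in a cell.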

erlebach commented 2 weeks ago

Thank you, Logan, and Dosu.

erlebach commented 2 weeks ago

Here is a related question. I am running the RagEvaluator pack, and at 85% it hit a rate-limit error. The pack uses OpenAI, and I want to change the model; I assume I have to go into the pack itself to do that. I notice that the RagEvaluator class allows specifying an LLM, which I will handle. At the moment I use an embedding model from HuggingFace, but the RagEvaluator constructor does not allow specifying an embedder. I would suggest changing the example code to allow it. Just my two cents.
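One possible workaround, sketched under two assumptions I have not verified against the pack's internals: that the downloaded pack accepts a `judge_llm` argument, and that the index built for the query engine picks up the global `Settings.embed_model`. Treat both as assumptions, not confirmed API:

```python
# Sketch (unverified): set a HuggingFace embedding model globally
# before building the index, and pass a judge LLM to the pack.
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI

# Global override picked up by VectorStoreIndex.from_documents(...)
# (model name is just an example, swap in your own).
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

rag_evaluator_pack = RagEvaluatorPack(
    rag_dataset=rag_dataset,
    query_engine=query_engine,
    judge_llm=OpenAI(model="gpt-4"),  # assumed parameter name
)
```

If the constructor rejects `judge_llm`, the downloaded pack source lives in `./rag_evaluator_pack`, so it can be edited locally.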