Closed brunopistone closed 5 months ago
Hey @brunopistone, thanks for bringing this to our notice. We will raise a fix ASAP, but feel free to raise a PR if you like to contribute :)
Hello @shahules786, I opened PR #300, which contains the fixes and adds compatibility with Amazon API Gateway.
What is the situation with this now? At the very least, test set generation returns:
```python
DataRow = namedtuple(
    "DataRow",
    [
        "question",
        "ground_truth_context",
        "ground_truth",
        "question_type",
        "episode_done",
    ],
)
```
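For anyone hitting this mismatch in the meantime, here is a minimal sketch (plain pandas, not a ragas API; the target column names in `COLUMN_MAP` are an assumption that depends on your ragas version) of renaming the generated columns to the names the metrics expect:

```python
import pandas as pd

# Assumed mapping from the generated test-set columns to the
# evaluation schema; adjust to the column names your ragas
# version actually expects.
COLUMN_MAP = {
    "ground_truth_context": "contexts",
    "ground_truth": "ground_truths",
}

def align_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Rename generated test-set columns to the evaluation schema."""
    return df.rename(columns=COLUMN_MAP)

generated = pd.DataFrame({
    "question": ["What is Network Analysis?"],
    "ground_truth_context": [["a retrieved passage"]],
    "ground_truth": ["a reference answer"],
    "question_type": ["simple"],
    "episode_done": [True],
})
aligned = align_columns(generated)
print(list(aligned.columns))
```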
which does not match the columns that the metrics expect.
It's working:
```python
from datasets import Dataset
import pandas as pd

csv_file_path = 'eval.csv'
df = pd.read_csv(csv_file_path)
df['contexts'] = df['contexts'].apply(lambda x: eval(x) if isinstance(x, str) else [])

data_dict = {
    'question': df['question'].tolist(),
    'contexts': df['contexts'].tolist(),
    'answer': df['answer'].tolist(),
    'evolution_type': df['evolution_type'].tolist(),
    'episode_done': df['episode_done'].tolist()
}

fiqa_dataset = Dataset.from_dict(data_dict)
print(fiqa_dataset)
```
I tried @pankssid's way of converting my own DataFrame to a dict and then to a Dataset. It's still not working.
My dict:

```python
{'question': ['What is the purpose of Network Analysis?'],
'ground_truths': ['Network Analysis is conducted to understand connections and distances between data points by arranging data in a network structure.'],
'contexts': [["list of potentially relevant individuals. In the ''free list'' approach, they are asked to recall individuals without seeing a list (Butts 2008, p.20f).\\n* '''Data Analysis''': When it comes to analyzing the gathered data, there are different network properties that researchers are interested in in accordance with their research questions. The analysis may be qualitative as well as quantitative, focusing either on the structure and quality of connections or on their quantity and values. (Marin & Wellman 2010, p.16; Butts 2008, p.21f). The analysis can focus on \\n** the quantity and quality of ties that connect to individual nodes\\n** the similarity between different nodes, or\\n** the structure of the network as a whole in terms of density, average connection length and strength or network composition.\\n* An important element of the analysis is not just the creation of quantitative or qualitative insights, but also the '''visual representation''' of the network. For",
'Analysis gained even more traction through the increasing application in fields such as geography, economics and linguistics. Sociologists engaging with Social Network Analysis remained to come from different fields and topical backgrounds after that. Two major research areas today are community studies and interorganisational relations (Scott 1988; Borgatti et al. 2009). However, since Social Network Analysis allows to assess many kinds of complex interaction between entities, it has also come to use in fields such as ecology to identify and analyze trophic networks, in computer science, as well as in epidemiology (Stattner & Vidot 2011, p.8).\\n\\n\\n== What the method does ==\\n"Social network analysis is neither a theory nor a methodology. Rather, it is a perspective or a paradigm." (Marin & Wellman 2010, p.17) It subsumes a broad variety of methodological approaches; the fundamental ideas will be presented hereinafter.\\n\\nSocial Network Analysis is based on',
'style="width: 33%"| \\\'\\\'\\\'[[:Category:Past|Past]]\\\'\\\'\\\' || style="width: 33%"| \\\'\\\'\\\'[[:Category:Present|Present]]\\\'\\\'\\\' || [[:Category:Future|Future]]\\n|}\\n<br/>__NOTOC__\\n<br/>\\n\\n\\\'\\\'\\\'In short:\\\'\\\'\\\' Social Network Analysis visualises social interactions as a network and analyzes the quality and quantity of connections and structures within this network.\\n\\n== Background ==\\n[[File:Scopus Results Social Network Analysis.png|400px|thumb|right|\\\'\\\'\\\'SCOPUS hits per year for Social Network Analysis until 2019.\\\'\\\'\\\' Search terms: \\\'Social Network Analysis\\\' in Title, Abstract, Keywords. Source: own.]]\\n\\n\\\'\\\'\\\'One of the originators of Network Analysis was German philosopher and sociologist Georg Simmel\\\'\\\'\\\'. His work around the year 1900 highlighted the importance of social relations when understanding social systems, rather than focusing on individual units. He argued "against understanding society as a mass of individuals who']],
'answer': [' Network Analysis is a method to visual and analyze social interactions, such as connections between individuals, to answer research questions in fields such as sociology, ecology, computer science, and more. It can be qualitative or quantitative, and can focus on the structure, quality, or quantity of connections. It can also be visual, to make the network and connections more understanding.']}
```
Still got this error:
ValueError: Dataset feature "ground_truths" should be of type Sequence[string], got <class 'datasets.features.features.Value'>
@stepkurniawan `ground_truth` is expecting a list of strings as its data type, similar to `contexts`. I believe you need to convert your `ground_truths` column in the DataFrame to a list object. Something like `DF['ground_truths'] = DF['ground_truths'].apply(lambda x: [x])`. Then it will work like a charm.
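A minimal sketch of that fix (column name taken from the error above): wrapping each ground-truth string in a one-element list turns the column into a list of strings, i.e. `Sequence[string]` rather than `Value`:

```python
import pandas as pd

# Wrap each ground-truth string in a one-element list; the guard on
# isinstance keeps already-converted rows unchanged.
df = pd.DataFrame(
    {"ground_truths": ["Network Analysis arranges data in a network structure."]}
)
df["ground_truths"] = df["ground_truths"].apply(
    lambda x: [x] if isinstance(x, str) else x
)
print(df["ground_truths"].iloc[0])
```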
I thought I was just being lazy about reading the documentation, since it mentions synthetic data generation and the next step covers evaluation but never refers back to the generated dataset.

Can we consider moving "Generate a Synthetic Test Set" from the Get Started page to, let's say, Core Concepts until this is fixed? I guess more folks may get confused when reading it: it feels like the evaluation is a continuation of the synthetic generation.
Just commenting that I found the same thing confusing 👆
Hey @mtharrison let me go through this thread today and raise a fix by EOD.
Bros @mtharrison @koshyviv @stepkurniawan @brunopistone Does this image make sense to you? Is it good enough to add to the docs? (My drawing skills are subpar, as you can see, haha.)
@shahules786 this image does very much answer the questions that I had yes! Thank you.
I took a look at your PR too. I think it's an improvement, however I have a couple of other points:

- The generated test set includes a `contexts` field. It is unclear to me what these could be useful for if the intended `contexts` are to come from my RAG pipeline. This could be a point of confusion for others too? Perhaps the point of including `contexts` is so that you can simply take a generated test set and run it through `evaluate()` to quickly be able to understand what the library does, which I think in itself is somewhat useful for people new to Ragas.
- The generated test set has no `answer`. If the aim of the tutorial is to have a quick way to generate and evaluate a synthetic dataset, I would:
  - make the `answer` independent from the `ground_truth` so that the dataset can feed right into `evaluate()`
  - document generating your own `answer` and `context` for a real-world evaluation dataset
  - remove `contexts` from the generated dataset

I think having `contexts` but not `answer` in the generated test set is what caused confusion for me.
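For the quick-start flow discussed here, one sketch (column names assumed from this thread, not taken from the library) is to copy the generated `ground_truth` into `answer` so the rows already match what `evaluate()` expects:

```python
# Hypothetical generated rows; copying ground_truth into answer is
# only a shortcut for exercising evaluate() quickly, not a real
# evaluation of a RAG pipeline.
generated = {
    "question": ["What is Network Analysis?"],
    "contexts": [["a retrieved passage"]],
    "ground_truth": ["a reference answer"],
}
generated["answer"] = list(generated["ground_truth"])
print(sorted(generated))
```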
Also, I think if you were to add to that diagram that the `question` and `ground_truth` could alternatively be manually written (or supplemented) by people with domain knowledge of your data, it would explain 100% what this library does 😄
Something like this:
> Bros @mtharrison @koshyviv @stepkurniawan @brunopistone Does this image make sense for you? Is it good enough to add to docs (my drawing skills are subpar as you can see, haha)
Your image skills are great! 😄
Just want to add to the discussion that, from a newcomer's perspective, the execution flow can still be improved. For example, I run the test-generation script, which gives its output as a CSV file. Then I prepare to run the evaluation script, but it expects a Dataset instead.
I think if we had a flow like the one below (which I feel was the primary motivation behind this issue), users would be able to realize the benefits of the library quite quickly.

I'm not sure if I am making complete sense, but the disconnect between the inputs/outputs of test generation and evaluation was the primary concern for me.
Currently, I'm using @pankssid's and others' version to convert the CSV into a Dataset:
```python
import pandas as pd
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_precision,
    context_recall,
)
from langchain.chat_models import ChatOpenAI  # import path may vary by langchain version

# `rag` and `langchain_embeddings` come from my own setup elsewhere.

def get_file_dataset(path="test.csv"):
    df = pd.read_csv(path)
    df['contexts'] = df['contexts'].apply(lambda x: eval(x) if isinstance(x, str) else [])
    data_dict = {
        'question': df['question'].tolist(),
        'ground_truth': df['ground_truth'].astype(str).tolist(),
        'contexts': df['contexts'].tolist(),
        'evolution_type': df['evolution_type'].tolist(),
        'episode_done': df['episode_done'].tolist()
    }
    custom_dataset = Dataset.from_dict(data_dict)
    return custom_dataset, data_dict

def generate_responses(test_questions, test_answers):
    answers = []
    contexts = []
    for q in test_questions:
        res = rag.ask_chain(q)
        answer = res["response"]
        context = res["docs"]
        print(f"Question: {q}\n Answer: {answer}")
        answers.append(answer)
        contexts.append(context)
    dataset_dict = {
        "question": test_questions,
        "answer": answers,
        "contexts": contexts,
    }
    if test_answers is not None:
        dataset_dict["ground_truth"] = test_answers
    ds = Dataset.from_dict(dataset_dict)
    return ds

def get_dataset():
    _, ddict = get_file_dataset()
    ds = generate_responses(ddict['question'], ddict['ground_truth'])
    return ds

result = evaluate(
    get_dataset(),
    metrics=[
        answer_relevancy,
        faithfulness,
        context_precision,
        context_recall,
    ],
    llm=ChatOpenAI(model="gpt-3.5-turbo"),
    embeddings=langchain_embeddings
)
```
@koshyviv you can call `to_dataset()` on the generation output to get a dataset:

```python
testset = generator.generate_with_langchain_docs(
    data,
    test_size=50,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
dataset = testset.to_dataset()
```
Thanks! I'm still exploring the library and this helps. This could be a good addition to the Getting Started section.
Hey @mtharrison @koshyviv, thanks for your input. I am with you on this 100%. @mtharrison, may I use your image for the ragas docs?
@shahules786 sure thing!
Guys, I just updated the docs with these changes; hope that helps. If not, feel free to reopen the issue. https://docs.ragas.io/en/latest/
**Describe the bug**
I'm following the documentation for creating a synthetic dataset here. The generated dataset contains the following columns: `['question', 'context', 'answer', 'question_type', 'episode_done']`.

The evaluation requires:

I'm expecting that your library provides all the steps for both data generation and evaluation, but it seems quite inconsistent.

Ragas version: 0.0.20
Python version: 3.10

**Code to Reproduce**
Just follow your documentation.

**Error trace**
```
ValueError: Dataset feature "contexts" should be of type Sequence[string], got <class 'datasets.features.features.Value'>
```

**Expected behavior**
The expected behaviour is that: