Closed elenasamuylova closed 1 month ago
Along with the `with_column` parameter, would the column we are getting the generated response from always be named the same? Or would we need it to resemble the `SemanticSimilarity` implementation, with two arbitrary columns being compared with `WordMatch`?
Add a New `WordMatch` Descriptor to Evidently

About Hacktoberfest contributions: https://github.com/evidentlyai/evidently/wiki/Hacktoberfest-2024
Description:
Evidently already has an `IncludesWords()` descriptor that checks if the text contains `any` (by default) or `all` specified words, returning a True/False result for each row. However, this descriptor uses a single shared list of words for all rows.

In some cases, such as when evaluating responses against specific ground truth answers, you may need a different list of words for each row. For example, you might want to check if generated responses contain the expected keywords for each row:
Example:
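As a rough illustration of the difference between a shared word list and a per-row word list, here is a plain-Python sketch. The rows, the `expected_words` field name, and the `tokens` helper are all invented for this illustration and are not part of Evidently.

```python
import re

# Hypothetical data: each response comes with its own expected keywords.
rows = [
    {"response": "Paris is the capital of France.",
     "expected_words": ["paris", "france"]},
    {"response": "Water boils at 100 degrees Celsius.",
     "expected_words": ["boils", "celsius"]},
]

def tokens(text):
    """Lowercase the text and return its alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# Shared list: the same words are checked against every row.
shared_words = ["paris", "france"]
shared_result = [
    all(w in tokens(r["response"]) for w in shared_words) for r in rows
]

# Per-row list: each row is checked against its own words.
per_row_result = [
    all(w in tokens(r["response"]) for w in r["expected_words"]) for r in rows
]

print(shared_result)   # [True, False]
print(per_row_result)  # [True, True]
```

With the shared list, the second row fails only because it is checked against keywords that belong to the first row.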
What to Implement:
The new `WordMatch()` descriptor should:
- Accept a `with_column` parameter: This column contains a list of words specific to each row.
- Accept a `lemmatize` parameter. Default `True`, to consider inflected and variant words. (Same as the `IncludesWords()` descriptor.)
- Check whether `any` or `all` words are present. (Same as the `IncludesWords()` descriptor.)

References:
- `IncludesWords` descriptor for vocabulary word check implementation.
- `SemanticSimilarity` descriptor and the `CustomPairColumnEval` template.
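
The requirements under "What to Implement" could be sketched in plain Python roughly as follows. This is only a minimal sketch of the per-row logic, not Evidently's descriptor API: the function name `word_match`, the `mode` parameter name, and the `naive_lemma` helper are assumptions made for illustration. In particular, `naive_lemma` is a crude stand-in that only strips a plural "s"; a real implementation would presumably reuse whatever lemmatizer `IncludesWords()` relies on.

```python
import re

def naive_lemma(word):
    """Crude stand-in for a lemmatizer (assumption): strip a plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def word_match(text, words, mode="any", lemmatize=True, lemmatizer=naive_lemma):
    """Return True if `text` contains any/all of the row-specific `words`."""
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    # When lemmatize is False, compare lowercase tokens as they are.
    normalize = lemmatizer if lemmatize else (lambda w: w)
    text_tokens = {normalize(t) for t in re.findall(r"[a-z0-9]+", text.lower())}
    targets = [normalize(w.lower()) for w in words]
    hits = [t in text_tokens for t in targets]
    return any(hits) if mode == "any" else all(hits)

# Hypothetical rows: the second value plays the role of the `with_column` data.
rows = [
    ("The cats sat on the mat", ["cat", "mat"]),
    ("Dogs bark loudly", ["cat", "dog"]),
]
results_all = [word_match(text, words, mode="all") for text, words in rows]
results_any = [word_match(text, words, mode="any") for text, words in rows]
```

Passing the lemmatizer as a callable keeps the matching logic independent of any particular NLP library, which makes the sketch easy to test before it is wired into a descriptor.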