marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License

Using functions to convert slot words into different forms #115

Closed guanqun-yang closed 2 years ago

guanqun-yang commented 2 years ago

I have some templates that look like the following

template = '{CITY1} is {MILE} miles from {CITY2} and {NumLessThan(MILE)} miles from {CITY3}.'

where

Any input is appreciated. Thanks in advance!

marcotcr commented 2 years ago

We don't support running a function that depends on the instantiation of the placeholders, but you can achieve the same effect by precomputing pairs or tuples with your function, and using that in the template. To use your example:

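import random
import numpy as np
import munch
from checklist.editor import Editor

editor = Editor()

def NumLessThanInt(num1):
    # pick a random integer in [0, 5] that is strictly less than num1
    return str(random.choice([x for x in range(6) if x < int(num1)]))

miles = [1, 2, 3, 4, 5]
# sample 1000 (orig, less) pairs, so the dependent value is fixed before templating
miles_pair = [munch.Munch({'orig': x, 'less': NumLessThanInt(x)}) for x in np.random.choice(miles, 1000)]
# city1, city2 and city3 are not passed in, so they are filled from the editor's lexicons
x = editor.template('{city1} is {mile.orig} miles from {city2} and {mile.less} miles from {city3}.', mile=miles_pair, nsamples=100)
x.data[:4]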

['Tampa is 4 miles from San Juan and 0 miles from Riverside.', 'Virginia Beach is 5 miles from St. Petersburg and 2 miles from Denver.', 'Corpus Christi is 4 miles from Saint Paul and 1 miles from Columbus.', 'Buffalo is 5 miles from Laredo and 0 miles from San Antonio.']

I put it in a Munch just because I think it's nice to call mile.orig, but it can also be a list or tuple:

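import random
import numpy as np
from checklist.editor import Editor

editor = Editor()

def NumLessThanInt(num1):
    # pick a random integer in [0, 5] that is strictly less than num1
    return str(random.choice([x for x in range(6) if x < int(num1)]))

miles = [1, 2, 3, 4, 5]
# each entry is [orig, less]; the template indexes into it with mile[0] and mile[1]
miles_pair = [[str(x), NumLessThanInt(x)] for x in np.random.choice(miles, 1000)]
x = editor.template('{city1} is {mile[0]} miles from {city2} and {mile[1]} miles from {city3}.', mile=miles_pair, nsamples=100)
x.data[:4]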

['Fort Worth is 4 miles from New York City and 2 miles from Richmond.', 'Atlanta is 3 miles from Glendale and 2 miles from Saint Paul.', 'Orlando is 1 miles from Santa Ana and 0 miles from Lubbock.', 'Glendale is 1 miles from Glendale and 0 miles from Tampa.']
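To see the precomputation idea on its own, without checklist installed, here is a minimal sketch that uses only the standard library. It is not how the editor works internally; it just relies on Python's `str.format` accepting the same dotted field syntax. The helper names `num_less_than` and `build_pairs` and the hard-coded city names are made up for illustration.

```python
import random
from types import SimpleNamespace

def num_less_than(n, rng):
    """Return a random integer in [0, n), i.e. strictly less than n (requires n >= 1)."""
    return rng.randrange(n)

def build_pairs(values, k, rng):
    """Sample k (orig, less) pairs so the dependent value is computed before templating."""
    pairs = []
    for _ in range(k):
        orig = rng.choice(values)
        pairs.append(SimpleNamespace(orig=orig, less=num_less_than(orig, rng)))
    return pairs

rng = random.Random(0)  # seeded only so repeated runs give the same pairs
pairs = build_pairs([1, 2, 3, 4, 5], 1000, rng)

# One slot carries both numbers, so they can never be sampled inconsistently.
template = '{city1} is {mile.orig} miles from {city2} and {mile.less} miles from {city3}.'
sentence = template.format(city1='Tampa', city2='Denver', city3='Buffalo', mile=pairs[0])
print(sentence)
```

The point is that the constraint (less < orig) is enforced once, when the pairs are built, so the template itself never needs to call a function.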