process.extractOne does not match fuzz.ratio

These are the docs of process.extract:

Select the best match in a list or dictionary of choices.

Find best matches in a list or dictionary of choices, return a list of tuples containing the match and its score. If a dictionary is used, also returns the key for each match.

Arguments:
    query: An object representing the thing we want to find.
    choices: An iterable or dictionary-like object containing choices to be matched against the query. Dictionary arguments of {key: value} pairs will attempt to match the query against each value.
    processor: Optional function of the form f(a) -> b, where a is the query or individual choice and b is the choice to be used in matching. This can be used to match against, say, the first element of a list: lambda x: x[0]. Defaults to fuzzywuzzy.utils.full_process().
    scorer: Optional function for scoring matches between the query and an individual processed choice. This should be a function of the form f(query, choice) -> int. By default, fuzz.WRatio() is used and expects both query and choice to be strings.
    limit: Optional maximum for the number of elements returned. Defaults to 5.

Returns:
    List of tuples containing the match and its score. If a list is used for choices, then the result will be 2-tuples. If a dictionary is used, then the result will be 3-tuples containing the key for each match. For example, searching for 'bird' in the dictionary {'bard': 'train', 'dog': 'man'} may return

    [('train', 22, 'bard'), ('man', 0, 'dog')]

The docs state that the default scorer for process.extract is fuzz.WRatio, which gives different results than fuzz.ratio. If you want to use fuzz.ratio, you can specify it with the scorer argument. In addition, fuzz.ratio does not preprocess strings before matching them, while process.extract preprocesses them by default using fuzzywuzzy.utils.full_process(). So if you want results that match fuzz.ratio, this preprocessing should also be disabled using the processor argument:

process.extract(stringToMatch, possibleResults, scorer=fuzz.ratio, processor=None)
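To illustrate why the preprocessing step alone can change a score, here is a rough standard-library sketch. The helpers simple_ratio and normalize are hypothetical stand-ins written for this example, not fuzzywuzzy functions: simple_ratio approximates a plain ratio scorer with difflib, and normalize only roughly imitates what full_process does (replace non-word characters with spaces, lowercase, strip). The real library may score slightly differently.

```python
import re
from difflib import SequenceMatcher


def simple_ratio(a, b):
    # Hypothetical stand-in for a plain ratio scorer: similarity scaled to 0..100.
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def normalize(s):
    # Hypothetical, rough imitation of default preprocessing:
    # non-word characters become spaces, then lowercase and strip.
    return re.sub(r"\W", " ", s).lower().strip()


query = "New York!"
choice = "new york"

# Scoring the raw strings: case and punctuation count as differences.
raw_score = simple_ratio(query, choice)

# Scoring after normalization: both strings become "new york".
processed_score = simple_ratio(normalize(query), normalize(choice))

print(raw_score, processed_score)
```

The raw comparison is penalized for the capital letters and the exclamation mark, while the normalized comparison sees two identical strings. This is the kind of gap you would expect between calling fuzz.ratio directly and calling process.extract with its default processor.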

Other process functions like process.extractOne use similar defaults.

Source: seatgeek/fuzzywuzzy, issue #288