Closed by banagale 1 year ago
In `fuzzywuzzy` there is both `extract` and `extractBests`, with the difference that `extractBests` has an additional `score_cutoff` parameter. In RapidFuzz I only have the `extract` function, which does provide the `score_cutoff` argument and so is equivalent to `extractBests`.
There are a couple of differences between RapidFuzz and `fuzzywuzzy`. In your specific case I assume you are using a function like `WRatio`, which defaults to `force_ascii=True`. So your strings are preprocessed using `utils.full_process(, force_ascii=True)`, which runs `str(sequence)`. This behaviour is not supported in `rapidfuzz`, so you will need to perform this conversion yourself. This can be done e.g. like this:
process.extract(query, choices, processor=str)
or in case you want to use the preprocessing function:
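from rapidfuzz import process, utils

def preprocess(seq):
    # Convert to str first (as fuzzywuzzy did), then apply the default preprocessing.
    return utils.default_process(str(seq))

process.extract(query, choices, processor=preprocess)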
Thank you for that feedback, Max!
I'll have another run at this and, if I run into difficulty, re-open this issue. I saw #333 and appreciate that; I had seen #26 and presumed the function previously existed but was obviated.
I am trying to drop-in replace a project that depends on the last version of `fuzzywuzzy` prior to the name change. This is needed after hitting this issue.
The project uses `process.extractBests`. I noticed that `rapidfuzz` does not include `process.extractBests`.
Is `process.extract` a drop-in replacement for that old function?
I tried using `process.extract` and realized that the project was relying on the `__str__` of objects passed into the `choices` argument being read. Later in the code, the variable is used like an object. (This allowed the dev to easily use the object and refer to the string for comparison.)
`rapidfuzz` does not seem to look at a given `__str__` for an object. Is this on purpose? Or perhaps FW should not have done this?
I mention the two above because I believe the goal is for the FW API to be fully available in RF. I do not know if the above use of the FW API was unusual or an anti-pattern though.