Closed boogheta closed 6 months ago
When testing with no query, we can see there are matches such as ".", "PI.", "Rev.", "Prob.", "(cons.", etc.
I'm wondering whether we should discard, before embedding, all phrases that are too short, for instance those with fewer than 8 characters.
cc @jimenaRL
Done, with a new parameter in the corpus config: "min_num_characters".
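For illustration, here is a minimal sketch of how such a filter might be applied before embedding. Only the parameter name "min_num_characters" comes from this thread. The function name `filter_short_phrases`, the dict-shaped config, and the choice to strip whitespace and keep phrases whose length is at least the threshold are assumptions, not the project's actual implementation.

```python
def filter_short_phrases(phrases, min_num_characters=8):
    """Keep phrases whose stripped length is at least min_num_characters.

    Hypothetical helper: the comparison (>=) and the whitespace stripping
    are assumptions about how the threshold is meant to be applied.
    """
    return [p for p in phrases if len(p.strip()) >= min_num_characters]


# Assumed config shape; the real corpus config format may differ.
corpus_config = {"min_num_characters": 8}

phrases = [".", "PI.", "Rev.", "Prob.", "(cons.", "A full sentence worth embedding."]
kept = filter_short_phrases(phrases, corpus_config["min_num_characters"])
print(kept)
```

With a threshold of 8, the short fragments quoted in the issue are dropped and only the full sentence would be passed on to the embedding step.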