Closed: elizagrames closed this issue 5 years ago
Token-based matching is really slow; there has to be an existing matching algorithm that can deduplicate the results much more quickly.
Deduplication should also be reciprocal if a token approach is used (if record A matches record B, then B must also match A), though tokens are probably not a good approach in the first place; see the sketch below.
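A minimal sketch of a faster, symmetric alternative: compute pairwise edit distances on normalized titles with base R's `adist()` and drop any record whose title falls within a small distance threshold of an earlier kept record. The data frame `refs`, the `title` column, and the `0.1` threshold are illustrative assumptions, not part of this thread.

```r
# Sketch of symmetric, distance-based deduplication (assumptions:
# a data frame `refs` with a `title` column; threshold is illustrative).
dedupe_titles <- function(refs, threshold = 0.1) {
  titles <- tolower(gsub("[[:punct:]]", "", refs$title))
  # adist() returns a symmetric matrix of edit distances, so matching
  # is reciprocal by construction (d[i, j] == d[j, i]).
  d <- adist(titles) / outer(nchar(titles), nchar(titles), pmax)
  keep <- rep(TRUE, length(titles))
  for (i in seq_along(titles)[-1]) {
    # Drop record i if it is within the threshold of any earlier kept record.
    earlier <- seq_len(i - 1)
    if (any(d[i, earlier][keep[earlier]] < threshold)) {
      keep[i] <- FALSE
    }
  }
  refs[keep, , drop = FALSE]
}
```

Because the distance matrix is symmetric, reciprocity comes for free. The full n × n matrix is still quadratic in memory, so for very large result sets a blocking or sorted-neighborhood pass before comparison would be the next step.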
This is now covered by the synthesisr package.