hkarp1 / messy.cats

cat_join deleting entries with no match #4

Open abhennessy99 opened 2 years ago

abhennessy99 commented 2 years ago

If a value has no match below the threshold in clean_vector, cat_join deletes it from the output and replaces it with NA. To reproduce:

data("clean_caterpillars")
data("messy_caterpillars")

cat_join(messy_df = messy_caterpillars,
         clean_df = clean_caterpillars,
         by = c("CaterpillarSpecies", "species"),
         method = "jaccard",
         threshold = 0.49,
         join = "full")
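For reference, here is a rough base R sketch of the behavior I would expect, where an unmatched messy value is kept as its own join key instead of being dropped. This is not the messy.cats implementation: `bigrams`, `jaccard_dist`, and `match_or_keep` are hypothetical helpers made up for illustration, the Jaccard distance is computed on character bigrams (an assumption), and treating the threshold as an inclusive upper bound on distance is also an assumption.

```r
# Hypothetical illustration only; none of these helpers exist in messy.cats.

# Split a string into its set of lowercase character bigrams.
bigrams <- function(s) {
  s <- tolower(s)
  n <- nchar(s)
  if (n < 2) return(s)
  unique(substring(s, 1:(n - 1), 2:n))
}

# Jaccard distance between the bigram sets of two strings (assumed metric).
jaccard_dist <- function(a, b) {
  A <- bigrams(a)
  B <- bigrams(b)
  1 - length(intersect(A, B)) / length(union(A, B))
}

# For each messy value, use the closest clean value as the join key when its
# distance is at or below the threshold; otherwise keep the messy value itself
# so the row survives a later join.
match_or_keep <- function(messy, clean, threshold) {
  out <- lapply(messy, function(m) {
    d <- vapply(clean, function(cl) jaccard_dist(m, cl), numeric(1))
    best <- which.min(d)
    matched <- d[[best]] <= threshold
    data.frame(
      messy = m,
      match = if (matched) clean[[best]] else NA_character_,
      key = if (matched) clean[[best]] else m,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, out)
}

# Made-up example values, not taken from the package data sets.
messy <- c("Monarch butterfy", "zzzz")
clean <- c("Monarch butterfly", "Swallowtail")
res <- match_or_keep(messy, clean, threshold = 0.49)
print(res)
```

The `match` column still shows NA where nothing was close enough, but `key` retains the original messy value, so a full join on `key` would not lose that entry.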

abhennessy99 commented 1 year ago

Added this to the testing document; will try to fix it tomorrow.