`semi_join(x, y, by = "z")` is faster than `filter(x, z %in% y$z)`. The exception is when several filters are applied at once: there, a single call to `filter()` is often faster than `filter() %>% semi_join()`.
Roughly a hundred places were faster after switching to `semi_join()` or `anti_join()`. I also added this to the speed tips, but I am not planning to enforce it via tests because of the exception.
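
For reviewers, here is a minimal sketch of the kind of replacement involved. The data frames `x` and `y`, the `id` column, and the variable names are hypothetical and only illustrate the pattern; they are not taken from the changed code, and the snippet makes no claim about timings.

```r
library(dplyr)

# Hypothetical example data; x, y, and the key column z mirror the names used above.
x <- tibble(id = 1:6, z = c("a", "b", "c", "a", "d", "b"))
y <- tibble(z = c("a", "b"))

# Before: keep the rows of x whose z appears in y$z.
kept_filter <- filter(x, z %in% y$z)

# After: the same rows via a filtering join.
kept_semi <- semi_join(x, y, by = "z")

# The negated form maps to anti_join().
dropped_filter <- filter(x, !z %in% y$z)
dropped_anti <- anti_join(x, y, by = "z")

# The exception: with several conditions, one filter() call
# may be preferable to chaining filter() and semi_join().
combined <- filter(x, id > 2, z %in% y$z)
chained <- x %>%
  filter(id > 2) %>%
  semi_join(y, by = "z")
```

In this sketch both forms of each pair select the same rows, so the swap is about speed and not about behavior.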
Based on #1068