Performance improvement for pad()

I was profiling some code that calls padr::pad() and found that span_all_groups() was using a significant chunk of the total time. This proposed change converts span_all_groups() to use purrr::map2 and tidyr::unnest() to accomplish the same thing. In my tests it is ~2x faster than the original span_all_groups() on my M1 Max Apple laptop and ~3x faster on the virtual Linux machine I use for work. It's a small absolute improvement on my Mac (<0.5sec) but a more noticeable improvement on my slower work Linux machine (~1.5sec), at least in my context, where I'm doing some data processing while loading a Shiny app.

A tradeoff is the additional imports of purrr::map2 and tidyr::unnest. I noticed that padr seems to stick to base R as much as possible, so I'll understand if you're reluctant to add the additional dependencies.

Unrelated, but I think the existing stop_int64() call isn't actually doing anything, since it's being called on the list output from split() and so it's looking at the classes of the individual list elements rather than the columns in the original data frame.

All existing tests in the package are passing with this change.

EdwinTh / padr

Performance improvement for pad() #89