joshyam-k closed 9 months ago
Nice. Thanks for the example. It runs correctly for me. I assume this gives a substantial speedup on the 35 million points in NV?
spxyext <- spxyext[!duplicated(spxyext[[xy.uniqueid]]), ]
instead of
spxyext <- unique(sf::st_join(sppltx, polyv))
probably helps too, even without parallelization?
Looks good to merge.
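For anyone reading along, here is a minimal sketch of how the two de-duplication approaches above behave, using a plain data frame as a stand-in for the joined sf object. The column names and values are made up for illustration; only `spxyext` and `xy.uniqueid` come from the snippet above. Note that the two forms are not strictly equivalent: `unique()` compares whole rows, while `duplicated()` on the id column keeps the first row per id.

```r
# Hypothetical stand-in for the joined table: point "p2" matched two polygons.
spxyext <- data.frame(
  uid  = c("p1", "p2", "p2", "p3"),
  poly = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)
xy.uniqueid <- "uid"

# unique() compares entire rows, so both "p2" rows survive (they differ in poly).
by_row <- unique(spxyext)

# duplicated() only inspects the id column and keeps the first row per id.
by_id <- spxyext[!duplicated(spxyext[[xy.uniqueid]]), ]

nrow(by_row)
nrow(by_id)
```

Checking a single column is also much cheaper than comparing every column of every row, which is presumably where the non-parallel speedup comes from. Whether the difference in results matters depends on whether a point can fall in more than one polygon in the real data.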
Above a million rows I was consistently seeing 5-10x speedups. And yes, reworking the removal of duplicate rows definitely speeds things up quite a bit, even in the non-parallel case.
Here's a reproducible example showing (at least in this case) that parallelizing doesn't change anything about the function's actual output; it only changes how we get there. I should also note that since the dataset here is only about 50 rows, the parallel method is actually a tick slower, as we'd expect.
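The kind of check described above can be illustrated with a toy version. This is not the package's code: `tag_points()` is a hypothetical per-chunk function standing in for the spatial join, and the chunking scheme and two-worker cluster are assumptions. It uses only the base `parallel` package.

```r
library(parallel)

# Hypothetical per-chunk work standing in for the spatial join.
tag_points <- function(chunk) {
  chunk$zone <- ifelse(chunk$x < 0.5, "west", "east")
  chunk
}

set.seed(1)
pts <- data.frame(id = seq_len(50), x = runif(50))

# Serial reference result.
serial <- tag_points(pts)

# Split into two contiguous chunks so row order is preserved after rbind.
chunks <- split(pts, cut(seq_len(nrow(pts)), 2, labels = FALSE))

cl <- makeCluster(2)
parts <- parLapply(cl, chunks, tag_points)
stopCluster(cl)

# Recombine and drop the chunk-prefixed row names that rbind adds.
parallel_out <- do.call(rbind, parts)
rownames(parallel_out) <- NULL

# Compare the two results.
same <- isTRUE(all.equal(serial, parallel_out))
same
```

With only 50 rows, the cost of starting the cluster and shipping chunks to workers outweighs any gain, which matches the small slowdown mentioned above.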