ropensci / parzer

Parse geographic coordinates
https://docs.ropensci.org/parzer
Other
63 stars 6 forks source link

Cpp optimisation - passing by reference, using const, avoiding Rcpp functions in cpp code #44

Open AlbanSagouis opened 2 years ago

AlbanSagouis commented 2 years ago

Description

This is a general attempt at speeding up parzer by passing objects by reference and now preparing work with @mpadge, decreasing Rcpp functions use in cpp code and moving toward replacing Rcpp by cpp11

AlbanSagouis commented 1 year ago

Ready to merge @mpadge

Issues to solve when switching to cpp11:

AlbanSagouis commented 1 year ago

In this branch I kept old and new versions of many functions side by side in scripts (eg pz_parse_lat and pz_parse_lat_old). I used this to benchmark the effects of the changes made to the code. @mpadge Do we agree that I should delete all the old code before merging?

mpadge commented 1 year ago

In this branch I kept old and new versions of many functions side by side in scripts (eg pz_parse_lat and pz_parse_lat_old). I used this to benchmark the effects of the changes made to the code. @mpadge Do we agree that I should delete all the old code before merging?

Not necessarily. More importantly: It'd be good to document the benchmarking somewhere, so the old and new versions should be kept until that has been done. I suggest documenting in an issue, rather than a PR, because it's then easier to find at a later stage. The documentation should also be a full reprex starting with a gert::git_clone() call, and dumping gert::git_commit_id() (there is currently no way to use gert to checkout a specific commit, unfortunately).

The old code should then only be deleted once that's been done. And to make it easier to reproduce the results in the future, I'd suggest merging first to ensure the "old" versions are retained in the protected ("master") branch, rather than in a branch which we might later inadvertendly delete. Does that make sense?

AlbanSagouis commented 1 year ago

In this branch I kept old and new versions of many functions side by side in scripts (eg pz_parse_lat and pz_parse_lat_old). I used this to benchmark the effects of the changes made to the code. @mpadge Do we agree that I should delete all the old code before merging?

Not necessarily. More importantly: It'd be good to document the benchmarking somewhere, so the old and new versions should be kept until that has been done. I suggest documenting in an issue, rather than a PR, because it's then easier to find at a later stage. The documentation should also be a full reprex starting with a gert::git_clone() call, and dumping gert::git_commit_id() (there is currently no way to use gert to checkout a specific commit, unfortunately).

I create the issue #46 where I pasted reproducible code and results documenting the benchmarking. I did not use gert but directly built the package from the commit we are interested in:

remotes::install_github("albansagouis/parzer@2f8ede4c909929aacc491c1d7e9d63e94c7e5312")

microbenchmark::microbenchmark(...

@mpadge do you also think that this is reproducible?

mpadge commented 1 year ago

Yep, i was able to reproduce it. Thanks!