rwilson8 closed this issue 1 year ago
Sounds good. Are you interested in submitting a PR?
When I wrote this function years ago, I was considering mostly valid column names that simply used a different naming convention. But you're right, we're using this function a lot on Excel spreadsheets where the author never intended the variable names to be used in a database or programming language.
@wibeasley I tried, but it said it couldn't find the Master branch. (I assume because it's now called "main"). I guess I'll need you to do the update.
Yea, I'll take care of it. I posted this about the same time as your message above: https://github.com/OuhscBbmc/OuhscMunge/pull/128#issuecomment-1511937736
@rwilson8, I touched up some things in #130. I think they are all consistent with your goals. Tell me if not. Thanks again for thinking of these expanded use cases.
@wibeasley I tested it out with some of my messier column names, and it worked great. Thank you!
Is your feature request related to a problem? Please describe.
I regularly encounter variable names that OuhscMunge::snake_case(), and by extension OuhscMunge::column_rename_headstart(), can't handle, so I'm proposing new regexes. Here are the original function and a messy column name:
Describe the solution you'd like
Currently the function converts periods to underscores. I want it to convert all punctuation and spaces to underscores (except apostrophes, which should simply be dropped so that contractions don't get split up). Also, it currently converts 2 consecutive underscores into 1, and I want it to convert any number of consecutive underscores into 1 and then remove leading and trailing underscores. Here is my proposed alternative:
Line 1 removes apostrophes. Line 2 converts all remaining punctuation or space characters to underscores. Lines 3 and 4 are unchanged. Line 5 reduces any number of consecutive underscores to 1. Line 6 removes leading and trailing underscores.
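For readers without the original code block, the six steps described above could look roughly like the base R sketch below. This is not the code from the issue or from OuhscMunge: the function name `snake_case_sketch` is hypothetical, and since the thread doesn't show the unchanged lines 3 and 4, a simple camelCase split and `tolower()` are assumed as stand-ins.

```r
# Hypothetical sketch of the six steps; not the actual OuhscMunge::snake_case().
snake_case_sketch <- function(x) {
  s <- gsub("'", "", x, fixed = TRUE)            # 1: drop apostrophes so contractions stay whole
  s <- gsub("[[:punct:][:space:]]+", "_", s)     # 2: remaining punctuation/space -> underscore
  s <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", s)   # 3: ASSUMED stand-in: split camelCase boundaries
  s <- tolower(s)                                # 4: ASSUMED stand-in: lowercase everything
  s <- gsub("_+", "_", s)                        # 5: collapse any run of underscores to one
  s <- gsub("^_+|_+$", "", s)                    # 6: strip leading and trailing underscores
  s
}

# Made-up messy column names, for illustration only.
snake_case_sketch("  Patient's First.Name (2020)  ")  # "patients_first_name_2020"
snake_case_sketch("don't  know--why")                 # "dont_know_why"
```

Only the straight ASCII apostrophe is removed in step 1; a curly apostrophe pasted from Excel would need its own handling, which the thread doesn't discuss.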
Describe alternatives you've considered
snakecase::to_snake_case() and janitor::clean_names().
Additional context
N/A