Chalarangelo / 30-seconds-of-python

Short Python code snippets for all your development needs
https://www.30secondsofcode.org/python/p/1
Creative Commons Attribution 4.0 International
8.83k stars 1.26k forks source link

Several suggestions to improve 'words' #280

Closed learnbyexample closed 3 years ago

learnbyexample commented 3 years ago

Link: https://www.30secondsofcode.org/python/s/words. Existing snippet:

def words(str, pattern = '[^a-zA-Z-]+'):
    return list(filter(None, re.split(pattern, str)))
  1. Use re.findall instead of re.split and filter. This will also change to positive character class instead of negated one.
def words(str, pattern = '[a-zA-Z-]+'):
    return re.findall(pattern, str)
  1. str is a built-in function and raw-strings are preferred to define regular expression patterns
def words(ipstr, pattern = r'[a-zA-Z-]+'):
    return re.findall(pattern, ipstr)
  1. The description isn't precise. Hyphen is missing from "defaults to non-alpha as a regexp".
>>> def words(ipstr, pattern = r'[a-zA-Z-]+'):
...     return re.findall(pattern, ipstr)
... 
>>> words('good -123 -baz in-built foo-')
['good', '-', '-baz', 'in-built', 'foo-']
  1. You could add another example to show use of second argument, as well as illustrate how word boundaries can be useful. Note the use of raw-string, with normal strings, you'll need '\\b[a-zA-Z-]+\\b'
>>> words('good -123 -baz in-built foo-', r'\b[a-zA-Z-]+\b')
['good', 'baz', 'in-built', 'foo']
Trinityyi commented 3 years ago
  1. Yes, this seems like a valid improvement.
  2. Correct. s is more appropriate as per the guidelines and other snippet conventions, instead of ipstr.
  3. True, it could bedefaults to non-alpha and hyphens as regexp. (or alphanumeric and hypens if the regexp is swapped.
  4. Definitely worth doing that.

Overall, the suggestions are all solid. Could you open a PR for this, if you have the time? Remember to tag Fixes #280 in the description.

learnbyexample commented 3 years ago

Could you open a PR for this

https://github.com/30-seconds/30-seconds-of-python/blob/master/CONTRIBUTING.md says npm is needed, which I'm not willing to install/learn. I do not mind someone else who is familiar with this project to make the changes.