Closed JiggsUK closed 5 years ago
Okay! I created a new branch for us to work in.
I don't know regex super well, but the current one searches the user input for [\d\W] and if it finds any matches, it spits out the error message.
\d matches any digit. \W matches anything that is NOT a letter, a digit, or an underscore.
Thoughts on how to ignore (not match) the space and underscore characters?
I didn't realize \W would not match to à! Learned something new today.
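For reference, here is a small sketch of how the [\d\W] check treats the three words from the issue. It assumes Python 3 with a str pattern, where \w (and therefore \W) is Unicode-aware. The helper name is_flagged is made up for illustration and is not taken from main.py.

```python
import re

# The pattern described above: any digit, or any non-word character.
current_pattern = re.compile(r"[\d\W]")


def is_flagged(word):
    """Return True if the current check would show the error message."""
    return current_pattern.search(word) is not None


# "\u00e0" is the precomposed letter à.
for word in ["\u00e0bbey", "hello_world", "hig rise", "abc1"]:
    print(repr(word), is_flagged(word))
```

Under those assumptions only the space and the digit are flagged: à counts as a letter and _ counts as a word character, so both slip past the check.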
Here's a fun cheat sheet: https://www.rexegg.com/regex-quickstart.html I assume we want to reject everything except standard English letters, i.e., A through Z.
word = 'hello_world'
if not word.isalpha(): print("Enter a word. Letters only.")
Good shout, have you looked at which letters are included in an .isalpha search?
Would the 3 words on the issue pass?
And now I know what 'good shout' means!
We could use the ^ character inside the regex pattern to limit matches to standard English letters.
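One reading of the ^ idea is a negated character class. A minimal sketch, assuming we only want A through Z in either case; the function name is hypothetical.

```python
import re

# Inside [...], a leading ^ negates the class: this matches any single
# character that is NOT an ASCII letter.
not_english_letter = re.compile(r"[^A-Za-z]")


def is_english_letters_only(word):
    """True when word is non-empty and contains only A-Z / a-z."""
    return bool(word) and not_english_letter.search(word) is None
```

With this check all three words from the issue would be rejected, since à, _ and the space are each outside A through Z.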
Two side questions:
Should we allow a user to use a space or underscore character to represent a blank tile? Some word games have those.
Should we be checking that what the user entered is indeed a word? For example, if the user enters 'definately' instead of 'definitely' (or intentionally types gibberish) should we still display the score, or an error message saying that their input is not a word?
Okay... Something like this passes everything except special characters.
Should pass with:
1) hello_world
2) hig rise
Should not pass with:
1) àbbey
word_3 = 'hig rise'
word_3 = word_3.replace(' ', '').replace('_', '')  # remove the allowed blank characters before checking
if not word_3.isalpha(): print("Enter a word. You cannot use numbers or special characters.")
Should we allow a user to use a space or underscore character to represent a blank tile? Some word games have those.
As MrDayKwan suggests, we might want to consider changing what the blank character is in the dictionary. A space is hard to account for, but a symbol would be easier.
.isalpha() ensures that all characters are A through Z.
It's not just the English alphabet, it accepts other Latin characters too. àbbey fails because the à is not in our character dictionary. From the regex docs:
Perhaps we need a more specific regex, after we have decided what should represent a blank tile.
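To illustrate the .isalpha() point: in Python 3, str.isalpha() is true for any Unicode letter, so it accepts à on its own. Below is a hedged sketch of one way to narrow the check to A through Z without a regex. It assumes Python 3.7 or later for str.isascii(), and the function name is an assumption.

```python
def is_ascii_letters_only(word):
    """True only when every character is an ASCII letter (A-Z or a-z)."""
    # isascii() rejects characters such as à; isalpha() rejects digits,
    # spaces, underscores and the empty string.
    return word.isascii() and word.isalpha()


print("\u00e0bbey".isalpha())               # à is a Unicode letter
print(is_ascii_letters_only("\u00e0bbey"))
print(is_ascii_letters_only("abbey"))
```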
Should we be checking that what the user entered is indeed a word? For example, if the user enters definately instead of 'definitely' (or intentionally types gibberish) should we still display the score, or an error message saying that their input is not a word?
This would be a great development option, but I think it is too much for right now.
I would put in a vote for allowing _ or * as a user substitution for a blank tile. What do you guys think?
This would be a great development option, but I think it is too much for right now.
Hehe, I'm too stubborn for that! Does the following stuff make sense? The first block of code could be put at the start of the file, and the elif statement inside the primary while loop near the end.
# open the .txt file of official scrabble words from 2015 using the 'read' status
base_word_text_file = open("Collins Scrabble Words (2015).txt", "r")
# check to make sure the file mode is read, then take each row from the text file and make it a list item
# using split on new lines to separate the text on each row, read the text, then add it to the base_word_list list
if base_word_text_file.mode == 'r':
    base_word_list = base_word_text_file.read().split('\n')
base_word_text_file.close()
# elif statement to validate that input is a correctly spelled English word
# We use the base_word_list created from our .txt file
elif user_input.upper() not in base_word_list:
    print(f'Sorry, {user_input} may be spelled incorrectly. Try again.')
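A possible variation on the block above, as a sketch only: a with statement closes the file even if reading fails, and a set keeps the "not in" lookup fast for a large word list. The function names are hypothetical, the file is assumed to hold one word per line, and the utf-8 encoding is a guess.

```python
def build_word_set(lines):
    """Turn an iterable of lines into a set of upper-case words."""
    return {line.strip().upper() for line in lines if line.strip()}


def load_word_set(path):
    """Read a one-word-per-line text file into a set."""
    # The with block closes the file automatically.
    with open(path, "r", encoding="utf-8") as word_file:
        return build_word_set(word_file)


def is_known_word(user_input, word_set):
    """True if the input, upper-cased, appears in the word set."""
    return user_input.upper() in word_set
```

For example, load_word_set("Collins Scrabble Words (2015).txt") could stand in for the open/read/close lines, and the elif would become "elif not is_known_word(user_input, word_set):".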
@shyamcody has written a regex bit using re.compile('[A-Za-z]') to check the input string. This seems to be working effectively for the use cases highlighted here. Just a heads-up for the group.
Nice @MrDayKwan! Feel free to push it to the regex issue branch if you want. I think we will also find it much easier to deal with if we put the .upper() on the end of the user input variable. Then we won't need to remember to convert it to upper case every time we want to use it.
Edit: nakulkd beat me to it but here's the full code suggestion from @shyamcody:
import re
character_regex = re.compile('[a-zA-Z]')
def check_diff_characters(user_input):
    if character_regex.search(user_input):
        return True
    else:
        return False
What do you think? I think it'll work great, we just need to add an extra bit for the blank tile - which I would vote for a * as the value
Sorry for the mishap, but I sent the wrong file to JiggsUK. The original patch of code is:
import re
character_regex = re.compile('[a-zA-Z]')
def check_diff_characters(user_input):
    extra_characters = character_regex.sub(r'', user_input)
    if len(extra_characters) == 0:
        return False
    else:
        return True
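The difference between the two versions can be seen with a quick sketch. The function names below are invented to tell them apart; the bodies follow the two snippets posted above.

```python
import re

character_regex = re.compile("[a-zA-Z]")


def check_with_search(user_input):
    # First version: True as soon as ANY letter is found.
    return character_regex.search(user_input) is not None


def check_with_sub(user_input):
    # Corrected version: delete every letter, then see if anything is left.
    return len(character_regex.sub("", user_input)) > 0


for word in ["hello", "hig rise", "\u00e0bbey", "123"]:
    print(repr(word), check_with_search(word), check_with_sub(word))
```

The search version reports True for a plain 'hello' because it contains letters, while the sub version reports True only when something other than an A through Z letter remains.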
Alright, using @shyamcody's regex and modifying it to include * should do the trick! It's in the main.py file now. Also added a block to account for * characters in the validation for 'is user input actually a word'.
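The updated code in main.py is not shown in this thread, so the following is only a guess at its shape: a sketch that allows * as the blank tile and treats it as "any one letter" when checking against the word list. All names here are hypothetical.

```python
import re

# Anything other than an ASCII letter or the * blank tile is invalid.
invalid_character = re.compile(r"[^A-Za-z*]")


def has_invalid_characters(user_input):
    """True if the input contains something other than letters or *."""
    return invalid_character.search(user_input) is not None


def matches_with_blanks(user_input, word_set):
    """True if some word in word_set fits, with each * standing for one letter."""
    pattern = re.compile(
        "".join("[A-Z]" if ch == "*" else re.escape(ch) for ch in user_input.upper())
    )
    return any(pattern.fullmatch(word) for word in word_set)
```

Scanning the whole word list for every input with a blank is slow for a large list, but it keeps the sketch short.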
OK, so it looks like we have got there with this issue. The scenarios at the top all pass - or rather catch an exception as they should - I believe the issue is resolved so I will close this. Well done everyone, thank you for your input!
When you run the program, try entering these words:
àbbey
hello_world
hig rise
What happens? What could we do to fix this?
Just a quick note: webdotorg and saiyencoder are 6 hours behind the rest of you so try and make sure you allow them to have some input if you can