Comment(s): Great job writing all valid Python code without any errors or issues! I tested each of your functions and your class, and in every case received valid output and no translation errors.
Criteria 2: Implementation of Project Requirements
Score Level: 3/4
Comment(s): Nice job implementing each of the required functions and the class for this project! Hopefully this programming "game" highlighted some of the ways that you might use Python for data analysis and discovery. I did want to give some feedback on a few things listed below.
get_average_sentence_length
It looks like your results are a bit off from the ones I got when I tested this function. I wasn't able to pin down the reason for the difference, but I am guessing it has something to do with the second-to-last line of code in your implementation. You are correct that we end up with something extra because of the period, but I think that affects the number of sentences rather than the number of words. Anyway, here is the code I used for the test output along with the results I get with my implementation.
print(get_average_sentence_length("Have a nice day. Out of the three suspects, I think Gregg is the murderer? Any catlover must be a murderer!"))
print("-------------")
print(get_average_sentence_length(murder_note))
print(get_average_sentence_length(lily_trebuchet_intro))
print(get_average_sentence_length(myrtle_beech_intro))
print(get_average_sentence_length(gregg_t_fishy_intro))
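For the extra sentence issue, one option is the built-in filter. Here is a minimal sketch rather than the definitive answer: it assumes that ".", "!" and "?" all end a sentence and that an empty text should return 0.0, both of which are my assumptions and not something the project text spells out.

```python
def get_average_sentence_length(text):
    # Assumption: '!' and '?' end a sentence just like '.', so normalize them first.
    normalized = text.replace("!", ".").replace("?", ".")
    # Splitting on '.' leaves an empty string after the final period;
    # filter(None, ...) drops every empty (falsy) piece.
    sentences = list(filter(None, (piece.strip() for piece in normalized.split("."))))
    if not sentences:
        return 0.0  # assumption: no sentences means an average of 0.0
    total_words = sum(len(sentence.split()) for sentence in sentences)
    return total_words / len(sentences)
```

The point is that the empty trailing piece is removed from the sentence count, while the word count is left alone.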
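frequency_comparison
When you are checking your 2nd table, don't forget that we need to use the value of that key for our number of appearances. It looks like you are only looking at the key.
Also, I thought I would provide my implementation as well for learning purposes.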
def frequency_comparison2(table1, table2):
    appearances = 0
    mutual_appearances = 0
    for key1 in table1:
        if key1 in table2:
            # Shared word: the overlap is the smaller count, the total is the larger.
            mutual_appearances += min(table1[key1], table2[key1])
            appearances += max(table1[key1], table2[key1])
        else:
            # Word only in table1: add its count (the value), not just the key.
            appearances += table1[key1]
    for key2 in table2:
        if key2 not in table1:
            # Word only in table2: again, add the value stored under the key.
            appearances += table2[key2]
    return mutual_appearances / appearances
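If you are curious, the same tallies can be written more compactly with collections.Counter, whose & operator keeps the per-key minimum and whose | operator keeps the per-key maximum. This is only a sketch: the name frequency_comparison_counter is hypothetical, and returning 0.0 for two empty tables is my assumption, not a project requirement.

```python
from collections import Counter


def frequency_comparison_counter(table1, table2):
    counts1, counts2 = Counter(table1), Counter(table2)
    # '&' keeps min(count1, count2) for words that appear in both tables.
    mutual_appearances = sum((counts1 & counts2).values())
    # '|' keeps max(count1, count2) over every word in either table.
    appearances = sum((counts1 | counts2).values())
    if appearances == 0:
        return 0.0  # assumption: two empty tables compare as 0.0
    return mutual_appearances / appearances
```

For example, with {"a": 2, "b": 1} and {"a": 1, "c": 3} the mutual count is 1 and the total is 2 + 1 + 3 = 6, so both versions return 1/6.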
percent_difference
I would recommend implementing this function on the assumption that the passed-in values are just numbers, and then accessing the average sentence length from each TextSample when you call the function. Obviously either approach produces the same result, but in general it's better to implement functions that are more generic and don't require knowledge of what is being passed in.
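To illustrate the idea, here is a rough sketch. The formula (absolute difference divided by the mean of the two values) is my assumption about what the project asks for, and the TextSample below is a hypothetical stand-in with a single average_sentence_length attribute, not your actual class.

```python
def percent_difference(value1, value2):
    # Works on plain numbers; it knows nothing about TextSample.
    mean = (value1 + value2) / 2
    if mean == 0:
        return 0.0  # assumption: treat two zero values as no difference
    return abs(value1 - value2) / mean


class TextSample:
    """Hypothetical stand-in holding only the attribute used below."""

    def __init__(self, average_sentence_length):
        self.average_sentence_length = average_sentence_length


note = TextSample(8)
suspect = TextSample(12)
# The caller pulls the numbers out of each sample before calling the function.
difference = percent_difference(note.average_sentence_length,
                                suspect.average_sentence_length)
```

Written this way, the same function can compare any two numeric measurements, not just sentence lengths.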
Everything else from a project requirements perspective looks great! Be sure to check out the review section below (Criteria 4) for some recommendations and feedback on how you might further improve your Python skills.
Criteria 3: Software Architecture
Score Level: 4/4
Comment(s): Though the project requirements instructed you to break apart your implementation into separate functions and classes, I'm glad to see that you followed through on this requirement! Hopefully this basic software architectural approach provided you with some guidance on how to structure code in the future. Imagine if you had been asked to solve this problem in one fell swoop; it would certainly have been much more challenging. Your implementation would most likely also have been much less extensible, maintainable, and verifiable (e.g. the ability to test small chunks of functionality). Please keep this in mind as you continue using Python in the future!
Criteria 4: Uses Python Language Features
Score Level: 4/4
Comment(s): Very nice job taking advantage of many built-in language functions and features! Since there is always room for improvement, I have shared a few recommendations and feedback on your implementation below.
Great job using triple-quoted strings! They offer more safety and flexibility in that they allow for multi-line strings.
You could shorten your prepare_text method somewhat by taking advantage of the join method, like this.
def prepare_text(text):
    lower_text = text.lower()
    punctuation = ".,!?"
    no_punctuation = ''.join(letter for letter in lower_text if letter not in punctuation)
    return no_punctuation.split()
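Another option, if you want to avoid the character-by-character loop entirely, is str.translate with a table built by str.maketrans. This is just a sketch; the name prepare_text_translate is hypothetical, and it strips the same four punctuation characters as above.

```python
def prepare_text_translate(text):
    # The third argument to str.maketrans lists characters to delete.
    remove_punctuation = str.maketrans("", "", ".,!?")
    return text.lower().translate(remove_punctuation).split()
```

Both versions return the same list of lowercase words; which one reads better is largely a matter of taste.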
Your build_frequency_table seemed overly complex with all the looping going on. I think you may have just overthought this one, but consider this more straightforward implementation.
def build_frequency_table(corpus):
    word_count_frequency = {}
    for word in corpus:
        if word_count_frequency.get(word):
            word_count_frequency[word] += 1
        else:
            word_count_frequency[word] = 1
    return word_count_frequency
We just iterate through the words and either add the unique word to our collection or increment the count.
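If you would rather lean on the standard library, collections.Counter does this same tallying for you. A small sketch, with the hypothetical name build_frequency_table_counter so it does not clash with your own function:

```python
from collections import Counter


def build_frequency_table_counter(corpus):
    # Counter tallies each word in one pass; dict() turns it back into a plain dictionary.
    return dict(Counter(corpus))
```

Writing the loop by hand is still worth doing once, since it shows exactly what Counter is doing for you.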
Criteria 5: Produces Accurate Output
Score Level: 4/4
Comment(s): Great job correctly identifying the murderer! Despite some of the minor issues above, the overall impact on your results was marginal.
Overall Score: 19/20
Overall you did a fantastic job on this project! I can tell that you have learned a lot from the PWP course and hope you continue on with your Python journey.
I appreciate that feedback. It seems I quickly fall back into old habits and don't use the full capability of Python, so your code suggestions are really helpful.
Keep up the great work!