Comment(s): Your code runs without throwing any errors. Well done!
Criteria 2: Implementation of Project Requirements
Score Level: 4/4 (Exceeds Expectations)
Comment(s): You implemented all of the necessary functions and the TextSample class. Good job!
Criteria 3: Software Architecture
Score Level: 4/4 (Exceeds Expectations)
Comment(s): Your code is grouped into functions appropriately.
Criteria 4: Uses Python Language Features
Score Level: 4/4 (Exceeds Expectations)
Comment(s): You consistently use Python language features where appropriate. Nice work!
Criteria 5: Produces Accurate Output
Score Level: 2/4 (Approaches Expectations)
Comment(s): There are issues with the frequency_comparison() function that result in the output being off. Let's take a look at a portion of this function:
    for key1 in table1.keys():
        for key2 in table2.keys():
            if key1 == key2:
                if table1[key1] < table2[key2]:
                    mutual_appearances += table1[key1]
                    appearances += table2[key2]
                else:
                    mutual_appearances += table2[key2]
                    appearances += table1[key1]
            elif key1 not in table2:
                appearances += table1[key1]
Consider what happens when key1 is in table1 but not in table2. The inner loop goes through every key in table2, and for each of these keys we check whether key1 == key2. Since key1 != key2 in this case, the elif branch runs and adds table1[key1] to the appearances count. Thus, for each key1 that is not in table2, we add table1[key1] to appearances many times (once for every key in table2, to be precise). However, for a key1 that appears in both table1 and table2, mutual_appearances and appearances only have table1[key1] or table2[key2] added to them once. As a result, keys that appear in one table but not the other are weighted much more heavily than keys that appear in both, throwing off the results.
To fix this, I would recommend getting rid of the inner loop (the 'for key2 in table2.keys():' line from the code above) and instead just checking whether each key1 from table1 is in table2, and then using a similar set of if/else statements to what you used above (note that if key1 is in both table1 and table2, then you can retrieve the number of appearances of key1 in each table with 'table1[key1]' and 'table2[key1]'). You will have to make similar modifications to the part of this function below the part pasted above as well (which loops through table2).
Also, in the get_average_sentence_length() function, I would recommend not rounding avg_words, as the extra precision could be helpful in determining differences between the various samples.
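As a rough illustration of the change recommended above for frequency_comparison(), here is a minimal sketch. It is not the project's reference solution. The function name frequency_comparison_sketch, the second loop (which only adds keys found in table2 alone), the empty-table guard, and the returned ratio mutual_appearances / appearances are all assumptions, since only part of the original function is quoted in this feedback.

```python
def frequency_comparison_sketch(table1, table2):
    """Sketch of the single-loop approach; name and return value are assumptions."""
    mutual_appearances = 0
    appearances = 0

    # One pass over table1, with a membership test instead of an inner loop.
    for key1 in table1:
        if key1 in table2:
            # Key is in both tables: the smaller count is "mutual",
            # the larger count goes into the total.
            if table1[key1] < table2[key1]:
                mutual_appearances += table1[key1]
                appearances += table2[key1]
            else:
                mutual_appearances += table2[key1]
                appearances += table1[key1]
        else:
            # Key is only in table1: counted exactly once.
            appearances += table1[key1]

    # Assumed shape of the second half: keys found only in table2.
    for key2 in table2:
        if key2 not in table1:
            appearances += table2[key2]

    # Assumed guard and return value (not shown in the quoted code).
    if appearances == 0:
        return 0.0
    return mutual_appearances / appearances


table1 = {"the": 2, "cat": 1}
table2 = {"the": 1, "dog": 3}
# Here mutual_appearances ends at 1 and appearances at 6.
similarity = frequency_comparison_sketch(table1, table2)
print(similarity)
```

Each key now contributes to the totals exactly once, regardless of how many keys the other table holds.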
Overall Score: 18/20 (Meets Expectations)
Well done! Most of your functions do exactly what they are supposed to. However, as noted above, there are problems with the frequency_comparison() function, and I would encourage you to take the time to make sure you understand why the existing code does not work properly and how it could be fixed. Also, good work testing much of your code (I noticed the various commented-out test lines still in there). I would encourage you to make your testing process even more thorough, though, as writing additional test cases could help you identify errors such as those in the frequency_comparison() function. It can also be useful to walk through each line of your functions by hand to think about how they would handle different inputs, as this can help you identify issues with the logic of the functions. Again, nice work with this project overall!
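To make the testing suggestion concrete, here is one possible small test case, sketched under assumptions. The wrapper function first_loop_appearances_original is hypothetical: it only reproduces the loop quoted earlier in this feedback and returns the appearances count, so that the result can be compared with a value worked out by hand on two tiny tables.

```python
def first_loop_appearances_original(table1, table2):
    """Hypothetical wrapper around the loop quoted in the feedback."""
    mutual_appearances = 0
    appearances = 0
    for key1 in table1.keys():
        for key2 in table2.keys():
            if key1 == key2:
                if table1[key1] < table2[key2]:
                    mutual_appearances += table1[key1]
                    appearances += table2[key2]
                else:
                    mutual_appearances += table2[key2]
                    appearances += table1[key1]
            elif key1 not in table2:
                appearances += table1[key1]
    return appearances


# Tiny inputs that are easy to trace by hand.
table1 = {"a": 2, "b": 1}
table2 = {"a": 1, "c": 3, "d": 4}

# By hand: "a" is in both tables, so the larger count (2) is added once;
# "b" is only in table1, so its count (1) should be added once.
expected_by_hand = 2 + 1

actual = first_loop_appearances_original(table1, table2)

# "b" is added once per key in table2 (three times), so the totals differ.
print("expected:", expected_by_hand, "actual:", actual)
```

A case like this, where one key is missing from the second table and the second table has several keys, is the kind of input that exposes the repeated addition.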
Thanks Karl for your quick feedback. Not surprised the frequency_comparison() was off. Just couldn’t think it through. Will note corrections and advice.
Rubric Score
Criteria 1: Valid Python Code
Criteria 2: Implementation of Project Requirements
Criteria 3: Software Architecture
Criteria 4: Uses Python Language Features
Criteria 5: Produces Accurate Output