Open katherineanthony opened 4 months ago
HI, I can code review this one!
Proposed Enhancements
Case Sensitivity:
Implement functionality to treat words with different cases as the same (e.g., "Blah" and "blah" should be considered identical, with only "blah" being outputted).
Ignoring Trailing and Leading Spaces:
Add functionality to ignore leading and trailing spaces (e.g., " blah" and "blah" should be treated as the same, with only "blah" being outputted).
Test Input and Observations
Using the input below with the -w flag, a user trap error is encountered, suggesting the program does not correctly parse blank spaces:
blah blah blah
test
test
test
test
Blah
blah
Inefficient and Potentially Problematic Sorting Algorithm
The use of bubble sort may be inefficient for large datasets, and its implementation does not account for potential issues with duplicate or very similar strings, which could lead to an incorrect sort order in edge cases.
Identified Issues
Memory Management:
In the lines() and words() functions, memory is allocated but not freed, potentially leading to memory leaks.
Error Handling in File Operations:
There is no explicit error handling after file operations like open or close. Failure to open files or read/write errors could lead to undefined behavior or crashes.
Logic in print_uniq():
The comparison logic assumes the existence of lines[curr + 1] without verifying that curr + 1 < count. This could lead to accessing out-of-bounds memory, resulting in undefined behavior or crashes.
Increased Size of File Input Text:
"I cannot tell if it is just my machine, but when I double the lines of README.md, trying to read more lines or words, the program seems to enter an infinite loop and never finds the EOF. Also, when adding spaces to certain lines and words, execution tends to lead to undefined behavior."
Example Test Input That
test file
test
testing this file
test file
test file
blah blah blah
blah
hello
joe
bat
okay
what
are
we
doing
here
again
Overall
"I had a lot of fun playing with the code and program. Thanks for allowing me to peer-review the code. Also, please let me know if I misinterpreted or misjudged something from the program's implementation."
Hi, Hana and I ( Andrew ) will code review this project!
Hi, great job on your project! A few suggestions we (Andrew and Hana) would suggest are:
Memory Management: In the lines
function, memory allocated with malloc
for line
is immediately freed after being added to the lines
array. Instead, I suggest you free the memory only after you're done using the allocated strings, which should be after print_uniq
has processed them.
Efficiency Concerns: The use of bubble sort for sorting lines or words is not the most efficient choice, especially for larger datasets. Consider implementing or using a more efficient sorting algorithm, like quicksort or mergesort, to improve performance.
Logic Error: The loop in print_uniq
does not correctly handle the last unique word or line since it relies on comparing the current item with the next one, which doesn't work for the last element.
Incorrect Continuation in Loop in print_uniq
: The else if
condition checking for "\0"
and "\n"
uses continue
without incrementing curr
, leading to an infinite loop.
Overall, I enjoyed using the command and thought you did a great job.
@malensek Vincent and I have completed these code corrections. Thank you!
Vinnie and I completed the implementation for uniq as follows: