thecurz closed this 2 months ago
The changes enhance the `PythonManipulator` class in `fileManipulator.ts` by adding methods for removing Python comments and docstrings. Two new methods, `removeDocStrings` and `removeHashComments`, handle multi-line docstrings and single-line comments, respectively. The existing `removeComments` method is modified to call these new methods. Additionally, the test cases in `fileManipulator.test.ts` are expanded to cover various edge cases related to comment and string handling.
| Files | Change Summary |
|---|---|
| `src/core/file/fileManipulator.ts` | - Added: `removeDocStrings(content: string): string`<br>- Added: `removeHashComments(content: string): string`<br>- Modified: `removeComments(content: string): string` to include calls to the new methods. |
| `tests/core/file/fileManipulator.test.ts` | - Updated test case names for clarity.<br>- Expanded test cases for comment and docstring removal, covering various edge cases including nested quotes and mixed comment types. |
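To make the composition described above concrete, here is a minimal sketch of how `removeComments` might chain the two new methods. The function bodies are simplified, hypothetical stand-ins and are not the merged implementation: the docstring pass only removes triple-quoted strings that begin a line, and the hash pass assumes no string literal spans multiple lines.

```typescript
// Hypothetical, simplified stand-ins for the methods named in this PR.
// They illustrate the call structure only, NOT the merged implementation.

/** Removes triple-quoted strings that start a line (statement-level docstrings). */
function removeDocStrings(content: string): string {
  const docString = /^[ \t]*(?:"""[\s\S]*?"""|'''[\s\S]*?''')[ \t]*\r?\n?/gm;
  return content.replace(docString, "");
}

/** Cuts each line at the first `#` that is not inside a quoted string. */
function removeHashComments(content: string): string {
  return content
    .split("\n")
    .map((line) => {
      let quote: string | null = null; // the open quote character, if any
      for (let i = 0; i < line.length; i++) {
        const ch = line.charAt(i);
        if (quote !== null) {
          if (ch === "\\") {
            i++; // skip the escaped character
          } else if (ch === quote) {
            quote = null; // string literal closed
          }
        } else if (ch === '"' || ch === "'") {
          quote = ch; // string literal opened
        } else if (ch === "#") {
          return line.slice(0, i).trimEnd(); // drop the comment and trailing spaces
        }
      }
      return line;
    })
    .join("\n");
}

/** Runs the docstring pass first, then the hash-comment pass. */
function removeComments(content: string): string {
  return removeHashComments(removeDocStrings(content));
}

const source = [
  "def greet(name):",
  '    """Return a greeting."""',
  '    tag = "#1"  # the hash inside the string is kept',
  "    return tag + name",
].join("\n");

console.log(removeComments(source));
```

Keeping the two passes as separate functions is what would allow a later option to run only one of them.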
Thank you for this impressive work on improving our Python comment removal functionality. I really appreciate the effort you've put into this.
I'll take some time to review the changes in detail. Your work looks promising, and I'm looking forward to diving into it.
I've made a few commits to address some lint errors that came up. Please take a look at these changes when you have a moment.
Hi, @thecurz! Thank you for your excellent work on improving our Python comment removal functionality. I've reviewed the code, and it looks great.
Your attention to detail and the comprehensive test cases you've added are much appreciated.
I'm going to go ahead and merge this PR. Thanks again for your valuable contribution!
On a related note, I've noticed that our fileManipulator is growing quite large. We might consider splitting it into separate files in the future for better maintainability.
Summary
Altered files
- `src/core/file/fileManipulator.ts`: I added the methods `removeDocStrings()` and `removeHashComments()`, which `removeComments()` now calls, so that it's easier to later implement an option to delete only certain types of comments.
- `tests/core/file/fileManipulator.test.ts`: There are now 16 tests for Python comment removal.

Observations
Certain edge cases were deliberately left unhandled to avoid complexity. For example, in the code
only the first docstring is removed. However, such code is not standard practice in Python.
Time Complexity
- `removeDocStrings` function: **O(m * n)**, where `m` is the number of lines and `n` is the average length of each line
- `removeHashComments` function: **O((m * n) + m log m)**

Benchmark
To get a better idea of the practical running time, I ran a quick test in which the algorithm was applied to a repeated string of Python code containing several docstrings and comments. For 3 million lines, the algorithm ran in around one second on my computer. That is most likely fast enough, since LLMs won't take this many tokens anyway.
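A benchmark of this kind could be set up roughly as follows. This is only a sketch: the sample snippet, the helper names `buildInput` and `timeRemoval`, and the regex-based `naiveStrip` stand-in are all assumptions for illustration, not the code used for the measurement above. In practice the stand-in would be replaced with the real `removeComments` and the line count raised to the size of interest.

```typescript
// Hypothetical benchmark harness; all names and the sample snippet are illustrative.
const SNIPPET_LINES = [
  "def add(a, b):",
  '    """Add two numbers."""',
  "    return a + b  # sum",
];
const SNIPPET = SNIPPET_LINES.join("\n") + "\n";

/** Builds an input of at least `lineCount` lines by repeating SNIPPET. */
function buildInput(lineCount: number): string {
  const repeats = Math.ceil(lineCount / SNIPPET_LINES.length);
  return SNIPPET.repeat(repeats);
}

/** Times one call of `remove` on `input` and reports wall-clock milliseconds. */
function timeRemoval(
  remove: (content: string) => string,
  input: string,
): { elapsedMs: number; outputLength: number } {
  const start = Date.now();
  const output = remove(input);
  return { elapsedMs: Date.now() - start, outputLength: output.length };
}

// Naive stand-in: strips everything after any `#`, even inside strings.
// Swap in the real removal function to measure it.
const naiveStrip = (content: string): string =>
  content.replace(/[ \t]*#.*$/gm, "");

const demo = timeRemoval(naiveStrip, buildInput(30_000));
console.log(`processed 30000 lines in ${demo.elapsedMs} ms`);
```

A single wall-clock measurement is noisy, so repeating the call a few times and taking the median would give a steadier number.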
Next steps