Closed bagerard closed 3 years ago
omg... sorry for that. I'm using pip to install it locally, not poetry, so I had to clean those files temporarily... These manually triggered pipelines are really killing productivity; let's hope GitHub finds another solution to address the crypto-mining issues.
Alright, now it should be good. I've set it up with poetry locally.
I added an additional piece of code to skip the `# -*- coding: utf-8 -*-` line in a file (so that it doesn't waste time processing the file if that is the only `tokenize.COMMENT` found in it), and I added a test for that.
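The skip described above could look roughly like this. This is only an illustrative sketch, not the actual plugin code; `only_coding_comment` is a hypothetical helper name:

```python
import io
import tokenize

CODING_COMMENT = "# -*- coding:"


def only_coding_comment(source: str) -> bool:
    # Hypothetical helper: return True when the only COMMENT token in the
    # source is the encoding declaration, so the plugin can bail out early
    # instead of running eradicate on a file with no real comments.
    comments = [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.COMMENT
    ]
    return len(comments) == 1 and comments[0].startswith(CODING_COMMENT)
```

For example, `only_coding_comment("# -*- coding: utf-8 -*-\nx = 1\n")` would be true, while a file containing any other comment would not qualify.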
Hi, please re-run the pipeline when you have a chance ;)
Done!
Merging #210 (5f6b875) into master (6a5aab0) will not change coverage. The diff coverage is 100.00%.
```diff
@@           Coverage Diff            @@
##            master      #210  +/-  ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files            1         1
  Lines           41        42    +1
  Branches         5         7    +2
=========================================
+ Hits            41        42    +1
```

| Impacted Files | Coverage Δ |
|---|---|
| flake8_eradicate.py | `100.00% <100.00%> (ø)` |
Powered by Codecov. Last update 6a5aab0...5f6b875.
Note that consuming the `file_tokens` from flake8 made it slightly less trivial/efficient to ignore the file encoding `# -*- coding:` comment, so I ended up removing that optimization. It was not a major improvement anyway, and the `# -*- coding:` declaration is a thing of the past, so it may be better to avoid polluting the code with it (e.g. pyupgrade removes them by default: https://github.com/asottile/pyupgrade#-coding--comment).
Would you mind running the pipeline on it?
Let me know if there is anything else I should do, or if it's fine as it is.
@bagerard I will return to this PR sometime soon! Please, stay tuned 🙂
I finally got the time to work on this, sorry for the long wait!
@bagerard thanks a lot for your work! 👍
Fixes #176
I looked into this, and the issue was in the following block.
`self._tokens` actually contains the tokens from the beginning of the file up to the physical line yielded by flake8. This means `self._tokens` grows every time flake8 invokes this plugin on the next line, so searching in it gets more and more expensive for large files, since searching a large list is linear by design (the file in question is > 12000 lines, which makes a lot of tokens).
I switched the code to process the whole file instead of working line by line, so that it has to search for the comment tokens only once.
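The difference between the two approaches can be sketched roughly like this. This is illustrative code, not the actual plugin; the function names are made up, and the per-line version only mimics how re-scanning an ever-growing token list makes the total work quadratic in file size:

```python
import io
import tokenize

# A synthetic "large" file: 200 lines, each carrying a comment.
SOURCE = "\n".join(f"x{i} = {i}  # maybe dead" for i in range(200)) + "\n"


def find_comments_per_line(source: str) -> list[str]:
    # Old approach (sketch): the plugin runs once per physical line, and
    # each run searches the token list accumulated so far from the start.
    # Total work grows quadratically with the number of lines.
    all_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    found = []
    for lineno in range(1, source.count("\n") + 1):
        tokens_so_far = [t for t in all_tokens if t.start[0] <= lineno]
        found.extend(
            t.string
            for t in tokens_so_far
            if t.type == tokenize.COMMENT and t.start[0] == lineno
        )
    return found


def find_comments_whole_file(source: str) -> list[str]:
    # New approach (sketch): walk the token stream exactly once.
    return [
        t.string
        for t in tokenize.generate_tokens(io.StringIO(source).readline)
        if t.type == tokenize.COMMENT
    ]
```

Both versions find the same comments; only the amount of repeated scanning differs, which is where the speedup on a > 12000-line file comes from.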
On the long python file attached in #176, this makes the processing time go from 30 sec to just 2 sec :rocket: