mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0
688 stars 125 forks source link

Clean import statements #567

Closed stweil closed 5 months ago

stweil commented 5 months ago

The import statements were sorted with isort . and unused import statements were removed with fix8.

stweil commented 5 months ago

@mittagessen, I am not sure whether you want this kind of code changes, but I think that OCR-D will be forced to use a code style in the future and eScriptorium already enforces PEP 8. Therefore it might be a good idea to use the same tools which are used by eScriptorium – isort and flake8 for kraken, too.

After this pull request was applied, flake8 still reports 190 PEP 8 issues (grouped and sorted by issue number):

   6 E111 indentation is not a multiple of 4
   4 E114 indentation is not a multiple of 4 (comment)
   1 E117 over-indented
   8 E122 continuation line missing indentation or outdented
   1 E128 continuation line under-indented for visual indent
   2 E203 whitespace before ':'
   1 E222 multiple spaces after operator
  32 E231 missing whitespace after ','
   8 E231 missing whitespace after ':'
   1 E261 at least two spaces before inline comment
  74 E265 block comment should start with '# '
   2 E275 missing whitespace after keyword
   9 E302 expected 2 blank lines, found 1
   2 E305 expected 2 blank lines after class or function definition, found 1
   2 E501 line too long (163 > 160 characters)
   1 E501 line too long (164 > 160 characters)
   2 E501 line too long (166 > 160 characters)
   5 E501 line too long (167 > 160 characters)
   3 E501 line too long (169 > 160 characters)
   4 E501 line too long (170 > 160 characters)
   5 E501 line too long (171 > 160 characters)
   1 E501 line too long (172 > 160 characters)
   1 E501 line too long (181 > 160 characters)
   1 E501 line too long (225 > 160 characters)
   1 E501 line too long (274 > 160 characters)
   1 E501 line too long (344 > 160 characters)
   2 E712 comparison to True should be 'if cond is not True:' or 'if not cond:'
   1 E713 test for membership should be 'not in'
   1 E714 test for object identity should be 'is not'
   1 F841 local variable 'accuracy' is assigned to but never used
   1 F841 local variable 'data_module' is assigned to but never used
   3 F841 local variable 'e' is assigned to but never used
   1 F841 local variable 'is_verbose' is assigned to but never used
   1 F841 local variable 'metrics' is assigned to but never used
   1 W291 trailing whitespace

Should those be fixed? Running autopep8 -air . reduces the number of issues to 32.

mittagessen commented 5 months ago

After this pull request was applied, flake8 still reports 190 PEP 8 issues (grouped and sorted by issue number)

The vast majority of those are in the contrib/print_word_spreader.py script that is a bit ugly (but I don't really want to touch it). And I'm only getting 133 warnings with all but two being line too long warnings that aren't easily reformatted (mostly in lib/vgsl.py. I suspect you're linting on something else as variables like is_verbose do not occur in the code base.

stweil commented 5 months ago

After your latest linting changes, I still see 82 reports:

   8 E122 continuation line missing indentation or outdented
  47 E265 block comment should start with '# '
   2 E501 line too long (163 > 160 characters)
   1 E501 line too long (164 > 160 characters)
   1 E501 line too long (166 > 160 characters)
   4 E501 line too long (167 > 160 characters)
   3 E501 line too long (169 > 160 characters)
   3 E501 line too long (170 > 160 characters)
   5 E501 line too long (171 > 160 characters)
   1 E501 line too long (172 > 160 characters)
   1 E501 line too long (193 > 160 characters)
   1 E501 line too long (265 > 160 characters)
   1 E714 test for object identity should be 'is not'
   1 F841 local variable 'data_module' is assigned to but never used
   3 F841 local variable 'e' is assigned to but never used

My previous test included a local Python script mlmodel.py, the script which I wrote to fix model files. That was wrong, because it is not part of the kraken code.

stweil commented 5 months ago

autopep8 -ri . reduces the number of PEP 8 issues from 82 to 26. With autopep8 -aari . there remain only 12 of them.

mittagessen commented 5 months ago

You're linting auto-generated sphinx configuration files as well. I've touched the toxic waste dump that is contrib/print_word_spreader.py and NOQAed the lines that are too long in the actual source code and flake8 runs cleanly.

BTW autopep8 produces unreadable code from long lines.

stweil commented 5 months ago

No, I cleaned the directory before running flake8. All remaining 55 issues are in docs/conf.py.

And yes, autopep8 with too many -a parameters produces code which I also would not want in my projects.