sudharakab / pdfsizeopt

Automatically exported from code.google.com/p/pdfsizeopt
0 stars 0 forks source link

Static analysis warnings #80

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Dear Peter,

I am trying to read the source code of pdfsizeopt, but it is proving to be a 
little harder than what I thought at first.

The code doesn't follow many of the conventions of writing python programs (in 
particular, the PEP-8).

There is a program called pep8 that checks for those (available in Debian, 
Ubuntu and others). I just run:

    pep8 pdfsizeopt.py | cut -d : -f 4 | sort | uniq -c

and here is the result of the command:

     16  E101 indentation contains mixed spaces and tabs
   2155  E111 indentation is not a multiple of four
     76  E203 whitespace before '
      1  E203 whitespace before ','
      1  E221 multiple spaces before operator
      1  E222 multiple spaces after operator
      2  E225 missing whitespace around operator
      5  E261 at least two spaces before inline comment
      2  E262 inline comment should start with '# '
      1  E271 multiple spaces after keyword
      4  E301 expected 1 blank line, found 0
      6  E302 expected 2 blank lines, found 1
      2  E303 too many blank lines (2)
      1  E501 line too long (109 characters)
      1  E501 line too long (112 characters)
      1  E501 line too long (114 characters)
      1  E501 line too long (115 characters)
      2  E501 line too long (134 characters)
      2  E501 line too long (139 characters)
      1  E501 line too long (150 characters)
      1  E501 line too long (178 characters)
      1  E501 line too long (206 characters)
      1  E501 line too long (389 characters)
      1  E501 line too long (7282 characters)
     13  E501 line too long (80 characters)
      7  E501 line too long (81 characters)
      2  E501 line too long (82 characters)
      4  E501 line too long (83 characters)
      1  E501 line too long (84 characters)
      3  E501 line too long (86 characters)
      2  E501 line too long (88 characters)
      1  E501 line too long (89 characters)
      1  E501 line too long (93 characters)
      1  E501 line too long (97 characters)
      1  E501 line too long (99 characters)
     23  E701 multiple statements on one line (colon)
     16  W191 indentation contains tabs
      3  W293 blank line contains whitespace
      1  W602 deprecated form of raising exception

I understand that some of those you may not want to fix, depending on your 
plans of backwards compatibility with older versions of Python, though.

Which versions of Python would you like to support?

I may send you a patch or two, depending on how you answer.

Thanks for this great program, BTW,

Rogério.

Original issue reported on code.google.com by rbr...@gmail.com on 27 Feb 2013 at 8:36

GoogleCodeExporter commented 9 years ago
Just as an extra input, here is the output of pyflakes when run with pdfsizeopt:

pdfsizeopt.py:2639: local variable 'i0' is assigned to but never used
pdfsizeopt.py:2670: undefined name 'self'
pdfsizeopt.py:3269: local variable 'number_re' is assigned to but never used
pdfsizeopt.py:3270: local variable 'width' is assigned to but never used
pdfsizeopt.py:3271: local variable 'height' is assigned to but never used
pdfsizeopt.py:3390: local variable 'ihdr' is assigned to but never used
pdfsizeopt.py:3981: undefined name 'xref_ofs'
pdfsizeopt.py:4893: local variable 'stat' is assigned to but never used
pdfsizeopt.py:5131: local variable 'stat' is assigned to but never used
pdfsizeopt.py:5625: local variable 'stat' is assigned to but never used
pdfsizeopt.py:5687: local variable 'sourcefnq' is assigned to but never used
pdfsizeopt.py:5688: local variable 'targetfnq' is assigned to but never used
pdfsizeopt.py:5917: local variable 'stat' is assigned to but never used
pdfsizeopt.py:5920: undefined name 'pdf_tmp_file_name'
pdfsizeopt.py:6475: local variable 'has_ne' is assigned to but never used
pdfsizeopt.py:6812: local variable 'trailer_size' is assigned to but never used
pdfsizeopt.py:7080: local variable 'total_padding_size' is assigned to but 
never used
pdfsizeopt.py:7266: undefined name 'pdf_objs'
pdfsizeopt.py:7315: undefined name 'trailer_obj'
pdfsizeopt.py:7320: undefined name 'trailer_obj'
pdfsizeopt.py:7333: undefined name 'trailer_obj'
pdfsizeopt.py:7338: undefined name 'trailer_obj'
pdfsizeopt.py:7339: undefined name 'trailer_obj'
pdfsizeopt.py:7341: undefined name 'trailer_obj'
pdfsizeopt.py:7357: undefined name 'trailer_obj'
pdfsizeopt.py:7358: undefined name 'trailer_obj'
pdfsizeopt.py:7359: undefined name 'trailer_obj'
pdfsizeopt.py:7360: undefined name 'trailer_obj'
pdfsizeopt.py:7361: undefined name 'trailer_obj'
pdfsizeopt.py:7362: undefined name 'trailer_obj'
pdfsizeopt.py:7363: undefined name 'trailer_obj'
pdfsizeopt.py:7364: undefined name 'trailer_obj'
pdfsizeopt.py:7496: local variable 'stat' is assigned to but never used

Original comment by rbr...@gmail.com on 27 Feb 2013 at 8:41

GoogleCodeExporter commented 9 years ago
The output of pychecker:

pychecker --limit 100 pdfsizeopt.py 
Processing module pdfsizeopt (pdfsizeopt.py)...

Warnings...

pdfsizeopt.py:141: Using a conditional statement with a constant value (0)
pdfsizeopt.py:658: Parameter (cls) not used
pdfsizeopt.py:660: Parameter (cls) not used
pdfsizeopt.py:670: Parameter (cls) not used
pdfsizeopt.py:834: Using a conditional statement with a constant value (-1)
pdfsizeopt.py:984: Parameter (cls) not used
pdfsizeopt.py:1004: Function (ParseSimpleValue) has too many returns (12)
pdfsizeopt.py:1401: Parameter (cls) not used
pdfsizeopt.py:1637: Parameter (cls) not used
pdfsizeopt.py:1735: Parameter (cls) not used
pdfsizeopt.py:1825: (filter) shadows builtin
pdfsizeopt.py:1849: Parameter (cls) not used
pdfsizeopt.py:1860: Parameter (cls) not used
pdfsizeopt.py:2316: (filter) shadows builtin
pdfsizeopt.py:2321: (filter) shadows builtin
pdfsizeopt.py:2565: Using a conditional statement with a constant value (0)
pdfsizeopt.py:2606: Using the return value from (output.append) which is always 
None
pdfsizeopt.py:2624: Using a conditional statement with a constant value (0)
pdfsizeopt.py:2626: Using a conditional statement with a constant value (0)
pdfsizeopt.py:2633: Parameter (cls) not used
pdfsizeopt.py:2639: Local variable (i0) not used
pdfsizeopt.py:2670: No global (self) found
pdfsizeopt.py:2673: Parameter (cls) not used
pdfsizeopt.py:2715: Using a conditional statement with a constant value (0)
pdfsizeopt.py:2780: Format string argument count (2) doesn't match arguments (1)
pdfsizeopt.py:2780: Invalid arguments to (len), got 2, expected 1
pdfsizeopt.py:3022: Using a conditional statement with a constant value (0)
pdfsizeopt.py:3103: Using a conditional statement with a constant value (0)
pdfsizeopt.py:3105: Using a conditional statement with a constant value (0)
pdfsizeopt.py:3224: Using a conditional statement with a constant value (0)
pdfsizeopt.py:3306: (filter) shadows builtin
pdfsizeopt.py:3348: Using a conditional statement with a constant value (0)
pdfsizeopt.py:3390: Local variable (ihdr) not used
pdfsizeopt.py:3450: Using a conditional statement with a constant value (0)
pdfsizeopt.py:3605: Function (ParseUsingXrefStream) has too many local 
variables (41)
pdfsizeopt.py:3946: Parameter (cls) not used
pdfsizeopt.py:3981: No global (xref_ofs) found
pdfsizeopt.py:4188: Parameter (cls) not used
pdfsizeopt.py:4346: (filter) shadows builtin
pdfsizeopt.py:4891: Using a conditional statement with a constant value (0)
pdfsizeopt.py:4893: Local variable (stat) not used
pdfsizeopt.py:4897: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5129: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5131: Local variable (stat) not used
pdfsizeopt.py:5135: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5192: Parameter (cls) not used
pdfsizeopt.py:5236: Parameter (cls) not used
pdfsizeopt.py:5321: Function (UnifyType1CFonts) has too many local variables 
(54)
pdfsizeopt.py:5517: Invalid arguments to (Set), got 1, expected between 2 and 3
pdfsizeopt.py:5623: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5625: Local variable (stat) not used
pdfsizeopt.py:5628: Format string argument count (0) doesn't match arguments (1)
pdfsizeopt.py:5629: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5679: Parameter (cls) not used
pdfsizeopt.py:5687: Local variable (sourcefnq) not used
pdfsizeopt.py:5688: Local variable (targetfnq) not used
pdfsizeopt.py:5701: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5907: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5917: Local variable (stat) not used
pdfsizeopt.py:5920: No global (pdf_tmp_file_name) found
pdfsizeopt.py:5921: Using a conditional statement with a constant value (0)
pdfsizeopt.py:5928: Function (OptimizeImages) has too many lines (436)
pdfsizeopt.py:5928: Function (OptimizeImages) has too many local variables (58)
pdfsizeopt.py:5953: (filter) shadows builtin
pdfsizeopt.py:6364: INTERNAL ERROR -- STOPPED PROCESSING FUNCTION --
    Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/pychecker/warn.py", line 242, in _checkFunction
        _checkCode(code, codeSource)
      File "/usr/lib/python2.7/dist-packages/pychecker/warn.py", line 155, in _checkCode
        raise NotImplementedError('No DISPATCH member for op %r' % op)
    NotImplementedError: No DISPATCH member for op 50

pdfsizeopt.py:6475: Local variable (has_ne) not used
pdfsizeopt.py:6532: Object (descs) has no attribute (sort)
pdfsizeopt.py:6569: Parameter (cls) not used
pdfsizeopt.py:6589: (filter) shadows builtin
pdfsizeopt.py:6630: Parameter (offsets_out) not used
pdfsizeopt.py:6678: Local variable (trailer_ofs) not used
pdfsizeopt.py:6695: Local variable (obj_starts) not used
pdfsizeopt.py:6696: Local variable (length_objs) not used
pdfsizeopt.py:6716: INTERNAL ERROR -- STOPPED PROCESSING FUNCTION --
    Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/pychecker/warn.py", line 242, in _checkFunction
        _checkCode(code, codeSource)
      File "/usr/lib/python2.7/dist-packages/pychecker/warn.py", line 155, in _checkCode
        raise NotImplementedError('No DISPATCH member for op %r' % op)
    NotImplementedError: No DISPATCH member for op 50

pdfsizeopt.py:6812: Local variable (trailer_size) not used
pdfsizeopt.py:6858: Parameter (cls) not used
pdfsizeopt.py:6986: Parameter (cls) not used
pdfsizeopt.py:7004: Function (FixPdfFromMultivalent) has too many lines (249)
pdfsizeopt.py:7004: Function (FixPdfFromMultivalent) has too many local 
variables (56)
pdfsizeopt.py:7080: Local variable (total_padding_size) not used
pdfsizeopt.py:7253: INTERNAL ERROR -- STOPPED PROCESSING FUNCTION --
    Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/pychecker/warn.py", line 242, in _checkFunction
        _checkCode(code, codeSource)
      File "/usr/lib/python2.7/dist-packages/pychecker/warn.py", line 155, in _checkCode
        raise NotImplementedError('No DISPATCH member for op %r' % op)
    NotImplementedError: No DISPATCH member for op 50

pdfsizeopt.py:7463: Using a conditional statement with a constant value (0)
pdfsizeopt.py:7494: Using a conditional statement with a constant value (0)
pdfsizeopt.py:7496: Local variable (stat) not used
pdfsizeopt.py:7500: Using a conditional statement with a constant value (0)
pdfsizeopt.py:7588: Invalid arguments to (FixPdfFromMultivalent), got 1, 
expected between 2 and 5
pdfsizeopt.py:7665: Function (main) has too many lines (215)
pdfsizeopt.py:7796: Using a conditional statement with a constant value (0)

Original comment by rbr...@gmail.com on 27 Feb 2013 at 8:43

GoogleCodeExporter commented 9 years ago
Just for the record, there is a tool that I found called PythonTidy (see: 
<http://lacusveris.com/PythonTidy/>) with which I have successfully 
indented/fixed some bad Python code of other people in an automated way 
(actually, semi-automated, as I didn't like 100% of the changes that it 
performed).

Original comment by rbr...@gmail.com on 27 Feb 2013 at 8:47

GoogleCodeExporter commented 9 years ago
Thank you for mentioning these tools. I haven't been using any of those, so I 
learned a lot from your posts.

I haven't made my final decision yet. I'll update the issue once it's done.

Based on your post, in r225 I've fixed all the pep8 warnings in pdfsizeopt.py, 
except for the indentation, which I'm not planning to redo from 2 to 4 spaces. 
From now on I consider it an issue to be fixed if the following command prints 
any warnings:

  pep8 pdfsizeopt.py | grep -vE ': (E111|E203|E122|E123|E124|E125) '  

Original comment by pts...@gmail.com on 27 Feb 2013 at 9:41

GoogleCodeExporter commented 9 years ago
Thanks, Peter.

Just summarizing, here are the tools that automate/preform some static analysis 
of the code (or help with other ways):

* pep8
* pychecker (this one tries to import the program as a module to see if there 
are issues when the python interpreter analyzes the code)
* pyflakes
* pylint (this one is specially picky)
* PythonTidy
* Python coverage

By the way, the legibility of the code improved quite a bit.

Thanks for everything.

Original comment by rbr...@gmail.com on 27 Feb 2013 at 10:02

GoogleCodeExporter commented 9 years ago
Fixed all pyflakes warnings in r229.

From now on I consider it an issue to be fixed if any of the following commands 
prints anything:

  pep8 pdfsizeopt.py | grep -vE ': (E111|E203|E122|E123|E124|E125) ' 
  pyflakes pdfsizeopt.py

The only fals positive I've encountered with pyflakes:

  x = 5
  if ...:
    del x:
  else:
    print x  # Reports: undefined name 'x'

I fixed it by replacing `del x' with `x = None'. I actually prefer `del x', 
because it causes a NameError if we want to use x later.

Thank you for summarizing the Python static analysis tools.

Original comment by pts...@gmail.com on 3 Mar 2013 at 8:31

GoogleCodeExporter commented 9 years ago
I'm running the latest pyflakes (0.6.1) with Python 2.6, and I'm not getting 
this warning at all:

pdfsizeopt.py:3269: local variable 'number_re' is assigned to but never used

Do you run an older version, or are you running it with some non-default 
configuration?

Original comment by pts...@gmail.com on 3 Mar 2013 at 8:34

GoogleCodeExporter commented 9 years ago
Fixed those pychecker warnings which made sense in r230. pychecker has caught 
some real bugs.

pychecker warns on `assert 0, ...'. This can be fixed by `assert False, ...'.

There are many other false positives, which can't be easily disabled, so I 
won't run pychecker regularly.

False positives:

$ pychecker --limit=9999 pdfsizeopt.py
Processing module pdfsizeopt (pdfsizeopt.py)...

Warnings...

pdfsizeopt.py:845: Using a conditional statement with a constant value (-1)
pdfsizeopt.py:5870: Local variable (sourcefnq) not used
pdfsizeopt.py:5871: Local variable (targetfnq) not used
pdfsizeopt.py:6565: INTERNAL ERROR -- STOPPED PROCESSING FUNCTION --
        Traceback (most recent call last):
          File "/usr/local/lib/python2.6/dist-packages/pychecker/warn.py", line 242, in _checkFunction
            _checkCode(code, codeSource)
          File "/usr/local/lib/python2.6/dist-packages/pychecker/warn.py", line 155, in _checkCode
            raise NotImplementedError('No DISPATCH member for op %r' % op)
        NotImplementedError: No DISPATCH member for op 50

pdfsizeopt.py:6830: Parameter (offsets_out) not used
pdfsizeopt.py:6878: Local variable (trailer_ofs) not used
pdfsizeopt.py:6895: Local variable (obj_starts) not used
pdfsizeopt.py:6896: Local variable (length_objs) not used
pdfsizeopt.py:6916: INTERNAL ERROR -- STOPPED PROCESSING FUNCTION --
        Traceback (most recent call last):
          File "/usr/local/lib/python2.6/dist-packages/pychecker/warn.py", line 242, in _checkFunction
            _checkCode(code, codeSource)
          File "/usr/local/lib/python2.6/dist-packages/pychecker/warn.py", line 155, in _checkCode
            raise NotImplementedError('No DISPATCH member for op %r' % op)
        NotImplementedError: No DISPATCH member for op 50

pdfsizeopt.py:7456: INTERNAL ERROR -- STOPPED PROCESSING FUNCTION --
        Traceback (most recent call last):
          File "/usr/local/lib/python2.6/dist-packages/pychecker/warn.py", line 242, in _checkFunction
            _checkCode(code, codeSource)
          File "/usr/local/lib/python2.6/dist-packages/pychecker/warn.py", line 155, in _checkCode
            raise NotImplementedError('No DISPATCH member for op %r' % op)
        NotImplementedError: No DISPATCH member for op 50

pdfsizeopt.py:7788: Invalid arguments to (FixPdfFromMultivalent), got 1, 
expected between 2 and 5

Original comment by pts...@gmail.com on 3 Mar 2013 at 9:35

GoogleCodeExporter commented 9 years ago
Thank you for suggesting PythonTidy. I won't be running it, because I'm happy 
with running pep8 and fixing issues manually.

I won't fix the indentation step: PEP 8 requires 4 spaces and I'm using 2 
spaces. This is on purpose, keeping it as is.

Original comment by pts...@gmail.com on 3 Mar 2013 at 9:39

GoogleCodeExporter commented 9 years ago
Indeed, pylint prints hundrends of warnings. It also looks like configurable. 
Maybe in the future I'll start writing a config file, but as of now I won't be 
running pylint.

Original comment by pts...@gmail.com on 3 Mar 2013 at 9:40