Closed jendrikseipp closed 6 years ago
Original comment by Jendrik Seipp (Bitbucket: jendrikseipp, GitHub: jendrikseipp):
No worries, we're not in a hurry :)
Original comment by Florian Bruhin (Bitbucket: The-Compiler, GitHub: The-Compiler):
Sorry for not coming back to this earlier - it seems a sip version with my patch to have a virtual
attribute in the XML was released in the meantime, so I intend to pick this up again at some point, but I have a bit much on my plate right now.
Original comment by Jendrik Seipp (Bitbucket: jendrikseipp, GitHub: jendrikseipp):
I'd still prefer not to have the generated file in the repository. It is better to generate a version for a specific qt release. Also, anybody, who wants to make sure that they're using the latest version, will use the generator anyway.
Original comment by Florian Bruhin (Bitbucket: The-Compiler, GitHub: The-Compiler):
The file only would need updating when there's a new Qt release, which is about once all 6 months.
If we only include the generator with vulture, every downstream project would need to generate/commit/update their own whitelist file - IMHO it's clearly preferable to have this in a central location?
Original comment by Jendrik Seipp (Bitbucket: jendrikseipp, GitHub: jendrikseipp):
Nice work! Wouldn't it be better if we could specify the qt version on the command line? A function might be dead code in one version, but not in a different one. Can we find the list of FEATURES automatically?
I think it's best not to add generated files to repositories. Having only the generator is better, since then we don't have to update the generated file.
It would be great if you could prepare a pull request! I would propose to add the file under whitelists/pyqt/generator.py
.
Original comment by Florian Bruhin (Bitbucket: The-Compiler, GitHub: The-Compiler):
This is all surprisingly easy \o/
This is the script I have currently:
import os [11/5194]
import argparse
import itertools
import subprocess
import tempfile
from lxml import etree
TIMELINE = ['Qt_5_0_0', 'Qt_5_0_1', 'Qt_5_0_2', 'Qt_5_1_0', 'Qt_5_1_1',
'Qt_5_2_0', 'Qt_5_2_1', 'Qt_5_3_0', 'Qt_5_3_1', 'Qt_5_3_2',
'Qt_5_4_0', 'Qt_5_4_1', 'Qt_5_4_2', 'Qt_5_5_0']
PLATFORMS = ['WS_X11', 'WS_WIN', 'WS_MACX']
FEATURES =['PyQt_Accessibility', 'PyQt_SessionManager', 'PyQt_SSL',
'PyQt_qreal_double', 'Py_v3', 'PyQt_PrintDialog', 'PyQt_Printer',
'PyQt_PrintPreviewWidget', 'PyQt_PrintPreviewDialog',
'PyQt_RawFont', 'PyQt_OpenGL', 'PyQt_Desktop_OpenGL',
'PyQt_NotBootstrapped']
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument('--sip', help='The sip binary to use', default='sip',
nargs=1)
return parser.parse_args()
def run_sip(module, outdir, sip_executable):
for exclusive_tags in itertools.product(TIMELINE, PLATFORMS):
filename = '{}-{}.xml'.format(module, '-'.join(exclusive_tags))
outfile = os.path.join(outdir, filename)
cmdline = [sip_executable, '-m', outfile]
for tag in list(exclusive_tags) + FEATURES:
cmdline += ['-t', tag]
cmdline.append(
os.path.join(module, '{}mod.sip'.format(module)))
print(' {} -> {}'.format(', '.join(exclusive_tags), outfile))
subprocess.call(cmdline)
def parse_xmls(xmldir):
for basename in os.listdir(xmldir):
xmlfile = os.path.join(xmldir, basename)
with open(xmlfile, 'r') as f:
tree = etree.parse(f)
yield from tree.xpath('/Module/Class/Function[@virtual="1"]/@name')
def get_modules():
for filename in os.listdir('.'):
filepath = os.path.abspath(filename)
if os.path.isdir(filepath):
yield filename
def main():
args = parse_args()
with open('whitelist.py', 'w') as outfile:
for module in get_modules():
with tempfile.TemporaryDirectory() as tmpdir:
outfile.write('# {}\n'.format(module))
name_set = set()
print("Running sip for {}...".format(module))
run_sip(module, tmpdir, args.sip[0])
print("Parsing and merging XML files...")
print()
for name in parse_xmls(tmpdir):
name_set.add(name)
for name in sorted(name_set):
outfile.write('{}.{}\n'.format(module, name))
outfile.write('\n')
outfile.flush()
if __name__ == '__main__':
main()
After running for 1.5 hours (because it runs sip for all combinations of 14 Qt versions and 3 platforms) it spits out a file with around 2800 whitelist entries.
I'll still have to actually test it and clean everything up - after that, is it okay if I open a PR adding the whitelist and my script to the repo?
Original comment by Florian Bruhin (Bitbucket: The-Compiler, GitHub: The-Compiler):
I couldn't resist to take a quick look, but it looks easier than I thought!
sip has some (beta) XML export feature - I now submitted a patch so that XML includes a virtual="(1|0)"
attribute for Function
tag.
So the only piece left is a small python script which parses that XML and outputs all functions from there :)
Original comment by Jendrik Seipp (Bitbucket: jendrikseipp, GitHub: jendrikseipp):
Sounds interesting, please keep me posted.
Original comment by Florian Bruhin (Bitbucket: The-Compiler, GitHub: The-Compiler):
Overridden methods not being executed is a good point - with a whitelist, we could probably at least restrict that to virtual methods.
PyQt bindings code is generated from sip files, which is a special format - I wonder how hard it'd be to write a Python parser for those (possibly using the existing flex/bison grammar), and then generate a whitelist from that.
I'll take a look at it, but it might take me some weeks until I get time to do so.
Original comment by Jendrik Seipp (Bitbucket: jendrikseipp, GitHub: jendrikseipp):
Even if a C++ method is overriden, it might never get executed. Also, I don't think we can find out whether a method overrides a C++ method without executing and inspecting the code.
Original comment by Florian Bruhin (Bitbucket: The-Compiler, GitHub: The-Compiler):
PyQt5 has hundreds of classes, each with dozens of methods - and with every Qt release there will be new ones, so maintaining this will be a big effort, and the same issue will pop up for anyone who uses a C extension exposing classes...
Do you think there's an issue with the heuristic I presented above? I think it's safe to assume the programmer knows what they were doing if they're overriding an existing method of a C method. I also think something like that could be detected easily (trying if inspect.getargspec()
works on the class, for example).
Original comment by Jendrik Seipp (Bitbucket: jendrikseipp, GitHub: jendrikseipp):
I think the best solution to this would be to create a PyQt whitelist module for vulture. Do you want to start such an effort? I would recommend adding a file whitelists/pyqt.py and making the usages explicit in there.
Maybe we can detect which whitelist files to include automatically later (e.g. include whitelist/pyqt.py if at least one pyqt module is imported).
I conducted a little experiment with the qutebrowser's run_vulture.py
:
I changed the run
function to disable filter_func
(which previously whitelisted all the objects whose name was written in CamelCase) and added a vulture_whitelist
(obtained by running the above script for PyQt-5.10.1). Here is how the function looked like:
def run(files):
"""Run vulture over the given files."""
with tempfile.NamedTemporaryFile(mode='w', delete=False) as whitelist_file:
for line in whitelist_generator():
whitelist_file.write(line + '\n')
whitelist_file.close()
vult = vulture.Vulture(verbose=False)
vult.scavenge(files + [whitelist_file.name, 'vulture_whitelist.py'])
os.remove(whitelist_file.name)
vulture_attrs = ['unused_funcs', 'unused_props', 'unused_vars', 'unused_attrs']
items = []
for attr in vulture_attrs:
sub_items = getattr(vult, attr)
items.extend(sub_items)
return report(items)
After this, when I run vulture -e tox
, the output was:
vulture installed: attrs==17.4.0,colorama==0.3.9,cssutils==1.0.2,Jinja2==2.10,MarkupSafe==1.0,Pygments==2.2.0,pyPEG2==2.15.2,PyQt5==5.10.1,PyYAML==3.12,qutebrowser==1.2.1,sip==4.19.8,vulture==0.26
vulture runtests: PYTHONHASHSEED='3339987915'
vulture runtests: commands[0] | /Users/rahuljha/Documents/qutebrowser/.tox/vulture/bin/python scripts/dev/run_vulture.py
qutebrowser/browser/network/pac.py:102: unused function 'dnsResolve' (60% confidence)
qutebrowser/browser/network/pac.py:120: unused function 'myIpAddress' (60% confidence)
qutebrowser/browser/webengine/interceptor.py:42: unused function 'interceptRequest' (60% confidence)
qutebrowser/browser/webengine/webenginequtescheme.py:38: unused function 'requestStarted' (60% confidence)
qutebrowser/resources.py:5471: unused function 'qCleanupResources' (60% confidence)
scripts/dev/run_vulture.py:129: unused function 'filter_func' (60% confidence)
ERROR: InvocationError: '/Users/rahuljha/Documents/qutebrowser/.tox/vulture/bin/python scripts/dev/run_vulture.py'
______________________________________________ summary ______________________________________________
ERROR: vulture: commands failed
I couldn't find dnsResolve
, myIpAddress
, qCleanupResources
in any of the sip files, so I guess those are just false positives.
But, a simple grep shows that interceptRequest
and requestStarted
indeed are virtual functions which our script isn't catching.
sip/QtWebEngineCore/qwebengineurlrequestinterceptor.sip:33: virtual void interceptRequest(QWebEngineUrlRequestInfo &info) = 0;
sip/QtWebEngineCore/qwebengineurlschemehandler.sip:34: virtual void requestStarted(QWebEngineUrlRequestJob *) = 0;
Isn't that an anomaly in the script?
I think you mean tox -e vulture
?
Maybe the two virtual functions are not explicitly marked as virtual (I don't know how sip files work)?
@The-Compiler, can you help us out here?
I agree about the false-positives. dnsResolve
and myIpAddress
are exposed to JavaScript via the @_js_slot
decorator and vulture can't catch that. Dunno about qCleanupResources
, that's some internal PyQt-thing probably, which is called from C++ but without being actually something from Qt.
I agree about interceptRequest
and requestStarted
- those should be virtual. I'm not sure whether it's a bug in the script, or a bug in sip
somewhere - you can probably tell by reviewing the XML file which gets output by sip.
I think you mean tox -e vulture?
So sorry, my bad! :-p
I agree about interceptRequest and requestStarted - those should be virtual. I'm not sure whether it's a bug in the script, or a bug in sip somewhere - you can probably tell by reviewing the XML file which gets output by sip.
I am looking into it.
What's the status here? Can we link to your repo, Rahul and close this issue?
Hey Jendrik,
Ongoing action occurring at https://github.com/RJ722/vulture-whitelist-generators/tree/generate. I've an ongoing PR which is just one final review from @The-Compiler away from alpha release 1. 😛
Now that we have a repository for whitelist generators at https://github.com/RJ722/vulture-whitelist-generators, I think we can close this issue.
Originally reported by: Florian Bruhin (Bitbucket: The-Compiler, GitHub: The-Compiler)
When subclassing a method of a class defined in a C/C++ extension, such as this example using the PyQt GUI framework:
vulture doesn't detect that
sizeHint
is called via C++:Maybe if a class inherits from a non-Python class, and an existing method is overridden, it should always be considered used?