Closed rosafilgueira closed 3 years ago
We could use pydeps2requirements for that (I have to update it locally to fix a small error), and possibly improve it.
Note: It uses the output of pydeps (the module dependency tool discussed in #26).
TEST 1: pydeps test/code_inspector/src/ --max-bacon=0 --show-raw-deps --nodot --noshow | python pydeps2requirements.py
astor # from: src.staticfg.model
cdmcfparser # from: src.code_inspector
click # from: src.code_inspector
colorama # from: click._compat
docstring_parser # from: src.code_inspector
graphviz # from: src.staticfg.model
TEST 2: pydeps /Users/rosafilgueira/HW-Work/Research/somef/src/somef --max-bacon=0 --show-raw-deps --nodot --noshow | python pydeps2requirements.py
IPython # from: ipykernel.connect, ipykernel.displayhook, ipykernel.e...
OpenSSL # from: urllib3.contrib.pyopenssl
PIL # from: ipykernel.pylab.config, matplotlib.backend_bases, mat...
PyQt5 # from: IPython.external.qt_loaders, PIL.ImageQt, pandas.io.c...
_cffi_backend # from: cffi.api, cffi.commontypes, cffi.verifier
_distutils_hack # from: setuptools
_pytest # from: pytest
_scandir # from: scandir
appnope # from: ipykernel.eventloops
asn1crypto # from: cryptography.hazmat.backends.openssl.backend, cryptog...
atomicwrites # from: _pytest.assertion.rewrite
attr # from: _pytest._code.code, _pytest.fixtures, pytest.main, ...
backcall # from: IPython.core.events
bcrypt # from: paramiko.ed25519key, paramiko.pkey
botocore # from: pandas.io.s3
bs4 # from: pandas.io.html, somef.cli, somef.createExcerpts, soup...
certifi # from: requests.certs
cffi # from: PIL.Image, PIL.PyAccess, scipy._lib._ccallback, zmq.u...
chardet # from: bs4.dammit, html5lib._inputstream, pygments.lexer, re...
click # from: somef.main
click_option_group # from: somef.main
colorama # from: IPython.terminal.interactiveshell, click._compat, jed...
cryptography # from: OpenSSL.SSL, OpenSSL._util, OpenSSL.crypto, paramiko....
cycler # from: matplotlib.pyplot, matplotlib.rcsetup
dateutil # from: ipyparallel.util, jupyter_client.jsonutil, jupyter_cl...
decorator # from: IPython.core.formatters, IPython.core.history, IPytho...
defusedxml # from: nbconvert.filters.strings
dill # from: ipykernel.pickleutil, ipyparallel.client._joblib, ipy...
funcsigs # from: _pytest.compat
html5lib # from: bs4.builder._html5lib, rdflib.term
idna # from: cryptography.x509.general_name, jsonschema._format, r...
importlib_metadata # from: jsonschema, markdown.util
ipykernel # from: IPython, IPython.core.display, IPython.core.pylabtool...
ipyparallel # from: ipykernel.pickleutil, ipykernel.zmqshell
ipython_genutils # from: ipykernel.comm.manager, ipykernel.connect, ipykernel....
ipywidgets # from: tqdm.notebook
isodate # from: rdflib.term
jedi # from: IPython.core.completer
jinja2 # from: nbconvert.exporters.html, nbconvert.exporters.templat...
joblib # from: ipyparallel.client._joblib, ipyparallel.client.view, ...
jsonschema # from: nbformat.validator
jupyter_client # from: IPython.kernel, ipykernel.connect, ipykernel.displayh...
jupyter_core # from: ipykernel.kernelapp, ipykernel.zmqshell, jupyter_clie...
kiwisolver # from: matplotlib._layoutbox
lxml # from: bs4.builder._lxml, html5lib.treebuilders.etree_lxml, ...
markdown # from: somef.cli, somef.createExcerpts
markupsafe # from: jinja2, jinja2.Environment, jinja2.asyncsupport, jinj...
matplotlib # from: IPython.core.pylabtools, ipykernel.pylab.backend_inli...
mistune # from: nbconvert.filters.markdown_mistune
more_itertools # from: _pytest.python_api
mpi4py # from: ipyparallel.engine.engine
nacl # from: paramiko.ed25519key
nbconvert # from: IPython.utils.io, notebook.services.contents.filemana...
nbformat # from: IPython.core.interactiveshell, IPython.core.magics.ba...
networkx # from: nltk.parse.DependencyGraph, nltk.parse.dependencygraph
nltk # from: somef.configuration, textblob.base, textblob.blob, te...
nose # from: IPython.external.decorators._decorators, IPython.exte...
notebook # from: IPython.html, IPython.utils.io, nbconvert.preprocesso...
numpy # from: IPython.core.formatters, IPython.core.magics.namespac...
pandas # from: networkx.convert, networkx.convert_matrix, sklearn.ut...
paramiko # from: zmq.ssh.tunnel
parso # from: jedi.api, jedi.api.classes, jedi.api.completion, jedi...
pathlib2 # from: PIL.Image, _pytest.pathlib, importlib_metadata._compa...
pexpect # from: IPython.utils._process_posix, zmq.ssh.tunnel
pickleshare # from: IPython.core.interactiveshell
pluggy # from: _pytest._code.code, _pytest.config, _pytest.hookspec
prometheus_client # from: notebook.base.handlers, notebook.metrics
prompt_toolkit # from: IPython.terminal.debugger, IPython.terminal.interacti...
psutil # from: joblib.externals.loky.backend.utils, joblib.externals...
ptyprocess # from: pexpect.pty_spawn, terminado.management
pvectorc # from: pyrsistent._pvector
py # from: _pytest._code.code, _pytest._code.source, _pytest.ass...
pycparser # from: cffi.cparser
pygments # from: IPython.core.oinspect, IPython.lib.display, IPython.l...
pygraphviz # from: networkx.drawing.nx_agraph
pyrsistent # from: jsonschema._types
pytest # from: _pytest.compat, _pytest.config, _pytest.fixtures, _py...
pytz # from: pandas.core.arrays.datetimes, pandas.core.dtypes.dtyp...
rdflib # from: somef.data_to_graph, somef.schema.software_schema
regex # from: nltk.classify.textcat, nltk.tokenize.casual
requests # from: jsonschema.validators, nltk.parse.corenlp, somef.cli,...
scandir # from: pathlib2
scipy # from: networkx.algorithms.assortativity.correlation, networ...
send2trash # from: notebook.services.contents.filemanager
setuptools # from: cffi.ffiplatform, numpy.distutils.command.bdist_rpm, ...
sip # from: IPython.external.qt_loaders
six # from: OpenSSL.SSL, OpenSSL._util, OpenSSL.crypto, pytest....
sklearn # from: nltk.classify.scikitlearn, nltk.parse.transitionparser
soupsieve # from: bs4.element
sqlalchemy # from: pandas.io.sql
terminado # from: notebook.terminal, notebook.terminal.handlers
testpath # from: IPython.testing.plugin.ipdoctest, nbconvert.exporters...
textblob # from: somef.header_analysis
threadpoolctl # from: sklearn.cluster._kmeans
tornado # from: ipykernel.ipkernel, ipykernel.kernelapp, ipykernel.ke...
tqdm # from: nltk.util
traitlets # from: IPython.core.alias, IPython.core.application, IPython...
typing_extensions # from: sklearn.externals._arff
urllib3 # from: requests, requests.adapters, requests.exceptions, req...
wcwidth # from: IPython.terminal.ptutils, prompt_toolkit.utils
webencodings # from: html5lib._inputstream
yaml # from: networkx.readwrite.nx_yaml, nltk.data
zipp # from: importlib_metadata
zmq # from: ipykernel.eventloops, ipykernel.heartbeat, ipykernel....
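The TEST 2 output mixes packages that somef imports itself with packages that are only pulled in by other packages. A rough sketch of how that list could be narrowed to direct dependencies is below. It assumes one `name # from: importers` entry per line, and the helper names `parse_raw_deps` and `direct_deps` are made up for illustration; they are not part of pydeps or pydeps2requirements.

```python
def parse_raw_deps(text):
    """Parse lines of the form 'name # from: mod1, mod2' into {name: [importers]}."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "# from:" not in line:
            continue
        name, _, importers = line.partition("# from:")
        deps[name.strip()] = [m.strip() for m in importers.split(",") if m.strip()]
    return deps


def direct_deps(deps, project_prefix):
    """Keep only packages imported by at least one module of the project itself."""
    return sorted(
        name
        for name, importers in deps.items()
        if any(m == project_prefix or m.startswith(project_prefix + ".")
               for m in importers)
    )


# A few entries copied from the TEST 2 output.
sample = """\
bs4 # from: pandas.io.html, somef.cli, somef.createExcerpts, soup...
click # from: somef.main
requests # from: jsonschema.validators, nltk.parse.corenlp, somef.cli,...
zipp # from: importlib_metadata
"""

print(direct_deps(parse_raw_deps(sample), "somef"))
```

One caveat: the importer lists are truncated with "...", so a package whose somef importer falls after the cut would be missed by this filter.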
Will we know which version should be used? Sometimes there are incompatibilities. I think this is a potentially interesting problem. We should definitely keep this issue open and revisit it later (e.g., when creating Dockerfiles that build and run a component).
Another tool that I found for this is pipreqs
pipreqs code_inspector --- it automatically generates a file called requirements.txt
requirements.txt generated by pipreqs:
cdmcfparser==2.3.2
Click==7.0
matplotlib==3.1.1
astor==0.8.1
graphviz==0.16
docstring_parser==0.7.3
networkx==2.5.1
Original requirements.txt (written by us manually):
cdmcfparser==2.3.2
docstring_parser==0.7
astor
graphviz
click
Note: matplotlib==3.1.1 and networkx are used by code_visualization.py (and it is true that those libraries are needed to run this script.)
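To make this kind of comparison repeatable, a small diff of the two files could look like the sketch below. The functions `parse_requirements` and `compare` are hypothetical helpers written for this comment. They only understand plain `name` and `name==version` lines, and they compare names case-insensitively (so `Click` and `click` match).

```python
import re


def parse_requirements(text):
    """Map lower-cased package name -> pinned version (None if unpinned)."""
    reqs = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = re.match(r"^([A-Za-z0-9_.\-]+)\s*(?:==\s*(\S+))?", line)
        if match:
            reqs[match.group(1).lower()] = match.group(2)
    return reqs


def compare(generated, manual):
    """Report packages unique to either file and pins that disagree."""
    gen, man = parse_requirements(generated), parse_requirements(manual)
    return {
        "only_generated": sorted(set(gen) - set(man)),
        "only_manual": sorted(set(man) - set(gen)),
        # Only flag a difference when the manual file actually pins a version.
        "version_differs": sorted(
            name for name in set(gen) & set(man)
            if man[name] is not None and gen[name] != man[name]
        ),
    }


pipreqs_output = """\
cdmcfparser==2.3.2
Click==7.0
matplotlib==3.1.1
astor==0.8.1
graphviz==0.16
docstring_parser==0.7.3
networkx==2.5.1
"""

manual_file = """\
cdmcfparser==2.3.2
docstring_parser==0.7
astor
graphviz
click
"""

print(compare(pipreqs_output, manual_file))
```

On the two files above this reports matplotlib and networkx as found only by pipreqs, and docstring_parser as pinned differently (0.7.3 vs 0.7).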
Very cool. This one: https://github.com/jupyterhub/repo2docker is the one used to create Dockerfiles, so I guess it may use it underneath.
pipreqsnb: a simple, fully compatible pipreqs wrapper that supports Python files and Jupyter notebooks.
A blog post comparing dependency-detection tools.
I ran pigar on code_inspector (after removing code_visualization.py) and this is the resulting requirements.txt. (It is also very good!)
pigar -P code_inspector requirements_pigar.txt
Notes about Pigar:
Pigar is also compatible with Notebooks. Another interesting blog using pigar is this one
Pigar parses the AST rather than using regular expressions, so it can also extract dependencies from the arguments of exec/eval calls and from doctests in docstrings.
In addition, Pigar handles differences between Python versions well. For example, concurrent.futures is part of the standard library from Python 3.2 onward; in earlier versions, the third-party library futures has to be installed to use it. Pigar can tell these cases apart. (PS: pipreqs also supports this. For details, see this pull request: https://github.com/bndr/pipreqs/pull/80 )
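To illustrate why AST parsing helps, here is a minimal sketch using Python's standard `ast` module. It is not Pigar's actual implementation; the function names are invented for this example, and it makes no attempt to separate standard-library modules from third-party ones.

```python
import ast


def imported_top_level(source):
    """Return the top-level package names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import, which is not a dependency.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def imports_in_exec_strings(source):
    """Look for imports inside string literals passed directly to exec()/eval()."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in ("exec", "eval")
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            try:
                names |= imported_top_level(node.args[0].value)
            except SyntaxError:
                pass  # the string is not valid Python; ignore it
    return names


sample = '''
import os.path
from . import sibling
from bs4 import BeautifulSoup
text = "import fake_module"  # a naive regex could match this string
exec("import yaml")
'''

print(sorted(imported_top_level(sample)))
print(sorted(imports_in_exec_strings(sample)))
```

The string assigned to `text` is never reported as an import, while the `exec` argument is picked up by the second function.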
-- Quick Test: Using Pigar and Pyreqs for Somef ---
pigar -P /Users/rosafilgueira/HW-Work/Research/somef somef_requirements_piga.txt
pyreqs /Users/rosafilgueira/HW-Work/Research/somef somef_requirements_pyreqs.txt
I think for this we should probably have a corpus of tricky ones and give it a try. I am sure they will all work with code inspector, but once you start digging we will find some issues.
Ok! I have integrated Pigar in code_inspector. To do that, I have created a method called find_requirements in which we call Pigar (it will be very easy to call another tool instead if we want to in the future).
The requirements are integrated in the final json file :) ! I delete the intermediate "txt file" created by Pigar, but in the future we might want to keep it, since we might need it to install the requirements and later try to execute a repository automatically.
Pigar has an option to search PyPI for the missing modules and filter out some unnecessary modules, which it asks about with a y/N prompt. I have configured it to answer 'y', but we might want to change this in the future (answering 'N'). It will also be easy to expose this as an additional parameter.
Furthermore, I have created an additional argument (-r) to indicate whether we want to find the requirements (by default it does not).
For example: python code_inspector.py -i /Users/rosafilgueira/HW-Work/Research/somef -r
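A minimal sketch of how such a step might be structured is below. This is not the actual code_inspector implementation. The tool's command line is passed in by the caller, since the exact Pigar flags are not spelled out here, and the parameter names (`tool_cmd`, `output_file`, `answer`, `keep_file`) as well as the long option names are assumptions made for illustration.

```python
import argparse
import os
import subprocess
import sys
import tempfile


def find_requirements(tool_cmd, output_file, answer="y\n", keep_file=False):
    """Run an external requirements tool and return {package: version or None}.

    tool_cmd    -- full command line as a list (hypothetical; supplied by the caller)
    output_file -- the requirements file that the tool is expected to write
    answer      -- text fed to stdin in case the tool asks a y/N question
    keep_file   -- keep the intermediate txt file instead of deleting it
    """
    subprocess.run(tool_cmd, input=answer, text=True, check=True,
                   stdout=subprocess.DEVNULL)
    requirements = {}
    with open(output_file) as handle:
        for line in handle:
            line = line.split("#", 1)[0].strip()  # skip comments and blanks
            if not line:
                continue
            name, sep, version = line.partition("==")
            requirements[name.strip()] = version.strip() if sep else None
    if not keep_file:
        os.remove(output_file)
    return requirements


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--input_path", required=True)
    # Off by default, mirroring the -r behaviour described above.
    parser.add_argument("-r", "--requirements", action="store_true")
    return parser


# Demo with a stand-in "tool" (a tiny Python one-liner) so the sketch runs
# without Pigar installed; a real call would pass the Pigar command instead.
out = os.path.join(tempfile.mkdtemp(), "requirements.txt")
fake_tool = [sys.executable, "-c",
             f"open({out!r}, 'w').write('click==7.0\\nastor\\n')"]

args = build_parser().parse_args(["-i", "some/repo", "-r"])
found = find_requirements(fake_tool, out) if args.requirements else {}
print(found)
```

Keeping the command as a parameter is one way to make it easy to swap Pigar for another tool later, and `keep_file=True` covers the case where the txt file is needed to install the requirements.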
Already supported by integrating pigar
Using the module dependency graph, we could work out which modules need to be installed and build the requirements.txt automatically.