Open nielsbauman opened 2 years ago
Motivation: Can we improve the maintainability and robustness of ML projects with respect to their dependencies?
Data:
Motivating example: Many machine learning projects and paper implementations are written in Python 2, which is no longer supported. Migrating them to Python 3 is difficult, and so is installing their dependencies under Python 2. If a security vulnerability is found in the code, it is hard to maintain or fix existing systems that rely on deprecated dependencies.
Tool to extract module dependencies: https://pypi.org/project/findimports/
Candidate projects: a list of over 4000 repositories from this Microsoft paper
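To give an idea of what extracting module dependencies involves, here is a minimal sketch using only Python's standard `ast` module. It is not the findimports API; the function name `top_level_imports` and the sample source are made up for illustration.

```python
import ast


def top_level_imports(source):
    """Return the set of top-level package names imported by a source string.

    Hypothetical helper for illustration; not part of findimports.
    """
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" depends on the top-level package "a"
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0): they point inside the project
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


example = (
    "import numpy as np\n"
    "from sklearn.model_selection import train_test_split\n"
    "from . import utils\n"
)
print(sorted(top_level_imports(example)))  # ['numpy', 'sklearn']
```

One caveat that ties back to the motivating example: `ast.parse` running under Python 3 raises `SyntaxError` on Python 2-only syntax such as `print "x"`, so a scan like this would have to catch that error, which could itself serve as a rough signal that a file is Python 2 code.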
Methodology:
Dependencies
- Look at practicality (what is the problem we are solving?)
- Too many dependencies: impact on deployment, etc.
  - Data science (DS) pulls in as many dependencies as needed to make things work
  - Software engineering (SE) removes dependencies
  - Example: pandas may not be needed but is commonly pulled into DS projects; there are often ways to avoid including it
- Evolution aspect
  - From build to deployment
- Dependencies between ML microservices (cross-boundary issues)
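The pandas example above suggests one concrete check: compare the packages a project declares against the modules its code actually imports. The sketch below is only a rough illustration. All function names are hypothetical, and it assumes a pip-style `requirements.txt` and that the distribution name matches the import name, which is not true in general (for example, `scikit-learn` is imported as `sklearn`).

```python
import ast
import re


def imported_modules(source):
    """Top-level module names imported (absolutely) by a source string."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            found.add(node.module.split(".")[0])
    return found


def declared_packages(requirements_text):
    """Package names listed in pip-style requirements text (lowercased)."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#")[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def possibly_unused(requirements_text, sources):
    """Declared packages that no source file imports (under the name assumption)."""
    used = set()
    for source in sources:
        used |= {name.lower() for name in imported_modules(source)}
    return declared_packages(requirements_text) - used


reqs = "numpy>=1.20\npandas==1.3.0  # maybe not needed\n"
code = "import numpy as np\n"
print(possibly_unused(reqs, [code]))  # {'pandas'}
```

A result from this check is a candidate for removal, not proof: a package may be needed indirectly, loaded dynamically, or imported under a different name.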
We could try to identify how dependencies affect the maintainability of an ML project.
Possible approaches: