dsynkov / spark-livy-on-airflow-workspace

A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.

Suggest to loosen the dependency on requests #3

Open Agnes-U opened 2 years ago

Agnes-U commented 2 years ago

Hi, your project spark-livy-on-airflow-workspace pins "requests==2.23.0" in its dependencies. After analyzing the source code, we found that requests 2.22.0 and 2.24.0 are also suitable and do not affect your project. We therefore suggest loosening the dependency from "requests==2.23.0" to "requests>=2.22.0,<=2.24.0" to avoid conflicts when more packages are imported or when downstream projects use spark-livy-on-airflow-workspace.
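To make the suggested range concrete, here is a minimal sketch of which releases "requests>=2.22.0,<=2.24.0" would admit. It assumes plain X.Y.Z release strings and is not a full PEP 440 implementation; the helper names `parse_version` and `in_suggested_range` are hypothetical and exist only for illustration.

```python
# Hypothetical helpers: a simplified range check, not a PEP 440 parser.
def parse_version(text):
    """Turn a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


LOWER = parse_version("2.22.0")  # inclusive lower bound from the suggestion
UPPER = parse_version("2.24.0")  # inclusive upper bound from the suggestion


def in_suggested_range(version):
    """Return True when LOWER <= version <= UPPER (both bounds inclusive)."""
    return LOWER <= parse_version(version) <= UPPER


for candidate in ["2.21.0", "2.22.0", "2.23.0", "2.24.0", "2.25.0"]:
    print(candidate, in_suggested_range(candidate))
```

Tuples of integers compare element by element, so "2.9.0" correctly sorts before "2.22.0", which a plain string comparison would get wrong.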

May I open a pull request to loosen the dependency on requests?

By the way, could you tell us whether this kind of dependency analysis would make it easier to maintain dependencies during development?

Agnes-U commented 2 years ago

For your reference, here are details in our analysis.

Your project spark-livy-on-airflow-workspace directly uses 1 API from the requests package.

requests.sessions.Session.__init__

From this API, 17 functions are then called indirectly: 12 internal requests APIs and 5 external APIs, as follows (some repeated occurrences are omitted).

[/dsynkov/spark-livy-on-airflow-workspace]
+--requests.sessions.Session.__init__
|      +--requests.utils.default_headers
|      |      +--requests.structures.CaseInsensitiveDict.__init__
|      |      |      +--collections.OrderedDict
|      |      +--requests.utils.default_user_agent
|      +--requests.hooks.default_hooks
|      +--requests.cookies.cookiejar_from_dict
|      |      +--requests.cookies.RequestsCookieJar.__init__
|      |      +--requests.cookies.RequestsCookieJar.set_cookie
|      |      +--requests.cookies.create_cookie
|      +--collections.OrderedDict
|      +--requests.sessions.Session.mount
|      |      +--collections.OrderedDict.pop
|      +--requests.adapters.HTTPAdapter.__init__
|      |      +--urllib3.util.retry.Retry
|      |      +--urllib3.util.retry.Retry.from_int
|      |      +--requests.adapters.BaseAdapter.__init__
|      |      +--requests.adapters.HTTPAdapter.init_poolmanager
|      |      |      +--urllib3.poolmanager.PoolManager
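The counts quoted for the call graph can be cross-checked with a short sketch. The callee names below are copied from the tree above. Treating every name that starts with `requests.` as internal is an assumption about how the analysis classifies APIs, and the variable names are hypothetical.

```python
# Callees of requests.sessions.Session.__init__, copied from the tree,
# including the one repeated entry (collections.OrderedDict).
CALLEES = [
    "requests.utils.default_headers",
    "requests.structures.CaseInsensitiveDict.__init__",
    "collections.OrderedDict",
    "requests.utils.default_user_agent",
    "requests.hooks.default_hooks",
    "requests.cookies.cookiejar_from_dict",
    "requests.cookies.RequestsCookieJar.__init__",
    "requests.cookies.RequestsCookieJar.set_cookie",
    "requests.cookies.create_cookie",
    "collections.OrderedDict",
    "requests.sessions.Session.mount",
    "collections.OrderedDict.pop",
    "requests.adapters.HTTPAdapter.__init__",
    "urllib3.util.retry.Retry",
    "urllib3.util.retry.Retry.from_int",
    "requests.adapters.BaseAdapter.__init__",
    "requests.adapters.HTTPAdapter.init_poolmanager",
    "urllib3.poolmanager.PoolManager",
]

unique = set(CALLEES)  # drop repeated occurrences
# Assumption: "internal" means the qualified name lives under requests.
internal = {name for name in unique if name.startswith("requests.")}
external = unique - internal

print(len(unique), len(internal), len(external))  # 17 12 5
```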

We scanned the versions of requests and observed that, between 2.23.0 and any version in [2.22.0, 2.24.0], the changed functions (diffs listed below) have no intersection with any function or API mentioned above, whether called directly or indirectly by this project.

diff: 2.23.0(original) 2.22.0
['requests.models.PreparedRequest.prepare_url', 'requests.api.head', 'requests.sessions.Session.get_adapter', 'requests.auth.HTTPDigestAuth.handle_401', 'requests.models.PreparedRequest', 'requests.auth.HTTPDigestAuth', 'requests.models.Response', 'requests.utils.to_key_val_list', 'requests.api.request', 'requests.utils.get_netrc_auth', 'requests.sessions.SessionRedirectMixin', 'requests.sessions.Session', 'requests.sessions.SessionRedirectMixin.rebuild_auth', 'requests.auth._basic_auth_str', 'requests.models.Response.__init__', 'requests.sessions.SessionRedirectMixin.resolve_redirects', 'requests.utils.from_key_val_list']

diff: 2.23.0(original) 2.24.0
['requests.exceptions.RequestsDependencyWarning', 'requests.models.PreparedRequest', 'requests.exceptions.StreamConsumedError', 'requests.exceptions.RequestsWarning', 'requests.models.Response.raise_for_status', 'requests.models.Response', 'requests.sessions.Session.send', 'requests.exceptions.UnrewindableBodyError', 'requests.models.PreparedRequest.prepare_body', 'requests.sessions.Session', 'requests.exceptions.ContentDecodingError', 'requests.exceptions.FileModeWarning']
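The no-intersection claim can be illustrated with a set comparison over the names listed in this comment. This is only a sketch: it matches fully qualified names exactly, which is an assumption, since the issue does not say how the original analysis treats class-level entries such as `requests.sessions.Session`. The function name `overlap` is hypothetical, and only the requests names from the call graph are included because the diffs contain requests names only.

```python
# requests functions reached from this project, copied from the call graph.
CALLED = {
    "requests.sessions.Session.__init__",
    "requests.utils.default_headers",
    "requests.structures.CaseInsensitiveDict.__init__",
    "requests.utils.default_user_agent",
    "requests.hooks.default_hooks",
    "requests.cookies.cookiejar_from_dict",
    "requests.cookies.RequestsCookieJar.__init__",
    "requests.cookies.RequestsCookieJar.set_cookie",
    "requests.cookies.create_cookie",
    "requests.sessions.Session.mount",
    "requests.adapters.HTTPAdapter.__init__",
    "requests.adapters.BaseAdapter.__init__",
    "requests.adapters.HTTPAdapter.init_poolmanager",
}

# Changed names between 2.23.0 and 2.22.0, copied from the first diff.
DIFF_2_22_0 = {
    "requests.models.PreparedRequest.prepare_url", "requests.api.head",
    "requests.sessions.Session.get_adapter",
    "requests.auth.HTTPDigestAuth.handle_401",
    "requests.models.PreparedRequest", "requests.auth.HTTPDigestAuth",
    "requests.models.Response", "requests.utils.to_key_val_list",
    "requests.api.request", "requests.utils.get_netrc_auth",
    "requests.sessions.SessionRedirectMixin", "requests.sessions.Session",
    "requests.sessions.SessionRedirectMixin.rebuild_auth",
    "requests.auth._basic_auth_str", "requests.models.Response.__init__",
    "requests.sessions.SessionRedirectMixin.resolve_redirects",
    "requests.utils.from_key_val_list",
}

# Changed names between 2.23.0 and 2.24.0, copied from the second diff.
DIFF_2_24_0 = {
    "requests.exceptions.RequestsDependencyWarning",
    "requests.models.PreparedRequest",
    "requests.exceptions.StreamConsumedError",
    "requests.exceptions.RequestsWarning",
    "requests.models.Response.raise_for_status", "requests.models.Response",
    "requests.sessions.Session.send",
    "requests.exceptions.UnrewindableBodyError",
    "requests.models.PreparedRequest.prepare_body",
    "requests.sessions.Session",
    "requests.exceptions.ContentDecodingError",
    "requests.exceptions.FileModeWarning",
}


def overlap(called, changed):
    """Return the names that are both called and changed (exact match)."""
    return called & changed


print(overlap(CALLED, DIFF_2_22_0))  # set()
print(overlap(CALLED, DIFF_2_24_0))  # set()
```

Under exact matching both intersections are empty, which is consistent with the claim above. A stricter check could also flag a called method whose enclosing class appears in a diff.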

As for other packages, the collections and urllib3 APIs are called by requests in the call graph, and the dependencies on these packages stay the same across our suggested versions, which avoids any outside conflict.

Therefore, we believe it is quite safe to loosen your dependency on requests from "requests==2.23.0" to "requests>=2.22.0,<=2.24.0". This will improve the applicability of spark-livy-on-airflow-workspace and reduce the chance of future dependency conflicts with other projects.