minimaxir / download-tweets-ai-text-gen

Python script to download public Tweets from a given Twitter account into a format suitable for AI text generation.
MIT License

Suggest to loosen the dependency on twint #37

Open Agnes-U opened 2 years ago

Agnes-U commented 2 years ago

Hi, your project download-tweets-ai-text-gen pins "twint==2.1.4" in its dependencies. After analyzing the source code, we found that the following versions of twint also appear suitable without affecting your project: 2.1.3, 2.1.5, 2.1.6, and 2.1.7. We therefore suggest loosening the dependency from "twint==2.1.4" to "twint>=2.1.3,<=2.1.7" to avoid possible conflicts when other packages are installed alongside it or when downstream projects depend on download-tweets-ai-text-gen.

May I open a pull request to loosen the dependency on twint?

By the way, could you please tell us whether this kind of dependency analysis would help make dependencies easier to maintain during development?



We also give our detailed analysis as follows for your reference:

Your project download-tweets-ai-text-gen directly uses 3 APIs from package twint.

twint.run.Lookup, twint.run.Search, twint.config.Config.__init__

Starting from the 3 APIs above, 17 functions are then called indirectly, including 9 of twint's internal APIs and 8 external APIs. The call graph is listed below (some repeated function occurrences are omitted).

[/minimaxir/download-tweets-ai-text-gen]
+--twint.run.Lookup
|      +--logging.debug
|      +--asyncio.get_event_loop
|      +--twint.storage.db.Conn
|      |      +--twint.storage.db.init
|      |      |      +--sqlite3.connect
|      |      +--sys.exit
+--twint.run.Search
|      +--logging.debug
|      +--twint.run.run
|      |      +--logging.debug
|      |      +--asyncio.get_event_loop
|      |      +--twint.run.Twint.__init__
|      |      |      +--logging.debug
|      |      |      +--twint.run.Twint.get_resume
|      |      |      |      +--os.path.exists
|      |      |      +--twint.storage.db.Conn
|      |      |      +--twint.datelock.Set
|      |      |      |      +--logging.debug
|      |      |      |      +--twint.datelock.Datelock.__init__
|      |      |      |      +--datetime.datetime.strptime
|      |      |      |      +--datetime.datetime.today
|      |      |      |      +--twint.datelock.convertToDateTime
|      |      |      +--twint.verbose.Elastic
|      |      |      +--twint.output.clean_follow_list
|      |      |      |      +--logging.debug
|      |      |      +--datetime.timedelta
+--twint.config.Config.__init__

We scanned twint's versions and observed that, between 2.1.4 and each of the versions 2.1.3, 2.1.5, 2.1.6, and 2.1.7, the changed functions (diffs listed below) have no intersection with any function or API mentioned above (whether called directly or indirectly by this project).

diff: 2.1.4(original) 2.1.3
['twint.url._formatDate']

diff: 2.1.4(original) 2.1.5
['twint.storage.elasticsearch.createIndex']

diff: 2.1.4(original) 2.1.6
['twint.storage.elasticsearch.createIndex']

diff: 2.1.4(original) 2.1.7
['twint.user.media', 'twint.user.stat', 'twint.storage.elasticsearch.createIndex', 'twint.get.MobileRequest', 'twint.user.card', 'twint.output._output', 'twint.format.User', 'twint.tweet.Tweet', 'twint.run.Twint']
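
The intersection check described above can be sketched roughly as follows. This is only an illustration, not the tool we actually ran: the names `CALLED`, `CHANGED`, and `affected` are hypothetical, the function lists are copied by hand from the call graph and diffs in this issue, and matching is by exact qualified name.

```python
# Hypothetical sketch: twint functions reached by the project (from the call
# graph above) compared against the functions changed in each candidate version.
CALLED = {
    "twint.run.Lookup", "twint.run.Search", "twint.config.Config.__init__",
    "twint.storage.db.Conn", "twint.storage.db.init", "twint.run.run",
    "twint.run.Twint.__init__", "twint.run.Twint.get_resume",
    "twint.datelock.Set", "twint.datelock.Datelock.__init__",
    "twint.datelock.convertToDateTime", "twint.verbose.Elastic",
    "twint.output.clean_follow_list",
}

# Changed functions relative to 2.1.4, copied from the diff lists above.
CHANGED = {
    "2.1.3": {"twint.url._formatDate"},
    "2.1.5": {"twint.storage.elasticsearch.createIndex"},
    "2.1.6": {"twint.storage.elasticsearch.createIndex"},
    "2.1.7": {
        "twint.user.media", "twint.user.stat",
        "twint.storage.elasticsearch.createIndex", "twint.get.MobileRequest",
        "twint.user.card", "twint.output._output", "twint.format.User",
        "twint.tweet.Tweet", "twint.run.Twint",
    },
}


def affected(called, changed):
    """Map each version to the called functions that also appear in its diff."""
    return {version: sorted(called & funcs) for version, funcs in changed.items()}


if __name__ == "__main__":
    for version, hits in affected(CALLED, CHANGED).items():
        print(version, "overlap:", hits or "none")
```

Because the comparison is by exact name, the class-level entry `twint.run.Twint` in the 2.1.7 diff is treated as distinct from `twint.run.Twint.__init__` and `twint.run.Twint.get_resume` in the call graph. A prefix-based comparison would flag it for manual review.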

As for other packages, the call graph shows twint calling APIs from logging, asyncio, sys, sqlite3, datetime and os. Its dependencies on these packages stay the same across our suggested versions, so the change should not introduce conflicts with outside packages.

Therefore, we believe it is quite safe to loosen your dependency on twint from "twint==2.1.4" to "twint>=2.1.3,<=2.1.7". This will improve the applicability of download-tweets-ai-text-gen and reduce the likelihood of future dependency conflicts with other projects.
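
As a rough illustration of what the proposed range admits, here is a minimal sketch that checks a version string against "twint>=2.1.3,<=2.1.7". The helpers `parse_version` and `in_suggested_range` are hypothetical names, and the parsing assumes plain dotted numeric versions rather than full PEP 440 syntax.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '2.1.4' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def in_suggested_range(version, lower="2.1.3", upper="2.1.7"):
    """Return True if lower <= version <= upper (both bounds inclusive)."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


if __name__ == "__main__":
    for candidate in ["2.1.2", "2.1.3", "2.1.4", "2.1.7", "2.1.8"]:
        print(candidate, in_suggested_range(candidate))
```

For real projects, the third-party `packaging` library's `SpecifierSet` handles the complete version grammar, including pre-releases, which this sketch ignores.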