TakeLab / podium

Podium: a framework agnostic Python NLP library for data loading and preprocessing
http://takelab.fer.hr/podium
BSD 3-Clause "New" or "Revised" License

Removal of the "low interest" modules/packages #196

Closed mariosasko closed 3 years ago

mariosasko commented 4 years ago

IMO, there are some modules/packages that don't add any value to the project or are of low interest for English-speaking users. By removing them, the codebase gets cleaner and we no longer have to maintain such modules/packages. If we decide to keep them, our project will not have a clear direction. Consequently, it could affect the number of users in the long run.

The removal of the module/package is a 6-step process:

  1. See if the module/package is used in the project examples. This requires further discussion.
  2. Remove the module (the .py file) or the package directory.
  3. Check the __init__.py in the parent directory and delete the related entries.
  4. Remove the corresponding test file/directory if there is one.
  5. Optionally, remove the doc entry if there is one.
  6. See if the removed module/package requires a specific dependency. If it does, remove the dependency from setup.py.
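
Steps 1 and 3 above amount to searching for imports of the module being removed. A minimal sketch of such a check follows, assuming the examples and `__init__.py` files are plain `.py` files; the function name `find_references` is hypothetical and not part of podium. It only catches `import a.b.c` and `from a.b.c import ...` forms, so relative imports and `from a.b import c` still need a manual look.

```python
import re
from pathlib import Path


def find_references(root, dotted_name):
    """Return (path, line_number, line) for each line that imports dotted_name.

    Hypothetical helper: scans every .py file under root for
    `import dotted_name` or `from dotted_name import ...`.
    """
    # Require a dot, whitespace, comma, or end of line after the name,
    # so that `pkg.mod` does not match `pkg.mod_extra`.
    pattern = re.compile(
        r"^\s*(?:from|import)\s+" + re.escape(dotted_name) + r"(?:[.\s,]|$)"
    )
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.search(line):
                hits.append((path, number, line.strip()))
    return hits
```

For example, `find_references("examples", "podium.preproc.yake")` would list the example files that still import that module, assuming the module's dotted path follows the directory layout mentioned in this thread.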

I'll edit this post to discuss and stage the module/package removal.

Feel free to comment.

cc @mttk @FilipBolt @ivansmokovic

mttk commented 4 years ago

Absolutely agreed. Removal of WIP modules is also something that can increase cleanliness, not only for English-speaking users.

mariosasko commented 4 years ago

I propose removal of the following packages/modules:

Please note that I am not claiming that the packages/modules listed above need to be removed completely. If they are not hosted on any other platform, we can create a separate repo to hold them there.

mariosasko commented 4 years ago

One more thing. If we remove preproc/stemmer/*, we no longer need zip_safe=False in setup.py, because no files holding data will be left in the package. Having zip_safe set to True is preferred whenever possible. This stemmer (Python source) is already published online.
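
A rough way to confirm that no data files remain after the removal is to list everything in the package directory that is not Python source. This is only a sketch under that assumption: `find_data_files` is a hypothetical helper, and an empty result is a hint rather than proof that zip_safe can be switched, since code that reads files through `__file__` would matter too.

```python
from pathlib import Path


def find_data_files(package_dir):
    """Return files under package_dir that are not Python sources.

    Hypothetical helper: treats anything other than .py/.pyc files
    (and anything inside __pycache__) as a data file.
    """
    data_files = []
    for path in sorted(Path(package_dir).rglob("*")):
        if not path.is_file():
            continue  # skip directories
        if "__pycache__" in path.parts:
            continue  # skip bytecode caches
        if path.suffix in {".py", ".pyc"}:
            continue  # skip Python sources and stray bytecode
        data_files.append(path)
    return data_files
```

Running it on the package directory before and after deleting preproc/stemmer/* shows whether the stemmer's files were the only non-source files left.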

mariosasko commented 4 years ago

One more reason to remove preproc/yake.py: right now, PyPI doesn't support links to online repos in the list of required packages in setup.py. The yake library relies on this, but we don't really need this component (it doesn't interact with podium at all).

FilipBolt commented 3 years ago

Adding a few things here. Since some downstream dependencies already depend on these libraries (and more will depend on them soon), instead of complete removal let's try to move most of them to a new repository (something like podium-models).

I propose that only podium/metrics/metrics.py is removed. I propose moving the following modules to the other (non-core) repository:

Migrating these modules to a separate repository should also allow migrating the classes SCPDownloader (podium/storage/resources/downloader.py) and SCPLargeResource (podium/storage/resources/large_resource.py), which should significantly simplify the downloading and large resource modules.

mttk commented 3 years ago

More or less done IMO with the recent private transfer & changes.