EducationalTestingService / skll

SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.
http://skll.readthedocs.org

Convert modules into sub-packages #601

desilinguist closed this 4 years ago

desilinguist commented 4 years ago

This PR closes #600.

The various SKLL modules (learner.py, experiments.py, config.py, etc.) were getting very long as single files and had become unwieldy to extend. This PR addresses the issue by converting these modules into sub-packages, with specific functions moved into separate modules under each sub-package.

Here's what the SKLL code tree looks like today:

.
├── __init__.py
├── config.py
├── data
│   ├── __init__.py
│   ├── dict_vectorizer.py
│   ├── featureset.py
│   ├── readers.py
│   ├── writers.py
├── experiments.py
├── learner.py
├── logutils.py
├── metrics.py
├── utilities
│   ├── __init__.py
│   ├── compute_eval_from_predictions.py
│   ├── filter_features.py
│   ├── generate_predictions.py
│   ├── join_features.py
│   ├── plot_learning_curves.py
│   ├── print_model_weights.py
│   ├── run_experiment.py
│   ├── skll_convert.py
│   └── summarize_results.py
└── version.py

13 directories, 84 files

With this PR, it will look like:

.
├── __init__.py
├── config
│   ├── __init__.py
│   └── utils.py
├── data
│   ├── __init__.py
│   ├── dict_vectorizer.py
│   ├── featureset.py
│   ├── readers.py
│   ├── writers.py
├── experiments
│   ├── __init__.py
│   ├── input.py
│   ├── output.py
│   └── utils.py
├── learner
│   ├── __init__.py
│   └── utils.py
├── metrics.py
├── utils
│   ├── __init__.py
│   ├── commandline
│   │   ├── __init__.py
│   │   ├── compute_eval_from_predictions.py
│   │   ├── filter_features.py
│   │   ├── generate_predictions.py
│   │   ├── join_features.py
│   │   ├── plot_learning_curves.py
│   │   ├── print_model_weights.py
│   │   ├── run_experiment.py
│   │   ├── skll_convert.py
│   │   └── summarize_results.py
│   ├── constants.py
│   └── logging.py
└── version.py

12 directories, 91 files

In addition, some functions that were previously marked as private (with a leading underscore) are now public, since they can be quite useful as part of the API. Examples include experiments.input.load_featureset() and experiments.output.generate_learning_curve_plots().
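The general pattern of turning a module into a sub-package and exposing a formerly private helper can be sketched as below. This is only an illustration under stated assumptions: the package name `demo_experiments` and the function `load_items` are hypothetical stand-ins, not SKLL code, and the re-export in `__init__.py` is one common convention rather than a statement of what this PR does.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a tiny hypothetical package that mimics a module-to-package split."""
    pkg = root / "demo_experiments"  # hypothetical name, not part of SKLL
    pkg.mkdir()
    # input.py holds a helper that would previously have been private
    # (e.g. ``_load_items`` inside a single long module).
    (pkg / "input.py").write_text(textwrap.dedent('''
        def load_items(names):
            """Return the given names sorted; stands in for a real loader."""
            return sorted(names)
    '''))
    # Assumed convention: __init__.py re-exports the helper so that
    # package-level imports keep working after the split.
    (pkg / "__init__.py").write_text("from .input import load_items\n")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        pkg = importlib.import_module("demo_experiments")
        sub = importlib.import_module("demo_experiments.input")
        # Both import paths resolve to the same function object.
        same_object = pkg.load_items is sub.load_items
        result = pkg.load_items(["b", "a"])
    finally:
        sys.path.remove(tmp)
```

With a layout like this, callers can import the helper either from the sub-module or from the package itself, which keeps existing package-level imports stable while the implementation moves into smaller files.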

Specifically, this PR:

pep8speaks commented 4 years ago

Hello @desilinguist! Thanks for updating this PR.

Line 41:101: E501 line too long (113 > 100 characters)

Line 769:101: E501 line too long (106 > 100 characters)

Line 1115:101: E501 line too long (114 > 100 characters)

Line 1117:101: E501 line too long (114 > 100 characters)

Line 1119:101: E501 line too long (117 > 100 characters)

Line 1165:101: E501 line too long (114 > 100 characters)

Line 1167:101: E501 line too long (114 > 100 characters)

Line 1169:101: E501 line too long (117 > 100 characters)

Comment last updated at 2020-04-11 14:00:59 UTC

codecov[bot] commented 4 years ago

Codecov Report

Merging #601 into master will increase coverage by 0.10%. The diff coverage is 96.87%.

@@            Coverage Diff             @@
##           master     #601      +/-   ##
==========================================
+ Coverage   95.06%   95.16%   +0.10%     
==========================================
  Files          20       26       +6     
  Lines        2977     3021      +44     
==========================================
+ Hits         2830     2875      +45     
+ Misses        147      146       -1     

| Impacted Files | Coverage Δ |
|---|---|
| skll/metrics.py | 96.87% <ø> (-0.27%) :arrow_down: |
| ...utils/commandline/compute_eval_from_predictions.py | 97.18% <ø> (ø) |
| skll/utils/commandline/filter_features.py | 98.41% <ø> (ø) |
| skll/utils/commandline/generate_predictions.py | 98.59% <ø> (ø) |
| skll/utils/commandline/join_features.py | 98.14% <ø> (ø) |
| skll/utils/commandline/print_model_weights.py | 94.91% <ø> (ø) |
| skll/utils/commandline/run_experiment.py | 96.77% <ø> (ø) |
| skll/experiments/utils.py | 93.51% <93.51%> (ø) |
| skll/config/utils.py | 96.00% <96.00%> (ø) |
| skll/learner/utils.py | 96.31% <96.31%> (ø) |

... and 16 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 1d64e25...996d44f.