Closed mvanwyk closed 5 months ago
The recent changes across various pyretailscience
modules and related unit tests introduce substantial updates to linting configurations, docstrings, error messages, methods, and overall code simplification. The updates enhance code readability, maintainability, and functionality through improved documentation, added type hints, new methods, and refined class logic. The configuration file pyproject.toml
also received extensive modifications to incorporate updated linting and formatting standards.
File(s) | Change Summary |
---|---|
pyproject.toml |
Added py310 as target-version , configured various linting settings including per-file ignores. |
pyretailscience/cross_shop.py |
Added docstring, modified error message, simplified return statement, and updated font styling. |
pyretailscience/data/simulation.py |
Enhanced docstrings, refined methods, handled YAML errors, and adjusted transaction handling. |
pyretailscience/range_planning.py |
Added new method for range planning, updated existing methods and documentation. |
pyretailscience/segmentation.py |
Refined segmentation logic, handled special cases, improved plot method. |
pyretailscience/standard_graphs.py |
Updated import sources, added type hints, and improved function documentation. |
pyretailscience/style/graph_utils.py |
Added type hints, additional parameters, and updated function signatures and documentation. |
pyretailscience/style/tailwind.py |
Added Tailwind CSS palettes, type annotations, improved error messages, and code clarity. |
tests/data/test_contracts.py , tests/test_standard_graphs.py |
Refactored test functions with descriptive docstrings, handled dataset modifications. |
tests/test_range_planning.py |
Added docstrings to test methods, clarified test scenarios. |
pyretailscience/style/__init__.py |
Added a module-level docstring. |
(No sequence diagrams are generated as the changes are too varied and do not align with a single functional workflow visualization.)
In lines of code, new tales unfold,
With linting checks, so bright and bold.
Docstrings added, errors refined,
Improved logic, now well-defined.
Celebrate this coding spree,
As rabbits hop with glee! ๐โจ
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?
โฑ๏ธ Estimated effort to review: 3 ๐ต๐ต๐ตโชโช |
๐งช PR contains tests |
๐ No security concerns identified |
โก Key issues to review **Possible Bug:** The use of a magic number '3' for `three_circles` in the `plot` method of the `CrossShop` class could be replaced with a class-level constant to improve readability and maintainability. **Refactoring Suggestion:** The `plot` method of the `CrossShop` class and other similar methods across modules are quite lengthy and complex. Consider breaking these methods into smaller, more manageable functions. **Code Style:** Consistent use of single quotes for strings is recommended unless the string contains a single quote character, in which case double quotes are acceptable. |
Category | Suggestion | Score |
Possible bug |
Add a check to prevent division by zero in Yule's Q coefficient calculation___ **To avoid potential division by zero errors in the calculation of Yule's Q coefficient, adda check to ensure the denominator is not zero before performing the division.** [pyretailscience/range_planning.py [151]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-9fcac5ed4c2c220c9b8ff7475c0cdea83955988eddddb02326af469a9c347142R151-R151) ```diff -q = (a * d - b * c) / (a * d + b * c) +denominator = (a * d + b * c) +q = (a * d - b * c) / denominator if denominator != 0 else 0 ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 9Why: This suggestion addresses a potential bug that could cause runtime errors, making it highly important. | 9 |
Add error handling for missing keys in dictionary access___ **Add error handling for KeyError to enhance robustness when accessing dictionary keys.** [pyretailscience/segmentation.py [244-245]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-5570b06720ade08659a5656535c9534d6d7777e157fca158434fde509661c0bdR244-R245) ```diff -text = diagram.subset_label_artists[subset_id] +text = diagram.subset_label_artists.get(subset_id, None) +if text is None: + raise KeyError(f"Subset ID {subset_id} not found in diagram subset label artists.") ``` - [ ] **Apply this suggestion**Suggestion importance[1-10]: 9Why: Adding error handling for KeyError enhances the robustness of the code, preventing potential runtime errors when accessing dictionary keys. This is an important improvement for code reliability. | 9 | |
Add error handling for zero input to avoid infinite loops and division errors___ **Add error handling for the case wherenum is zero to avoid infinite loops and potential division by zero errors in the human_format function.**
[pyretailscience/style/graph_utils.py [52-54]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-8aae344a2d3da24678fdbc11edb2234ec53ccd0a8a33c8e656c72f642103d2e4R52-R54)
```diff
+if num == 0:
+ return f"{prefix}0"
while abs(num) >= minimum_magnitude_difference:
magnitude += 1
num /= minimum_magnitude_difference
```
- [ ] **Apply this suggestion**
Suggestion importance[1-10]: 9Why: Adding error handling for zero input is crucial to prevent potential infinite loops and division by zero errors, which are significant issues. | 9 | |
Ensure boolean check is robust against non-boolean returns___ **Ensure the 'validate' method returns a boolean to prevent potential runtime errors if itreturns non-boolean values.** [pyretailscience/segmentation.py [65-67]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-5570b06720ade08659a5656535c9534d6d7777e157fca158434fde509661c0bdR65-R67) ```diff -if contract.validate() is False: +if not contract.validate(): msg = f"The dataframe requires the columns {required_cols} and they must be non-null and unique." raise ValueError(msg) ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 7Why: This suggestion ensures that the boolean check is more robust, preventing potential runtime errors if the validate method returns non-boolean values. This is a minor but useful improvement. | 7 | |
Enhancement |
Improve string formatting for consistency and readability___ **Replace the string concatenation with f-string for consistency and improved readability.** [pyretailscience/segmentation.py [66]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-5570b06720ade08659a5656535c9534d6d7777e157fca158434fde509661c0bdR66-R66) ```diff -msg = "The dataframe requires the columns " + required_cols + " and they must be non-null and unique." +msg = f"The dataframe requires the columns {required_cols} and they must be non-null and unique." ``` - [ ] **Apply this suggestion**Suggestion importance[1-10]: 8Why: Using f-strings improves readability and consistency in the code. This is a minor enhancement but beneficial for maintaining code quality. | 8 |
Simplify type conversion to improve code readability and performance___ **Replace the manual check with a direct use of pandas' built-in functionality for betterperformance and readability.** [pyretailscience/segmentation.py [103-105]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-5570b06720ade08659a5656535c9534d6d7777e157fca158434fde509661c0bdR103-R105) ```diff -if isinstance(group_1_idx, list): - group_1_idx = pd.Series(group_1_idx) -if isinstance(group_2_idx, list): - group_2_idx = pd.Series(group_2_idx) +group_1_idx = pd.Series(group_1_idx) if isinstance(group_1_idx, list) else group_1_idx +group_2_idx = pd.Series(group_2_idx) if isinstance(group_2_idx, list) else group_2_idx ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 6Why: The suggested change simplifies the type conversion logic, making the code more readable and potentially improving performance. This is a minor enhancement. | 6 | |
Use a dictionary for suffix mapping to enhance readability and maintainability___ **Consider using a dictionary to map magnitudes to their suffixes instead of a list forbetter readability and maintainability in the human_format function.**
[pyretailscience/style/graph_utils.py [57]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-8aae344a2d3da24678fdbc11edb2234ec53ccd0a8a33c8e656c72f642103d2e4R57-R57)
```diff
-return f"{prefix}%.{decimals}f%s" % (num, ["", "K", "M", "B", "T", "P"][magnitude])
+suffixes = {0: "", 1: "K", 2: "M", 3: "B", 4: "T", 5: "P"}
+return f"{prefix}%.{decimals}f{suffixes[magnitude]}"
```
- [ ] **Apply this suggestion**
Suggestion importance[1-10]: 6Why: Using a dictionary for suffix mapping can enhance readability and maintainability, though the improvement is relatively minor compared to the existing list implementation. | 6 | |
Possible issue |
Add a check for non-empty input arrays to prevent errors in calculations___ **Consider adding a check for non-empty arrays before performing calculations in_calculate_yules_q . This will prevent potential errors or unexpected behavior when working with empty arrays.** [tests/test_range_planning.py [19-43]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-ffe4812fdf0aad94e4c1a666854d513699907c0261c323bd4862b5bff907a19cR19-R43) ```diff -assert rp.CustomerDecisionHierarchy._calculate_yules_q(bought_product_1, bought_product_2) == expected_q +if bought_product_1.size > 0 and bought_product_2.size > 0: + assert rp.CustomerDecisionHierarchy._calculate_yules_q(bought_product_1, bought_product_2) == expected_q +else: + raise ValueError("Input arrays should not be empty.") ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 8Why: This suggestion addresses a potential issue where calculations might be performed on empty arrays, which could lead to unexpected behavior or errors. Adding this check improves the robustness of the tests. | 8 |
Add a check to ensure the DataFrame is not empty before processing___ **Ensure that the DataFramedf is not empty before proceeding with the get_indexes function to avoid potential errors when the DataFrame is empty.** [tests/test_standard_graphs.py [18-65]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-97449a16541d2074ceca37aed2b30181bd9a0db61e3f50bfb15352117aae5e57R18-R65) ```diff -output = get_indexes( +if not df.empty: + output = get_indexes( +else: + raise ValueError("Input DataFrame should not be empty.") ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 8Why: This suggestion helps prevent potential errors when the DataFrame is empty, ensuring that the function `get_indexes` only processes non-empty DataFrames. This adds a layer of validation and robustness to the tests. | 8 | |
Add a check to ensure the dataset is not empty before testing___ **It's recommended to handle the case where the DataFramedataset might be empty to prevent errors during testing.** [tests/data/test_contracts.py [56-58]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-e5fec5aa696ec23c40292c824574fc054450ccad609749228c4c9674aad632c8R56-R58) ```diff -test_contract = contracts.ContractBase(dataset) +if not dataset.empty: + test_contract = contracts.ContractBase(dataset) +else: + raise ValueError("Dataset should not be empty.") ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 8Why: This suggestion addresses a potential issue where an empty dataset could cause errors during testing. Adding this check ensures that the tests are only run with valid, non-empty datasets, improving the reliability of the tests. | 8 | |
Best practice |
Add type annotations to improve clarity and consistency___ **Add type annotations for thenot_none function parameters to improve code clarity and consistency.** [pyretailscience/style/graph_utils.py [108]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-8aae344a2d3da24678fdbc11edb2234ec53ccd0a8a33c8e656c72f642103d2e4R108-R108) ```diff -def not_none(value1: any, value2: any) -> any: +def not_none(value1: Optional[Any], value2: Optional[Any]) -> Optional[Any]: ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 8Why: Adding type annotations improves code clarity and consistency, making it easier for developers to understand and maintain the code. | 8 |
Maintainability |
Use a method to determine default labels based on orientation to enhance readability___ **Replace the manual checks for orientation with a method that returns default labels basedon the orientation. This will improve code readability and maintainability.** [pyretailscience/range_planning.py [256-258]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-9fcac5ed4c2c220c9b8ff7475c0cdea83955988eddddb02326af469a9c347142R256-R258) ```diff -default_x_label, default_y_label = ( - ("Products", "Distance") if orientation in ["top", "bottom"] else ("Distance", "Products") -) +default_x_label, default_y_label = self.get_default_labels(orientation) ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 7Why: This change improves code readability and maintainability by reducing manual checks, but it is not critical. | 7 |
Replace a hard-coded value with a constant for better maintainability___ **Replace the hard-coded value1000.0 in the minimum_magnitude_difference variable with a constant defined at the module level to improve maintainability and readability.** [pyretailscience/style/graph_utils.py [49]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-8aae344a2d3da24678fdbc11edb2234ec53ccd0a8a33c8e656c72f642103d2e4R49-R49) ```diff -minimum_magnitude_difference = 1000.0 +MIN_MAGNITUDE_DIFF = 1000.0 +minimum_magnitude_difference = MIN_MAGNITUDE_DIFF ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 7Why: This suggestion improves maintainability and readability by replacing a hard-coded value with a constant, making the code easier to update and understand. | 7 | |
Encapsulate graph style settings into a method for consistent usage across different plots___ **To ensure that theGraphStyles class is used consistently, replace the direct use of GraphStyles properties with a method that encapsulates style settings. This method can be used to apply consistent styling across different plotting functions.** [pyretailscience/range_planning.py [262-264]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-9fcac5ed4c2c220c9b8ff7475c0cdea83955988eddddb02326af469a9c347142R262-R264) ```diff -fontproperties=GraphStyles.POPPINS_SEMI_BOLD, -fontsize=GraphStyles.DEFAULT_TITLE_FONT_SIZE, -pad=GraphStyles.DEFAULT_TITLE_PAD + 5, +**self.apply_graph_style('title') ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 6Why: This suggestion improves maintainability by centralizing style settings, but it is not crucial for functionality. | 6 | |
Centralize setting of tick properties into a method to improve code reuse and consistency___ **Instead of using a manual approach to set the tick label properties, define a method thatapplies these properties based on the orientation. This method can be reused in multiple places and ensures consistency.** [pyretailscience/range_planning.py [324-326]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-9fcac5ed4c2c220c9b8ff7475c0cdea83955988eddddb02326af469a9c347142R324-R326) ```diff -for tick in ax.get_xticklabels(): - tick.set_fontproperties(GraphStyles.POPPINS_REG) -for tick in ax.get_yticklabels(): - tick.set_fontproperties(GraphStyles.POPPINS_REG) +self.set_tick_properties(ax, orientation) ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 6Why: This suggestion enhances maintainability by centralizing repetitive code, but it is not essential for functionality. | 6 | |
Performance |
Use list comprehension to optimize the creation of the colors list___ **Optimize the creation of thecolors list by using list comprehension directly instead of nested loops.** [pyretailscience/style/tailwind.py [370-372]](https://github.com/Data-Simply/pyretailscience/pull/54/files#diff-2c47772736e3e8449b8c8828d98947bcf1c1335e85c4b08e9f345db9e88b0469R370-R372) ```diff -colors = [] -for color_number in color_numbers: - for color in color_order: - colors.append(COLORS[color][color_number]) +colors = [COLORS[color][color_number] for color_number in color_numbers for color in color_order] ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 7Why: This suggestion improves the performance and readability of the code by replacing nested loops with a list comprehension. While the improvement is minor, it makes the code cleaner and potentially faster. | 7 |
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/9780436243/job/27002093155) [โ] |
**Failure summary:**
The action failed because the ruff linter found several issues in the code:tests/test_gain_loss.py had multiple issues including missing docstrings, unnecessary assignments, and usage of magic values. tests/test_cross_shop.py had issues such as missing trailing commas, incorrect pytest fixture style, and unsorted imports. |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 505: [INFO] Once installed this environment will be reused. 506: [INFO] This may take a few minutes... 507: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 508: [INFO] Once installed this environment will be reused. 509: [INFO] This may take a few minutes... 510: [INFO] Installing environment for https://github.com/kynan/nbstripout. 511: [INFO] Once installed this environment will be reused. 512: [INFO] This may take a few minutes... 513: ruff.....................................................................Failed ... 566: tests/test_gain_loss.py:8:5: D103 Missing docstring in public function 567: tests/test_gain_loss.py:33:12: RET504 Unnecessary assignment to `df` before `return` statement 568: tests/test_gain_loss.py:36:5: D103 Missing docstring in public function 569: tests/test_gain_loss.py:40:48: PLR2004 Magic value used in comparison, consider replacing `2` with a constant variable 570: tests/test_gain_loss.py:70:5: D103 Missing docstring in public function 571: tests/test_gain_loss.py:74:48: PLR2004 Magic value used in comparison, consider replacing `2` with a constant variable 572: tests/test_gain_loss.py:108:5: D103 Missing docstring in public function 573: tests/test_gain_loss.py:123:39: PLR2004 Magic value used in comparison, consider replacing `2` with a constant variable 574: Fixed 61 errors: ... 586: - tests/test_cross_shop.py: 587: 5 ร COM812 (missing-trailing-comma) 588: 1 ร PT001 (pytest-fixture-incorrect-parentheses-style) 589: 1 ร I001 (unsorted-imports) 590: - tests/test_gain_loss.py: 591: 26 ร Q000 (bad-quotes-inline-string) 592: 9 ร COM812 (missing-trailing-comma) 593: 1 ร PT001 (pytest-fixture-incorrect-parentheses-style) 594: Found 118 errors (61 fixed, 57 remaining). ... 596: ruff-format..............................................................Passed 597: trim trailing whitespace.................................................Passed 598: fix end of files.........................................................Passed 599: fix python encoding pragma...............................................Passed 600: check yaml...............................................................Passed 601: debug statements (python)................................................Passed 602: pytest...................................................................Passed 603: nbstripout...............................................................Passed 604: ##[error]Process completed with exit code 1. ``` |
PR Type
Enhancement, Tests, Documentation
Description
Changes walkthrough ๐
7 files
cross_shop.py
Add docstrings and refactor CrossShop class
pyretailscience/cross_shop.py
simulation.py
Add docstrings and refactor Simulation module
pyretailscience/data/simulation.py
range_planning.py
Add docstrings and refactor RangePlanning module
pyretailscience/range_planning.py
segmentation.py
Add docstrings and refactor Segmentation module
pyretailscience/segmentation.py
standard_graphs.py
Add docstring and refactor Standard Graphs module
pyretailscience/standard_graphs.py
graph_utils.py
Add docstring and refactor Graph Utils module
pyretailscience/style/graph_utils.py
tailwind.py
Add docstring and refactor Tailwind module
pyretailscience/style/tailwind.py
1 files
__init__.py
Add docstring to style module
pyretailscience/style/__init__.py - Added module docstring.
3 files
test_contracts.py
Add docstrings and improve test coverage for contracts
tests/data/test_contracts.py - Added module and function docstrings. - Improved test coverage.
test_range_planning.py
Add docstrings and improve test coverage for range planning
tests/test_range_planning.py - Added module and function docstrings. - Improved test coverage.
test_standard_graphs.py
Add docstrings and improve test coverage for standard graphs
tests/test_standard_graphs.py - Added module and function docstrings. - Improved test coverage.
1 files
pyproject.toml
Add linting configuration to pyproject.toml
pyproject.toml - Added linting requirements and configuration for Ruff.
Summary by CodeRabbit
Documentation
CrossShop
,CustomerDecisionHierarchy
,TransactionGenerator
,Customer
,Simulation
, and various test files.Refactor
Style
Tests
Chores