Closed Mrglglglglgl closed 1 month ago
[!IMPORTANT]
Review skipped
Review was skipped due to path filters
:no_entry: Files ignored due to path filters (2)
* `docs/assets/images/analysis_modules/plots/histogram_plot.svg` is excluded by `!**/*.svg` * `docs/assets/images/analysis_modules/plots/line_plot.svg` is excluded by `!**/*.svg`CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including
**/dist/**
will override the default block on thedist
directory, by removing the pattern from both the lists.You can disable this status message by setting the
reviews.review_status
tofalse
in the CodeRabbit configuration file.
The pull request introduces a "Histogram Plot" feature in the pyretailscience
library, enhancing both documentation and functionality for histogram visualization. It adds a dedicated module for histogram plotting, updates the documentation to include usage examples, and introduces unit tests to ensure robustness. Changes span multiple files, including documentation updates, new functions for histogram creation and customization, and improvements to existing plotting utilities.
Files | Change Summary |
---|---|
docs/analysis_modules.md |
New section on "Histogram Plot" added, including an overview, analysis types, and example usage. |
pyretailscience/plots/histogram.py |
New module for histogram plotting with functions for creating and customizing histograms. |
tests/plots/test_histogram.py |
New unit tests for histogram plotting functionality, covering various scenarios and edge cases. |
docs/analysis_modules.md
, which is directly related to the new section added in PR #69 that elaborates on the functionality of product association rules in retail analytics.docs/analysis_modules.md
, which aligns with the updates made in PR #80 that introduce new sections and examples for various analytical tools, including those relevant to retail analytics.π° In the garden of data, we hop with delight,
New histograms bloom, a colorful sight.
With axes so clear and legends that sing,
Our plots tell a tale of each little thing.
So gather your data, letβs plot and explore,
With rabbits and histograms, who could ask for more? π₯
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
β±οΈ Estimated effort to review: 4 π΅π΅π΅π΅βͺ |
π§ͺ PR contains tests |
π No security concerns identified |
β‘ Key issues to review Performance Concern The `apply_range_clipping` function may be inefficient for large datasets as it creates a new DataFrame for each column. Consider using in-place operations or vectorized methods for better performance. Error Handling The function doesn't handle the case where `value_col` is an empty list. This could lead to unexpected behavior or errors. Test Coverage The tests don't cover all edge cases, such as empty DataFrames or Series, or invalid input types for `value_col` and `group_col`. |
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
Explore these optional code suggestions:
Category | Suggestion | Score |
Enhancement |
Add option to normalize histogram for better distribution comparison___ **Consider adding an option to normalize the histogram. This can be useful whencomparing distributions with different sample sizes. You can implement this by adding a normalize parameter and using density=True in the histogram plot if normalization is requested.** [pyretailscience/plots/histogram.py [54-69]](https://github.com/Data-Simply/pyretailscience/pull/85/files#diff-4a692039d06850d25663d85a0b4364598e6f6bab7df1141397f0469052c3635dR54-R69) ```diff def plot( df: pd.DataFrame | pd.Series, value_col: str | list[str] | None = None, group_col: str | None = None, title: str | None = None, x_label: str | None = None, y_label: str | None = None, legend_title: str | None = None, ax: Axes | None = None, source_text: str | None = None, move_legend_outside: bool = False, range_lower: float | None = None, range_upper: float | None = None, range_method: Literal["clip", "fillna"] = "clip", + normalize: bool = False, **kwargs: dict[str, any], ) -> SubplotBase: + # ... (existing code) + kwargs['density'] = normalize + ax = _plot_histogram( + df=df, value_col=value_col, group_col=group_col, ax=ax, cmap=cmap, num_histograms=num_histograms, **kwargs + ) + + # ... (rest of the existing code) + ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 9Why: Normalizing histograms is a valuable feature when comparing distributions with different sample sizes. This suggestion significantly enhances the functionality by providing an option to normalize the data, making comparisons more meaningful. | 9 |
Simplify range clipping logic by using infinity values instead of None___ **Consider usingnp.inf instead of None for range_upper and -np.inf for range_lower in the apply_range_clipping function. This would simplify the logic and make it more consistent, as you wouldn't need to check for None values separately.**
[pyretailscience/plots/histogram.py [176-204]](https://github.com/Data-Simply/pyretailscience/pull/85/files#diff-4a692039d06850d25663d85a0b4364598e6f6bab7df1141397f0469052c3635dR176-R204)
```diff
-if range_lower is not None or range_upper is not None:
- if range_method == "clip":
- return df.assign(**{col: df[col].clip(lower=range_lower, upper=range_upper) for col in value_col})
- return df.assign(
- **{
- col: df[col].apply(
- lambda x: np.nan
- if (range_lower is not None and x < range_lower) or (range_upper is not None and x > range_upper)
- else x
- )
- for col in value_col
- }
- )
+range_lower = -np.inf if range_lower is None else range_lower
+range_upper = np.inf if range_upper is None else range_upper
+if range_method == "clip":
+ return df.assign(**{col: df[col].clip(lower=range_lower, upper=range_upper) for col in value_col})
+return df.assign(
+ **{
+ col: df[col].apply(
+ lambda x: np.nan if x < range_lower or x > range_upper else x
+ )
+ for col in value_col
+ }
+)
```
- [ ] **Apply this suggestion**
Suggestion importance[1-10]: 8Why: This suggestion simplifies the logic by using `np.inf` and `-np.inf` instead of checking for `None`, making the code cleaner and potentially reducing errors related to boundary conditions. | 8 | |
Implement data binning for more flexible and meaningful histogram creation___ **Consider usingpandas.cut() function to bin the data before plotting. This can be useful for creating more meaningful histograms, especially when dealing with continuous data or when you want to control the number of bins.** [pyretailscience/plots/histogram.py [228-266]](https://github.com/Data-Simply/pyretailscience/pull/85/files#diff-4a692039d06850d25663d85a0b4364598e6f6bab7df1141397f0469052c3635dR228-R266) ```diff def _plot_histogram( df: pd.DataFrame, value_col: list[str], group_col: str | None, ax: Axes | None, cmap: ListedColormap, num_histograms: int, + bins: int | list = 10, **kwargs: dict, ) -> Axes: add_legend = num_histograms > 1 if group_col is None: - return df[value_col].plot(kind="hist", ax=ax, alpha=0.5, legend=add_legend, color=cmap.colors[0], **kwargs) + for col in value_col: + df[f'{col}_binned'] = pd.cut(df[col], bins=bins) + return df[[f'{col}_binned' for col in value_col]].plot(kind="hist", ax=ax, alpha=0.5, legend=add_legend, color=cmap.colors[0], **kwargs) df_pivot = df.pivot(columns=group_col, values=value_col[0]) + for col in df_pivot.columns: + df_pivot[f'{col}_binned'] = pd.cut(df_pivot[col], bins=bins) # Plot all columns at once - return df_pivot.plot( + return df_pivot[[f'{col}_binned' for col in df_pivot.columns]].plot( kind="hist", ax=ax, alpha=0.5, legend=add_legend, color=cmap.colors[: len(df_pivot.columns)], # Use the appropriate number of colors **kwargs, ) ``` - [ ] **Apply this suggestion** Suggestion importance[1-10]: 7Why: Using `pandas.cut()` for binning can enhance the flexibility and interpretability of histograms, especially for continuous data. This suggestion adds value by allowing more control over the histogram bins. | 7 |
π‘ Need additional feedback ? start a PR chat
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/11066943589/job/30749236313) [β] |
**Failed test name:** ruff |
**Failure summary:**
The action failed because the ruff hook detected and fixed 21 errors related to missing trailing commas (COM812) in the following files: pyretailscience/plots/histogram.py : 5 errorstests/plots/test_histogram.py : 16 errorsThe ruff-format also failed as it reformatted 2 files. The process completed with exit code 1 due to these modifications. |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 465: [INFO] Once installed this environment will be reused. 466: [INFO] This may take a few minutes... 467: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 468: [INFO] Once installed this environment will be reused. 469: [INFO] This may take a few minutes... 470: [INFO] Installing environment for https://github.com/kynan/nbstripout. 471: [INFO] Once installed this environment will be reused. 472: [INFO] This may take a few minutes... 473: ruff.....................................................................Failed 474: - hook id: ruff 475: - files were modified by this hook 476: Fixed 21 errors: 477: - pyretailscience/plots/histogram.py: 478: 5 Γ COM812 (missing-trailing-comma) 479: - tests/plots/test_histogram.py: 480: 16 Γ COM812 (missing-trailing-comma) 481: Found 21 errors (21 fixed, 0 remaining). 482: ruff-format..............................................................Failed ... 486: 2 files reformatted, 25 files left unchanged 487: trim trailing whitespace.................................................Passed 488: fix end of files.........................................................Passed 489: fix python encoding pragma...............................................Passed 490: check yaml...............................................................Passed 491: debug statements (python)................................................Passed 492: pytest...................................................................Passed 493: nbstripout...............................................................Passed 494: ##[error]Process completed with exit code 1. ``` |
Attention: Patch coverage is 75.23810%
with 26 lines
in your changes missing coverage. Please review.
Files with missing lines | Patch % | Lines |
---|---|---|
pyretailscience/style/graph_utils.py | 36.66% | 18 Missing and 1 partial :warning: |
pyretailscience/plots/histogram.py | 87.93% | 3 Missing and 4 partials :warning: |
Files with missing lines | Coverage Ξ | |
---|---|---|
pyretailscience/plots/line.py | 95.65% <100.00%> (ΓΈ) |
|
pyretailscience/style/tailwind.py | 95.58% <100.00%> (+0.58%) |
:arrow_up: |
pyretailscience/plots/histogram.py | 87.93% <87.93%> (ΓΈ) |
|
pyretailscience/style/graph_utils.py | 67.70% <36.66%> (-14.39%) |
:arrow_down: |
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/11103261931/job/30844831400) [β] |
**Failed test name:** ruff-format |
**Failure summary:**
The action failed because the ruff-format check did not pass. Specifically:ruff-format . |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 467: [INFO] This may take a few minutes... 468: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 469: [INFO] Once installed this environment will be reused. 470: [INFO] This may take a few minutes... 471: [INFO] Installing environment for https://github.com/kynan/nbstripout. 472: [INFO] Once installed this environment will be reused. 473: [INFO] This may take a few minutes... 474: ruff.....................................................................Passed 475: ruff-format..............................................................Failed ... 479: 1 file reformatted, 26 files left unchanged 480: trim trailing whitespace.................................................Passed 481: fix end of files.........................................................Passed 482: fix python encoding pragma...............................................Passed 483: check yaml...............................................................Passed 484: debug statements (python)................................................Passed 485: pytest...................................................................Passed 486: nbstripout...............................................................Passed 487: ##[error]Process completed with exit code 1. ``` |
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/11128201459/job/30922371713) [β] |
**Failed test name:** test_plot_single_histogram_no_legend |
**Failure summary:**
The action failed due to multiple test failures:trim trailing whitespace pre-commit hook failed because it modified files to remove trailing whitespace. tests/plots/test_histogram.py failed due to AssertionError . The tests expected the plot function to be called with specific parameters, but the actual call included an unexpected alpha parameter and specific color values instead of a generic color placeholder.tests/test_standard_graphs.py failed with a TypeError because the get_base_cmap function was called with an unexpected keyword argument num_colors . |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 468: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 469: [INFO] Once installed this environment will be reused. 470: [INFO] This may take a few minutes... 471: [INFO] Installing environment for https://github.com/kynan/nbstripout. 472: [INFO] Once installed this environment will be reused. 473: [INFO] This may take a few minutes... 474: ruff.....................................................................Passed 475: ruff-format..............................................................Passed 476: trim trailing whitespace.................................................Failed ... 478: - exit code: 1 479: - files were modified by this hook 480: Fixing docs/assets/images/analysis_modules/plots/line_plot.svg 481: Fixing docs/assets/images/analysis_modules/plots/histogram_plot.svg 482: fix end of files.........................................................Passed 483: fix python encoding pragma...............................................Passed 484: check yaml...............................................................Passed 485: debug statements (python)................................................Passed 486: pytest...................................................................Failed ... 496: tests/plots/test_line.py ......... [ 16%] 497: tests/test_cross_shop.py ........ [ 21%] 498: tests/test_gain_loss.py ...................... [ 37%] 499: tests/test_options.py ...................... [ 54%] 500: tests/test_product_association.py ............... [ 64%] 501: tests/test_range_planning.py ........ [ 70%] 502: tests/test_segmentation.py ...................... [ 86%] 503: tests/test_standard_graphs.py .....FFFFFF....... [100%] 504: =================================== FAILURES =================================== 505: _____________________ test_plot_single_histogram_no_legend _____________________ 506: self = |
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/11128214192/job/30922469714) [β] |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 468: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 469: [INFO] Once installed this environment will be reused. 470: [INFO] This may take a few minutes... 471: [INFO] Installing environment for https://github.com/kynan/nbstripout. 472: [INFO] Once installed this environment will be reused. 473: [INFO] This may take a few minutes... 474: ruff.....................................................................Passed 475: ruff-format..............................................................Passed 476: trim trailing whitespace.................................................Failed ... 478: - exit code: 1 479: - files were modified by this hook 480: Fixing docs/assets/images/analysis_modules/plots/line_plot.svg 481: Fixing docs/assets/images/analysis_modules/plots/histogram_plot.svg 482: fix end of files.........................................................Passed 483: fix python encoding pragma...............................................Passed 484: check yaml...............................................................Passed 485: debug statements (python)................................................Passed 486: pytest...................................................................Failed ... 496: tests/plots/test_line.py ......... [ 16%] 497: tests/test_cross_shop.py ........ [ 21%] 498: tests/test_gain_loss.py ...................... [ 37%] 499: tests/test_options.py ...................... [ 54%] 500: tests/test_product_association.py ............... [ 64%] 501: tests/test_range_planning.py ........ [ 70%] 502: tests/test_segmentation.py ...................... [ 86%] 503: tests/test_standard_graphs.py .....FFFFFF....... [100%] 504: =================================== FAILURES =================================== 505: _____________________ test_plot_single_histogram_no_legend _____________________ 506: self = |
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/11140476098/job/30959319761) [β] |
**Failed test name:** test_plot_single_histogram_no_legend |
**Failure summary:**
The action failed due to multiple test failures:trim trailing whitespace pre-commit hook failed because it modified files to remove trailing whitespace. tests/plots/test_histogram.py failed due to an AssertionError . The tests expected the plot function to be called with specific parameters, but the actual call included an unexpected alpha parameter and specific color values.tests/test_standard_graphs.py failed with a TypeError because the get_base_cmap function was called with an unexpected keyword argument num_colors . |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 468: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 469: [INFO] Once installed this environment will be reused. 470: [INFO] This may take a few minutes... 471: [INFO] Installing environment for https://github.com/kynan/nbstripout. 472: [INFO] Once installed this environment will be reused. 473: [INFO] This may take a few minutes... 474: ruff.....................................................................Passed 475: ruff-format..............................................................Passed 476: trim trailing whitespace.................................................Failed ... 478: - exit code: 1 479: - files were modified by this hook 480: Fixing docs/assets/images/analysis_modules/plots/line_plot.svg 481: Fixing docs/assets/images/analysis_modules/plots/histogram_plot.svg 482: fix end of files.........................................................Passed 483: fix python encoding pragma...............................................Passed 484: check yaml...............................................................Passed 485: debug statements (python)................................................Passed 486: pytest...................................................................Failed ... 496: tests/plots/test_line.py ......... [ 16%] 497: tests/test_cross_shop.py ........ [ 21%] 498: tests/test_gain_loss.py ...................... [ 37%] 499: tests/test_options.py ...................... [ 54%] 500: tests/test_product_association.py ............... [ 64%] 501: tests/test_range_planning.py ........ [ 70%] 502: tests/test_segmentation.py ...................... [ 86%] 503: tests/test_standard_graphs.py .....FFFFFF....... [100%] 504: =================================== FAILURES =================================== 505: _____________________ test_plot_single_histogram_no_legend _____________________ 506: self = |
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/11150594039/job/30992086143) [β] |
**Failed test name:** ruff-format |
**Failure summary:**
The action failed due to the following reasons:ruff-format hook failed because it modified files during its execution. This indicates that the code did not meet the formatting standards enforced by the ruff-format hook.COM812 and ISC001 . It is recommended to disable these rules to avoid unexpected behavior.trim trailing whitespace hook also failed, suggesting that some files contained trailing whitespace that needed to be removed. |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 467: [INFO] This may take a few minutes... 468: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 469: [INFO] Once installed this environment will be reused. 470: [INFO] This may take a few minutes... 471: [INFO] Installing environment for https://github.com/kynan/nbstripout. 472: [INFO] Once installed this environment will be reused. 473: [INFO] This may take a few minutes... 474: ruff.....................................................................Passed 475: ruff-format..............................................................Failed 476: - hook id: ruff-format 477: - files were modified by this hook 478: warning: The following rules may cause conflicts when used with the formatter: `COM812`, `ISC001`. To avoid unexpected behavior, we recommend disabling these rules, either by removing them from the `select` or `extend-select` configuration, or adding them to the `ignore` configuration. 479: 1 file reformatted, 26 files left unchanged 480: trim trailing whitespace.................................................Failed ... 484: Fixing docs/assets/images/analysis_modules/plots/line_plot.svg 485: Fixing docs/assets/images/analysis_modules/plots/histogram_plot.svg 486: fix end of files.........................................................Passed 487: fix python encoding pragma...............................................Passed 488: check yaml...............................................................Passed 489: debug statements (python)................................................Passed 490: pytest...................................................................Passed 491: nbstripout...............................................................Passed 492: ##[error]Process completed with exit code 1. ``` |
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.
**Action:** Pre-Commit |
**Failed stage:** [Run Pre-commit](https://github.com/Data-Simply/pyretailscience/actions/runs/11158732636/job/31015645157) [β] |
**Failed test name:** trim trailing whitespace |
**Failure summary:**
The action failed because the check 'trim trailing whitespace' did not pass. This check is responsible for ensuring that there are no trailing whitespace characters in the files. The failure indicates that some files contained trailing whitespace that needed to be removed. |
Relevant error logs:```yaml 1: ##[group]Operating System 2: Ubuntu ... 468: [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks. 469: [INFO] Once installed this environment will be reused. 470: [INFO] This may take a few minutes... 471: [INFO] Installing environment for https://github.com/kynan/nbstripout. 472: [INFO] Once installed this environment will be reused. 473: [INFO] This may take a few minutes... 474: ruff.....................................................................Passed 475: ruff-format..............................................................Passed 476: trim trailing whitespace.................................................Failed ... 480: Fixing docs/assets/images/analysis_modules/plots/line_plot.svg 481: Fixing docs/assets/images/analysis_modules/plots/histogram_plot.svg 482: fix end of files.........................................................Passed 483: fix python encoding pragma...............................................Passed 484: check yaml...............................................................Passed 485: debug statements (python)................................................Passed 486: pytest...................................................................Passed 487: nbstripout...............................................................Passed 488: ##[error]Process completed with exit code 1. ``` |
@Mrglglglglgl lgtm. Feel free to merge once you've fixed the pre-commit issues
PR Type
Enhancement, Tests, Documentation
Description
Changes walkthrough π
histogram.py
Add histogram plotting functionality with customization options
pyretailscience/plots/histogram.py
Series.
categorical column.
options.
graph_utils.py
Enhance graph styling for better visualization
pyretailscience/style/graph_utils.py
test_histogram.py
Add tests for histogram plot functionality
tests/plots/test_histogram.py
analysis_modules.md
Document histogram plot usage and features
docs/analysis_modules.md
histogram.md
Add API documentation for histogram plot
docs/api/plots/histogram.md - Added API documentation for the histogram plot module.
mkdocs.yml
Update documentation navigation for histogram plot
mkdocs.yml
Summary by CodeRabbit
New Features
Documentation
Tests