Data-Simply / pyretailscience

pyretailscience - A data analysis and science toolkit for retail data

feat: added waterfall plot #63

Closed mvanwyk closed 4 months ago

mvanwyk commented 4 months ago

PR Type

Enhancement, Documentation


Description


Changes walkthrough 📝

Relevant files

Enhancement

standard_graphs.py: Add waterfall plot function and enhance index plot
pyretailscience/standard_graphs.py (+113/-9)
  • Added a new function waterfall_plot to generate waterfall charts.
  • Enhanced index_plot function to use gu.add_source_text and gu.standard_tick_styles.

graph_utils.py: Add padding options and bar label font size to GraphStyles
pyretailscience/style/graph_utils.py (+11/-3)
  • Added default bar label font size to GraphStyles.
  • Updated standard_graph_styles to include padding options for title and axis labels.

extra.css: Add CSS class for clearing floats
docs/stylesheets/extra.css (+6/-0)
  • Added a new CSS class `.clear` to clear floats.

Documentation

analysis_modules.md: Document waterfall plot function with example
docs/analysis_modules.md (+44/-1)
  • Added documentation for the new waterfall_plot function.
  • Included an example usage of the waterfall_plot function.

Configuration changes

mkdocs.yml: Update mkdocs configuration with new markdown extensions
mkdocs.yml (+11/-0)
  • Added new markdown extensions for better documentation formatting.


    coderabbitai[bot] commented 4 months ago

    Walkthrough

    The changes encompass enhancements in financial data visualization and documentation styling. Key updates include adding a comprehensive description and example of Waterfall plots in analysis_modules.md, introducing CSS rules for layout management, augmenting mkdocs.yml with markdown extensions, enabling branch coverage in tests, and extending standard_graphs.py with a new waterfall_plot function. Additionally, updates in graph_utils.py refine graph styles, while new test cases ensure functionality in test_standard_graphs.py.

    Changes

| Files | Change Summary |
| --- | --- |
| docs/analysis_modules.md | Added content explaining Waterfall plots and their applications in financial data visualization. |
| docs/stylesheets/extra.css | Added CSS rule for class .clear:after to ensure proper clearing of floated elements. |
| mkdocs.yml | Introduced several markdown extensions to enhance documentation capabilities and specified Google analytics. |
| pyproject.toml | Included --cov-branch option in pytest configuration for branch coverage reporting. |
| pyretailscience/standard_graphs.py | Enhanced index_plot function and added a new waterfall_plot function for generating Waterfall charts. |
| pyretailscience/style/graph_utils.py | Updated GraphStyles class with a default font size and added padding parameters to standard_graph_styles. |
| tests/test_standard_graphs.py | Introduced tests for the new waterfall_plot function, covering various scenarios and edge cases. |
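
For readers unfamiliar with waterfall charts, the core idea is that each bar starts where the running total stood before that bar's amount was applied. The snippet below is a minimal, standard-library sketch of that geometry only. The helper name `waterfall_bars` is hypothetical and is not taken from the PR, and the real `waterfall_plot` may compute its bars differently.

```python
from itertools import accumulate


def waterfall_bars(amounts: list[float]) -> list[tuple[float, float]]:
    """Return a (bottom, height) pair for each bar of a waterfall chart.

    Hypothetical helper for illustration; not part of pyretailscience.
    """
    running_totals = list(accumulate(amounts))
    # The first bar starts at zero; every later bar starts at the
    # running total reached by the bars before it.
    bottoms = [0.0, *running_totals[:-1]]
    return list(zip(bottoms, amounts))


bars = waterfall_bars([100, -20, 50, -40])
# The net change is the sum of all bar heights.
net_change = sum(height for _, height in bars)
print(bars, net_change)
```

Each pair could then be passed to a bar-drawing call as the bar's baseline and height; a net bar, if shown, would span from zero to `net_change`.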

    Poem

    Amidst the code where numbers play,
    A Waterfall Plot now holds sway.
    With bars and labels, clear and bright,
    Financial trends come to light.
    In tests and styles, improvements blend,
    Our graphs now shine from end to end.
    🐰✨📊


    codiumai-pr-agent-pro[bot] commented 4 months ago

PR Reviewer Guide 🔍

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Key issues to review

Error Handling
The `waterfall_plot` function could benefit from additional error handling for input validation, such as ensuring that `amounts` and `labels` are not empty.

Performance Concern
The use of `df.apply` in line 336 might lead to performance issues for large datasets. Consider using vectorized operations or other efficient methods.
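
The error-handling point above could look roughly like the following sketch. The function name `check_waterfall_inputs` is hypothetical, and the length-mismatch check is an extra assumption that goes beyond what the reviewer asked for.

```python
def check_waterfall_inputs(amounts: list[float], labels: list[str]) -> None:
    """Raise ValueError for inputs that a waterfall chart cannot draw.

    Hypothetical helper for illustration; not part of pyretailscience.
    """
    if not amounts or not labels:
        raise ValueError("'amounts' and 'labels' must not be empty.")
    # Assumption beyond the review comment: each amount needs one label.
    if len(amounts) != len(labels):
        raise ValueError(
            f"Got {len(amounts)} amounts but {len(labels)} labels; "
            "the two lists must have the same length."
        )


check_waterfall_inputs([100, -20], ["Start", "Returns"])  # passes silently
```

Failing early with a clear message is usually easier to debug than an error raised later from inside the plotting code.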
    codiumai-pr-agent-pro[bot] commented 4 months ago

PR Code Suggestions ✨

Category | Suggestion | Score
Possible bug

✅ Prevent division by zero in percentage calculation

**Consider using a more robust method for calculating percentage labels in the waterfall_plot to avoid division by zero errors when total_change is zero.**

[pyretailscience/standard_graphs.py [376]](https://github.com/Data-Simply/pyretailscience/pull/63/files#diff-916c262a44bdf2986e4e15b87f1be27cc247394b424104b94269e16edcc25b1cR376-R376)

```diff
-labels = df["amounts"].apply(lambda x: f"{x/total_change:.0%}")
+labels = df["amounts"].apply(lambda x: f"{x/total_change:.0%}" if total_change != 0 else "0%")
```

`[Suggestion has been applied]`

Suggestion importance[1-10]: 10
Why: This suggestion addresses a potential division by zero error, which is a critical bug that could cause the function to fail during execution.
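
The guard in this suggestion can be illustrated without pandas. The sketch below assumes `total_change` is the sum of the amounts, which the PR excerpt does not confirm, and the name `percent_labels` is hypothetical. It checks for a zero total once, before the loop, rather than once per element.

```python
def percent_labels(amounts: list[float]) -> list[str]:
    """Format each amount as a whole-number share of the total change.

    Hypothetical helper for illustration; assumes the total is sum(amounts).
    """
    total_change = sum(amounts)
    if total_change == 0:
        # No meaningful share exists, so fall back to "0%" for every bar.
        return ["0%" for _ in amounts]
    return [f"{amount / total_change:.0%}" for amount in amounts]


print(percent_labels([50, 30, 20]))
print(percent_labels([10, -10]))  # the total is zero, so the guard applies
```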
Add validation to ensure all elements in 'amounts' are numeric

**The waterfall_plot function should validate the amounts list to ensure it contains only numeric values to prevent runtime errors during plotting. This can be achieved by adding a check at the beginning of the function.**

[pyretailscience/standard_graphs.py [288]](https://github.com/Data-Simply/pyretailscience/pull/63/files#diff-916c262a44bdf2986e4e15b87f1be27cc247394b424104b94269e16edcc25b1cR288-R288)

```diff
 def waterfall_plot(
     amounts: list[float],
     labels: list[str],
     ...
+    if not all(isinstance(amount, (int, float)) for amount in amounts):
+        raise ValueError("All elements in 'amounts' must be numeric.")
```

Suggestion importance[1-10]: 9
Why: Adding validation for numeric values in the `amounts` list is crucial to prevent runtime errors, ensuring robustness and reliability of the function.
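
One caveat with the check as suggested: `bool` is a subclass of `int` in Python, so `True` and `False` would pass `isinstance(amount, (int, float))`. The sketch below is one possible stricter variant. The name `ensure_numeric` is hypothetical, and rejecting booleans is a design assumption, not something the PR specifies.

```python
from numbers import Real


def ensure_numeric(amounts: list) -> None:
    """Raise ValueError if any element is not a real number.

    Hypothetical helper for illustration; not part of pyretailscience.
    """
    for position, amount in enumerate(amounts):
        # bool is a subclass of int, so it has to be excluded explicitly.
        if isinstance(amount, bool) or not isinstance(amount, Real):
            raise ValueError(
                f"Element {position} of 'amounts' is not numeric: {amount!r}"
            )


ensure_numeric([100, -20.5, 0])  # passes silently
```

Reporting the position of the offending element makes the error easier to act on than a blanket message.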
Performance

Use vectorized operations for filtering to enhance performance

**To enhance performance, consider filtering out zero amounts using vectorized operations instead of applying a lambda function, which can be slower for large datasets.**

[pyretailscience/standard_graphs.py [331-332]](https://github.com/Data-Simply/pyretailscience/pull/63/files#diff-916c262a44bdf2986e4e15b87f1be27cc247394b424104b94269e16edcc25b1cR331-R332)

```diff
-df = df[df["amounts"] != 0]
+df = df[df.amounts.ne(0)]
```

Suggestion importance[1-10]: 7
Why: Using vectorized operations can improve performance, especially for large datasets, making the code more efficient.
Maintainability

Refactor the function into smaller, more manageable parts

**To improve readability and maintainability, consider refactoring the waterfall_plot function by splitting it into smaller functions, such as prepare_data, configure_plot, and add_labels.**

[pyretailscience/standard_graphs.py [288-396]](https://github.com/Data-Simply/pyretailscience/pull/63/files#diff-916c262a44bdf2986e4e15b87f1be27cc247394b424104b94269e16edcc25b1cR288-R396)

```diff
+def prepare_data(amounts, labels, remove_zero_amounts):
+    ...
+def configure_plot(df, display_net_bar, display_net_line, ax, **kwargs):
+    ...
+def add_labels(ax, data_label_format, df, decimals):
+    ...
 def waterfall_plot(
     amounts: list[float],
     labels: list[str],
     ...
+    df = prepare_data(amounts, labels, remove_zero_amounts)
+    ax = configure_plot(df, display_net_bar, display_net_line, ax, **kwargs)
+    add_labels(ax, data_label_format, df, decimals)
```

Suggestion importance[1-10]: 6
Why: Refactoring into smaller functions can enhance readability and maintainability, but it is not critical for functionality. It is a good practice for long-term code management.
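
To make the refactoring suggestion concrete, here is a sketch of what the first extracted step might look like. It keeps the `prepare_data(amounts, labels, remove_zero_amounts)` signature from the suggestion, but the body is an assumption: it returns plain `(label, amount)` pairs, whereas the actual function presumably works with a DataFrame.

```python
def prepare_data(
    amounts: list[float],
    labels: list[str],
    remove_zero_amounts: bool = True,
) -> list[tuple[str, float]]:
    """Pair labels with amounts, optionally dropping zero-valued entries.

    Illustrative sketch only; the body is assumed, not taken from the PR.
    """
    pairs = list(zip(labels, amounts))
    if remove_zero_amounts:
        # Zero amounts would draw as invisible bars, so they are skipped.
        pairs = [(label, amount) for label, amount in pairs if amount != 0]
    return pairs


print(prepare_data([100, 0, -30], ["Sales", "Adjustments", "Returns"]))
```

A step isolated like this can be unit-tested without creating a figure, which is the main practical benefit of the split.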
    codecov[bot] commented 4 months ago

    Codecov Report

    Attention: Patch coverage is 71.05263% with 11 lines in your changes missing coverage. Please review.

| Files | Coverage Δ |
| --- | --- |
| pyretailscience/style/graph_utils.py | 82.08% <100.00%> (ø) |
| pyretailscience/standard_graphs.py | 43.30% <70.27%> (ø) |
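
The reported percentages are mutually consistent, which the short check below illustrates. The line counts are back-solved from the figures above and are not reported by Codecov: 11 missing lines at 71.05263% implies 38 changed lines in total, and 70.27% with the same 11 missing lines implies 37 of them are in `standard_graphs.py`, leaving 1 fully covered line in `graph_utils.py`.

```python
def patch_coverage(covered: int, total: int) -> float:
    """Return patch coverage as a percentage of changed lines."""
    return 100 * covered / total


# Assumed counts, inferred from the report's percentages (see the lead-in).
standard_graphs = {"covered": 26, "total": 37}
graph_utils = {"covered": 1, "total": 1}

covered = standard_graphs["covered"] + graph_utils["covered"]
total = standard_graphs["total"] + graph_utils["total"]

print(f"standard_graphs.py: {patch_coverage(**standard_graphs):.2f}%")
print(f"graph_utils.py:     {patch_coverage(**graph_utils):.2f}%")
print(f"overall patch:      {patch_coverage(covered, total):.5f}%")
print(f"missing lines:      {total - covered}")
```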