Globalize logging and log everywhere

preritdas commented 3 months ago

Closes #57.

Why

This PR implements a global logging infrastructure to address the following issues:

Logging was previously built into the BigDBMClient class, making it difficult for other classes to log without a client instance.
Only the Processor class (a non-client class) could log, using self.client.logfire.log.
Classes like deliverers and analyzers that don't need a client instance couldn't log internally.

How

Created a new internal_logging.py module with a LoggingManager class and global logging functions.
Implemented a consistent pattern across base classes (BaseAnalyzer, BaseOutputDeliverer, BaseProcessor, BaseValidator):
- Public methods (e.g., analyze, deliver, process, validate) in base classes now include logging.
- Abstract private methods (e.g., _analyze, _deliver, _process, _validate) to be implemented by subclasses.
Updated all relevant subclasses to use the new structure.
Enhanced logging throughout the codebase, particularly in the delivery and analysis components.

Impact

Observations

More consistent and detailed logging across the entire project.
Improved ability to debug and monitor processes across all components.
Centralized control over logging behavior.

Client Code Changes

No changes are required in client code. However, users of the package now have more control over logging:

from bigdbm.internal_logging import enable_logging, disable_logging, log, log_span

# Enable or disable logging globally
enable_logging()
disable_logging()

# Use logging functions directly if needed
log('debug', 'This is a debug message')

# Use log_span for more detailed logging of operations
with log_span('Operation Name', _level='debug'):
    # Perform operation
    pass

Enabling and disabling logging via this package as seen above is a convenience. It should be done with logfire.configure(...) in client code instead, as you can pass in False to the console= parameter or custom ConsoleOptions at the global level.

enable_logging() and disable_logging() both do not call logfire.configure(...), they simply access the GLOBAL_CONFIG LogfireConfig instance of the logfire package and adjust the .console: ConsoleOptions | Literal[False] attribute.

Interacting with internal_logging

The internal_logging module provides the following key functions:

enable_logging(): Enables logging if Logfire is available.
disable_logging(): Disables logging.
log(level: str, message: str, **kwargs): Logs a message at the specified level.
log_span(name: str, **kwargs): Creates a logging span for a specific operation.

These functions can be used to control logging behavior and add custom logs when necessary.

Code Snippets

Before (in BigDBMClient):

self.logfire.log("trace", "Updated access token.")

After (using global logging):

from bigdbm.internal_logging import log

log("trace", "Updated access token.")

New Base Class Structure (e.g., BaseAnalyzer):

class BaseAnalyzer(ABC):
    def analyze(self, pii_md5s: list[MD5WithPII]) -> str:
        with log_span(f"Analyzing {len(pii_md5s)} MD5s with {self.__class__.__name__}", _level="debug"):
            log("debug", f"Starting analysis with {self.__class__.__name__}")
            result = self._analyze(pii_md5s)
            log("debug", f"Analysis completed with {self.__class__.__name__}")
            return result

    @abstractmethod
    def _analyze(self, pii_md5s: list[MD5WithPII]) -> str:
        pass

This new structure ensures consistent logging across all implementations while allowing subclasses to focus on their specific logic.

Summary by CodeRabbit

New Features
- Enhanced logging functionality added to various classes, improving traceability during the analysis and delivery processes.
- Introduction of abstract methods in several classes to better encapsulate processing and validation logic.
Bug Fixes
- Improved error handling and logging mechanisms throughout the codebase for better debugging and user feedback.
Refactor
- Method names changed to indicate internal usage, promoting better encapsulation and reducing external access.
- Simplified logging approach by consolidating to an internal logging system.
Documentation
- Updates made to test files to reflect changes in logging configuration and method visibility.
- Version constraint for the logfire package updated to ensure compatibility with newer features.

coderabbitai[bot] commented 3 months ago

Walkthrough

The recent changes introduce a global logging infrastructure, enhancing logging capabilities across various classes without requiring each class to manage logging individually. The BaseAnalyzer, BaseOutputDeliverer, and other classes have been updated to utilize this new logging system. Additionally, several methods have been renamed to indicate internal use, and abstract methods have been introduced for better encapsulation of core logic.

Changes

Files	Change Summary
`bigdbm/analyze/base.py`, `bigdbm/analyze/insights/__init__.py`, `bigdbm/deliver/base.py`, `bigdbm/deliver/csv.py`, `bigdbm/deliver/followupboss/ai_fields.py`, `bigdbm/deliver/followupboss/vanilla.py`, `bigdbm/process/base.py`, `bigdbm/process/fill.py`, `bigdbm/process/simple.py`, `bigdbm/validate/base.py`, `bigdbm/validate/email.py`, `bigdbm/validate/home_attrs.py`, `bigdbm/validate/phone.py`, `bigdbm/validate/pii.py`, `bigdbm/validate/simple.py`	Enhanced logging functionality and method renaming to indicate internal use across multiple classes. New abstract methods introduced for core logic encapsulation.
`bigdbm/client.py`	Removed the `logfire` logging library and replaced it with an internal logging system.
`bigdbm/internal_logging.py`	Introduced a new global logging infrastructure that handles logging conditionally based on the presence of the `logfire` library.
`requirements.txt`	Updated the minimum version constraint for the `logfire` package from `0.50.1` to `0.51.0`.
`tests/test_analyze.py`, `tests/test_client.py`	Updated test method names to reflect changes in method visibility and modified instantiation of the `BigDBMClient`.

Sequence Diagram(s)

sequenceDiagram
    participant Logger
    participant Client
    participant Analyzer
    participant Deliverer
    participant Processor

    Client->>Logger: Enable logging
    Client->>Analyzer: Call analyze()
    Analyzer->>Logger: Log start of analysis
    Analyzer->>Deliverer: Call deliver()
    Deliverer->>Logger: Log delivery process
    Deliverer->>Processor: Call process()
    Processor->>Logger: Log processing details
    Logger->>Client: Log output

Assessment against linked issues

Objective	Addressed	Explanation
Global logging infrastructure (#57)	✅
Simplified logging management in classes	✅
Elimination of client dependency for logging	✅

Poem

🐰 In the burrow where logs now flow,
A rabbit hops with joy, you know!
With each new change, our paths align,
No more tangled logs, all is fine.
So let us dance in the meadow bright,
For clean code brings us pure delight! 🌼

Tips

### Chat There are 3 ways to chat with [CodeRabbit](https://coderabbit.ai): - Review comments: Directly reply to a review comment made by CodeRabbit. Example: - `I pushed a fix in commit .` - `Generate unit testing code for this file.` - `Open a follow-up GitHub issue for this discussion.` - Files and specific lines of code (under the "Files changed" tab): Tag `@coderabbitai` in a new review comment at the desired location with your query. Examples: - `@coderabbitai generate unit testing code for this file.` - `@coderabbitai modularize this function.` - PR comments: Tag `@coderabbitai` in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples: - `@coderabbitai generate interesting stats about this repository and render them as a table.` - `@coderabbitai show all the console.log statements in this repository.` - `@coderabbitai read src/utils.ts and generate unit testing code.` - `@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.` - `@coderabbitai help me debug CodeRabbit configuration file.` Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. ### CodeRabbit Commands (invoked as PR comments) - `@coderabbitai pause` to pause the reviews on a PR. - `@coderabbitai resume` to resume the paused reviews. - `@coderabbitai review` to trigger an incremental review. This is useful when automatic reviews are disabled for the repository. - `@coderabbitai full review` to do a full review from scratch and review all the files again. - `@coderabbitai summary` to regenerate the summary of the PR. - `@coderabbitai resolve` resolve all the CodeRabbit review comments. - `@coderabbitai configuration` to show the current CodeRabbit configuration for the repository. - `@coderabbitai help` to get help. Additionally, you can add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed. ### CodeRabbit Configuration File (`.coderabbit.yaml`) - You can programmatically configure CodeRabbit by adding a `.coderabbit.yaml` file to the root of your repository. - Please see the [configuration documentation](https://docs.coderabbit.ai/guides/configure-coderabbit) for more information. - If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: `# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json` ### Documentation and Community - Visit our [Documentation](https://coderabbit.ai/docs) for detailed information on how to use CodeRabbit. - Join our [Discord Community](https://discord.com/invite/GsXnASn26c) to get help, request features, and share feedback. - Follow us on [X/Twitter](https://twitter.com/coderabbitai) for updates and announcements.

preritdas commented 3 months ago

@coderabbitai full review

coderabbitai[bot] commented 3 months ago

Actions performed

Full review triggered.

preritdas commented 3 months ago

@preritdas Review all changed files again before merging.

preritdas commented 3 months ago

@coderabbitai In the AI Follow Up Boss deliverer, if adding a person fails due to a custom lead having the wrong type for example, what happens? Does the entire delivery process for all leads fall back to the non-AI implementation? Does it fail for only that one person? Does it fail for only that custom field?

coderabbitai[bot] commented 3 months ago

[!TIP] For best results, initiate chat on the files or code changes.

@preritdas The _deliver method in the AI Follow Up Boss deliverer is designed to handle errors gracefully. If adding a person fails due to a custom lead having the wrong type (or any other error during the AI-based delivery process), the method falls back to the parent class's delivery method, which does not use AI fields. This fallback applies to the entire delivery process for all leads, meaning that if the AI-based delivery fails, the method will attempt to deliver all leads using the non-AI implementation.

Let me know if you need further clarification!

Analysis chain

---

Scripts executed

The following scripts were executed for the analysis: Script: ```shell #!/bin/bash # Search for the `_deliver` method in the `bigdbm/deliver/followupboss/ai_fields.py` file to analyze its error handling logic. ast-grep --lang python --pattern 'def _deliver($_, $_):' --context 10 ``` Length of output: 71 --- Script: ```shell #!/bin/bash # Refine the search for the `_deliver` method in the `bigdbm/deliver/followupboss/ai_fields.py` file to analyze its error handling logic. rg 'def _deliver' -A 20 bigdbm/deliver/followupboss/ai_fields.py ``` Length of output: 939

preritdas commented 3 months ago

Tip

For best results, initiate chat on the files or code changes.

@preritdas The _deliver method in the AI Follow Up Boss deliverer is designed to handle errors gracefully. If adding a person fails due to a custom lead having the wrong type (or any other error during the AI-based delivery process), the method falls back to the parent class's delivery method, which does not use AI fields. This fallback applies to the entire delivery process for all leads, meaning that if the AI-based delivery fails, the method will attempt to deliver all leads using the non-AI implementation.

Let me know if you need further clarification!

Analysis chain Scripts executed

Can you suggest a commitable change where we'd fallback only for one person if that person fails as opposed to the entire delivery of all people/leads?

Standard-Labs / real-intent