Kanaries / pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
https://kanaries.net/pygwalker
Apache License 2.0
11.12k stars 579 forks source link

simplification of default parameter #401

Closed blondon1 closed 7 months ago

blondon1 commented 7 months ago

Simplification of default parameter to 'fieldSpecs', instead of setting 'fieldSpecs' to 'none' you can directly use a default empty dictionary if the function doesn't mutate the dictionary passed by the caller

jojocys commented 7 months ago

Hi @blondon1

Thanks for submitting this PR to improve the get_html_on_gradio function. I appreciate your efforts to enhance the codebase by adhering to Pythonic naming conventions and simplifying default parameter handling. Here are a few thoughts on your changes:

  1. Parameter Renaming: Renaming fieldSpecs to field_specs aligns well with the Python community's standard for snake_case naming. This is a welcome change for readability and consistency.

  2. Default Value Adjustment: Switching from None to an empty dictionary {} as a default value for field_specs eliminates the need for conditional logic later in the function. While this is generally a sleek approach, it's important to note that mutable default arguments can lead to potential bugs if the argument is modified within the function. If field_specs remains unmodified, this change is perfectly fine. If, however, it is modified, this could introduce state-related bugs. Can you confirm if field_specs is guaranteed to remain unmodified?

  3. Code Consistency: Since the parameter name has been changed, we need to ensure that all references to fieldSpecs within the function are updated accordingly. It seems there is still a reference to the old parameter name:

    field_specs=fieldSpecs if fieldSpecs is not None else {},

    This should be changed to:

    field_specs=field_specs,
  4. Documentation Updates: The docstring should reflect the new parameter name and its default behavior. The corrected docstring might read:

    - field_specs (Dict[str, FieldSpec], optional): Specifications of some fields. They'll be automatically inferred from `df` if some fields are not specified. Defaults to an empty dict.

I would suggest making these small but impactful updates to ensure the PR is in top shape. Let's make sure we also have suitable test cases to cover the new default behavior for field_specs.

Looking forward to your revisions, and I'm here if you have any questions or need further clarification!

longxiaofei commented 7 months ago

Hi @blondon1. Thanks again for your PR.

  1. using {} as default parameter is dangerous behavior, this will cause this parameter to be a globally shared dictionary.

  2. The parameter name fieldSpecs does not conform to Python code style, but this parameter has been used in earlier versions of pygwalker, so the current exposed APIs continue to use this parameter name.

Here are some TODO issues. If you're interested, you are welcome to contribute code.

https://github.com/orgs/Kanaries/projects/4

ObservedObserver commented 7 months ago

Considering the importance of maintaining backward compatibility and adhering to Pythonic naming conventions, here's an updated proposal that addresses the concerns and provides a clear deprecation path:

  1. Introduce New Parameter Names: We proceed with renaming fieldSpecs to field_specs and themeKey to theme_key. This aligns with Python's naming conventions and will improve code readability and maintainability.

  2. Maintain Backward Compatibility: For now, we keep the old parameter names operational but mark them as deprecated. When these parameters are used, we can issue a deprecation warning, informing users of the upcoming changes in future versions.

  3. Add Type Hint Comments: For the deprecated parameters, we'll add comments indicating that these are not recommended for use and will be removed in the next major release.

The function signature and body might look something like this:

from warnings import warn

def get_html_on_gradio(
    dataset: Union[DataFrame, Connector],
    gid: Union[int, str] = None,
    *,
    field_specs: Optional[Dict[str, FieldSpec]] = None,
    theme_key: Literal['vega', 'g2'] = 'g2',
    # ... other parameters ...
    # Deprecated parameters for backward compatibility
    fieldSpecs: Optional[Dict[str, FieldSpec]] = None,  # Deprecated: use field_specs
    themeKey: Literal['vega', 'g2'] = 'g2',  # Deprecated: use theme_key
    # ... additional deprecated parameters if any ...
) -> str:
    """
    ... existing docstring, updated to reflect new parameters and deprecation notes ...
    """
    if fieldSpecs is not None:
        warn("The 'fieldSpecs' parameter is deprecated; use 'field_specs' instead.", DeprecationWarning)
        field_specs = field_specs or fieldSpecs

    if themeKey != 'g2':
        warn("The 'themeKey' parameter is deprecated; use 'theme_key' instead.", DeprecationWarning)
        theme_key = theme_key or themeKey

    # ... rest of the function body ...

By issuing deprecation warnings, we gently nudge users toward the updated API while still supporting the old one until the next major version. This approach should allow for a smooth transition and give users ample time to update their code.

Since you're on the core team and familiar with the project's versioning and deprecation policies, could you guide the timeline and messaging for this deprecation? It would be helpful to include details in the release notes and documentation to ensure users are well informed.

Once again, thanks for your collaboration on refining the pygwalker codebase.