collective / icalendar

icalendar parser library for Python
https://icalendar.readthedocs.io
Other
1k stars 172 forks source link

[Python, Beginner] Add `__all__` attributes to modules. #638

Closed niccokunzmann closed 4 months ago

niccokunzmann commented 5 months ago

The __all__ variable is used to show which attributes of a module are public. Please list all of the attributes that are imported in icalendar/__init__.py in their modules with the __all__ = [] module variable.

See also:

mohdyawars commented 5 months ago

@niccokunzmann Can I work on this issue?

niccokunzmann commented 5 months ago

@mohdyawars Yes please! Let us know if you need any help with it. You can create a pull request as soon as you have a change :) Thanks!

niccokunzmann commented 5 months ago

I am re-opening the issue because the description is not fully clear. I would like all modules in the icalendar package to have the __all__ variable for those functions that are intended to be shared as a public API.

devdanzin commented 5 months ago

Hi, I'd like to help a friend solve this issue.

How are we to know which functions/classes/constants are intended to be shared as a public API? From their names, where no leading underscore means public? From the docs?

niccokunzmann commented 5 months ago

@devdanzin Thanks for asking! This file here provides a good overview over what we want to share: https://github.com/collective/icalendar/blob/main/src/icalendar/__init__.py Generally, we want to share a lot in the public API.

Some functions that look internal or constants/variables that are just used inside - if you find them - you can ask or make a judgement.

Example of what is mostly used internally: https://github.com/collective/icalendar/blob/470485b1a0dd3a8f54f34eaed58e0553a7cfdfd2/src/icalendar/cal.py#L62

I am not totally clear on this at the moment but any help in creating a conversation around this and bringing clarity is much appreciated. It could also be that we might have forgotten the one or the other variable.

stevepiercy commented 4 months ago

How are we to know which functions/classes/constants are intended to be shared as a public API? From their names, where no leading underscore means public? From the docs?

If an object has a leading underscore in its name, then it is private. If it does not have a leading underscore, then it is probably public but possibly private. It's a matter of understanding its context.

Documentation and docstrings should not be used as absolute truth to make a decision, but it may help you make a better informed decision. Ultimately the code's functionality is the source of truth.


For how to document public members, I'd suggest the following.

We currently follow default conventions of Sphinx's autodoc extension for members.

By default, autodoc will not generate document for the members that are private, not having docstrings, inherited from super class, or special members. ... For modules, all will be respected when looking for members <snip>

There may be public members without docstrings. Sphinx will respect __all__ for modules, but that's not the primary purpose of using __all__, it's just a nice side effect. For documentation purposes of these members, we should still add a docstring.

If a variable is public, use autodata.

niccokunzmann commented 4 months ago

@stevepiercy That concerns documentation and I think, it is worth opening another issue to make sure that the documentation is checked. That could be a good first issue, too. To fix this one, and for the lovely first-time contributors we have, it is just nice to get the __all__ attribute into all of the modules. Thanks for your enthusiasm @devdanzin @c0d33ngr !

devdanzin commented 4 months ago

I am not totally clear on this at the moment but any help in creating a conversation around this and bringing clarity is much appreciated. It could also be that we might have forgotten the one or the other variable.

I've created a little script to help writing __all__ entries. It's not perfect and, given that there aren't too many entries to begin with, not too helpful. But it's an opportunity to show my friend that we can automate manual tasks like finding the entries and writing them in __all__.

Because he's a beginner, the code is as simple as possible and has a lot of comments. It is slightly configurable and you can tell us how you'd like the results the best. We will study the script, run it, check the results are correct and hopefully submit a PR.

"""
Help fixing https://github.com/collective/icalendar/issues/638 (Add __all__ attributes to modules).

This script will create preliminary lists of entries to include in __all__ for
each module in icalendar. These lists should be checked against source code,
for three reasons:
    - they might include entries that shouldn't be in __all__, but were
      imported in a line that didn't contain "import" (this happens, for
      example, in prop.py);
    - they might include entries that aren't meant to be in __all__
      because they are only for internal use;
    - they might not include entries that were imported, but should still
      be featured in __all__ (this happens, for example, in __init__.py).

To correctly run this script, you should install icalendar from the clone
of your fork (be sure it is up-to-date), because some modules might have
had __all__ added to them since the last release.
"""

import importlib  # Used to import modules by name
import pprint  # Used to format the __all__ list
from pathlib import Path  # Used to find files and read their contents

import icalendar  # The package we will analyze

# Configure whether __all__ will contain one entry per line or many entries per line
MANY_ENTRIES_PER_LINE = True
# Number of spaces to indent the __all__ entries with. Use 10 to keep the same style as __init__.py
INDENT = 11
# Whether to use double or single quotes, set to False to keep the same style as __init__.py
DOUBLE_QUOTES = True

# Get the path to the icalendar directory
icalendar_path = Path(icalendar.__file__).parent.absolute()

# Iterate through Python files in icalendar directory
for file in sorted(icalendar_path.iterdir()):
    if file.name.endswith('.py'):
        # Figure out module name
        module_name = file.name[:-3]
        # Import that module
        module = importlib.import_module(f"icalendar.{module_name}")
        # Check if module already has __all__ and skip it if it has
        if hasattr(module, "__all__"):
            print(f"Module icalendar.{module_name} already has __all__ attribute")
            continue
        # Collect candidates for inclusion in __all__
        candidates = []
        for attr in sorted(dir(module)):
            # Only add candidates if their name doesn't start with an underscore
            if not attr.startswith('_'):
                candidates.append(attr)
        # Remove imported names from candidate list
        source = file.read_text().splitlines()
        for line in source:
            if "import" in line:
                # Line has "import", remove "import", "from" and commas from it
                line = line.replace("import ", "").replace("from", "")
                line = line.replace(",", "")
                # Create a list from the remaining words in the line
                names = line.split()
                # Remove each name found in an import line if it's in the candidates list
                for name in names:
                    if name in candidates:
                        candidates.remove(name)
        print(f"\n# __all__ for module {file.name}:")
        # Generate a formatted list of candidates, eithe with one or many entries per line
        formatted_candidates = pprint.pformat(candidates, compact=MANY_ENTRIES_PER_LINE, indent=INDENT)
        if DOUBLE_QUOTES:
            # Format to use double quotes instead of single ones
            formatted_candidates = formatted_candidates.replace("'", '"')
        # Fix first line of the __all__ list, because there is an indent in it too that shouldn't be there
        formatted_candidates = formatted_candidates.replace("[          ", "[")
        print(f"__all__ = {formatted_candidates}")

Example output:

[...]
# __all__ for module cal.py:
__all__ = ["Alarm", "Calendar", "Component", "ComponentFactory", "Event",
           "FreeBusy", "INLINE", "Journal", "Timezone", "TimezoneDaylight",
           "TimezoneStandard", "Todo", "component_factory", "dateutil",
           "get_example", "types_factory"]

# __all__ for module caselessdict.py:
__all__ = ["CaselessDict", "canonsort_items", "canonsort_keys"]

# __all__ for module cli.py:
__all__ = ["main", "view"]
[...]
c0d33ngr commented 4 months ago

I'll write the PR to close the issue completely 😊

devdanzin commented 4 months ago

I'll write the PR to close the issue completely 😊

OK, we'll find some other thing to fix then. Have fun! :)

niccokunzmann commented 4 months ago

@c0d33ngr and @devdanzin, I am sure you can collaborate: Whoever wants to open a PR with a small change does it first. You can create Pull-Requests towards the other's PR branch to suggest edits. You can review each-other's work. So, there is no 'lock' on contribution. There is no wait on progress. Have fun and thanks for your interest! I do not assign issues to people who could potentially drop out without telling me and you are both in that category.

c0d33ngr commented 4 months ago

@c0d33ngr and @devdanzin, I am sure you can collaborate: Whoever wants to open a PR with a small change does it first. You can create Pull-Requests towards the other's PR branch to suggest edits. You can review each-other's work. So, there is no 'lock' on contribution. There is no wait on progress. Have fun and thanks for your interest! I do not assign issues to people who could potentially drop out without telling me and you are both in that category.

That's great!!😁 @devdanzin and I can collaborate to close the issue

niccokunzmann commented 4 months ago

@c0d33ngr @devdanzin Thanks for solving this!