pdoc3 / pdoc

:snake: :arrow_right: :scroll: Auto-generate API documentation for Python projects
https://pdoc3.github.io/pdoc/
GNU Affero General Public License v3.0

pdoc generating warnings for externally installed packages #312

Closed zjpiazza closed 3 years ago

zjpiazza commented 3 years ago

Expected Behavior

pdoc should only generate docs for classes and methods in the context of the current project

Actual Behavior

pdoc gives warnings related to externally installed classes

Steps to Reproduce

  1. pdoc [modulename] -o docs

Additional info

kernc commented 3 years ago

Can you confirm it actually does generate docs only for the projects/packages specified on the command line?

Since pdoc imports the packages it documents, there's little we can do in case those packages import other (third-party, installed) packages that raise warnings ...
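For context, warnings emitted while a third-party package is imported are ordinary `warnings.warn` calls, so they can be silenced on the user's side before pdoc imports the module, e.g. with `PYTHONWARNINGS=ignore` or a filter. A minimal sketch (not pdoc-specific, just the standard-library mechanism):

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    # Silence everything, the same effect PYTHONWARNINGS=ignore would
    # have on warnings raised during pdoc's import of the module.
    warnings.simplefilter("ignore")
    warnings.warn("noisy import-time warning from a third-party package",
                  DeprecationWarning)

# Nothing was recorded because the filter swallowed the warning.
assert caught == []
```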

zjpiazza commented 3 years ago

Okay, so the module in question has one class called HCDConsumer, which is basically a high-level wrapper around the confluent_kafka Consumer class that performs some additional error handling and other functions. When I run pdoc to generate the docs, it throws a bunch of warnings related to classes in that confluent_kafka package, but it completes successfully. It not only outputs documentation on my HCDConsumer class, but also the base confluent_kafka Consumer class which is not behavior I want. The weird thing is, this generating of external classes only seems to be happening with that confluent package.

consumer.py

import json
from logging import Logger
from dataclasses import asdict

from confluent_kafka import Consumer, KafkaException, Message
from dacite import Config, from_dict
from dacite.exceptions import DaciteError

from .catalog_registry import CatalogRegistry
from .messages import ConsumableHCDMessage
from .exceptions import HCDMessagingException, SchemaValidationError

class HCDConsumer:
    """
    HCD specific Kafka Consumer that enforces message contracts
    """

    def __init__(
        self,
        *,
        kafka_consumer: Consumer,
        catalog_registry: CatalogRegistry,
        logger: Logger
    ):
        """
        Initialize an instance of HCDConsumer

        :type kafka_consumer: Consumer
        :param kafka_consumer: Instance of Kafka Consumer
        :type logger: Logger
        :param logger: Instance of Logger
        to the appropriate Kafka topic
        """

        self.logger = logger
        self.kafka_consumer = kafka_consumer
        self.catalog_registry = catalog_registry

    def consume(
        self,
        raise_on_failure: bool = False
    ) -> ConsumableHCDMessage:
        """
        Main polling loop for provisioners.

        (1) Continuously polls Kafka broker for new messages. (2) Once one has been received, validates the received
        message conforms to the expected dataclass schema. (3) If the message is valid, returns that request object
        along with an associated Kafka Message instance

        :return: The validated request as a ConsumableHCDMessage instance,
            with the originating Kafka Message attached as its kafka_message field
        :rtype: ConsumableHCDMessage
        """
        self.logger.info("Polling for new requests")
        while True:
            try:
                msg = self._poll()
                data_class = self.catalog_registry.get_dataclass_by_topic(msg.topic())
                # noinspection PyArgumentList
                data = json.loads(msg.value().decode("utf-8"))
                data['kafka_message'] = msg
                request = from_dict(
                    config=Config(strict=True),
                    data_class=data_class,
                    data=data,
                )
                self.logger.info(f"Received request: {asdict(request)}")
                return request
            except KafkaException as e:
                raise HCDMessagingException('Encountered error when reading Kafka message') from e
            except (json.JSONDecodeError, DaciteError) as e:
                error_msg = "Incoming message did not conform to expected schema"
                if raise_on_failure:
                    raise SchemaValidationError(error_msg) from e
                else:
                    self.logger.error(error_msg)
                    self.logger.error(e)

    def _poll(self) -> Message:
        """
        Infinite polling to check for new requests

        :return: Kafka Message instance
        :rtype: Message
        """
        while True:
            msg = self.kafka_consumer.poll()
            if not msg:
                continue
            if msg.error():
                raise KafkaException(msg.error())
            return msg

    def commit(self, msg: ConsumableHCDMessage):
        """
        Calls the commit method of the underlying Kafka Consumer instance.

        :param msg: Kafka Message instance to commit
        :rtype: None
        """
        try:
            offsets = self.kafka_consumer.commit(msg.kafka_message, asynchronous=False)
            self.logger.info(f"Committed offsets: {offsets}")
        except KafkaException as e:
            raise HCDMessagingException(
                "Encountered an error trying to commit message"
            ) from e

    def close(self):
        """
        Cleanly close the broker connection
        """
        self.kafka_consumer.close()

zjpiazza commented 3 years ago

Damn I answered my own question. I need to define the __all__ variable for the module to prevent that behavior. Thanks for the fast response!

kernc commented 3 years ago

Yeah, either __all__ or setting an equivalent of the following should work:

__pdoc__ = {i: False for i in 'Consumer KafkaException Message'.split()}

> It not only outputs documentation on my HCDConsumer class, but also the base confluent_kafka Consumer class which is not behavior I want. The weird thing is, this generating of external classes only seems to be happening with that confluent package.

I assume confluent_kafka is a compiled/binary/C-extension package? That would make the issue a duplicate of https://github.com/pdoc3/pdoc/issues/307.

zjpiazza commented 3 years ago

Yeap exactly. It's a wrapper around the confluent c/c++ library.