openzim / python-scraperlib

Collection of Python code to re-use across Python-based scrapers
GNU General Public License v3.0
18 stars 16 forks source link

Type/Behavior mismatch for "Tags" parameter of Creator.config_metadata #125

Closed IMayBeABitShy closed 4 months ago

IMayBeABitShy commented 5 months ago

Currently, the method signature of Creator.config_metadata is as follows:

def config_metadata(
        self,
        *,
        Name: str,
        Language: str,
        Title: str,
        Description: str,
        LongDescription: Optional[str] = None,
        Creator: str,
        Publisher: str,
        Date: Union[datetime.datetime, datetime.date, str],
        Illustration_48x48_at_1: bytes,
        Tags: Optional[Union[Iterable[str], str]] = None,
        Scraper: Optional[str] = None,
        Flavour: Optional[str] = None,
        Source: Optional[str] = None,
        License: Optional[str] = None,
        Relation: Optional[str] = None,
        **extras: str,
    ):

The line Tags: Optional[Union[Iterable[str], str]] = None, implies that Tags may be either a string or a list of strings. Yet, there exists no code that handles the case where Tags is a list of strings rather than a single string, resulting in python-libzim raising an exception:

File "libzim/libzim.pyx", line 583, in libzim.Creator.add_metadata
TypeError: Argument 'content' has incorrect type (expected bytes, got list)

Potential fixes:

As you can see, this is a really minor issue and I'd simply open a pull request with the fix, but I am unsure whether the type annotation is right and the code wrong or vice versa.

rgaudin commented 5 months ago

I think the intent is good: we want to handle both here because both can be handy in a scraper context. You PR adding support for it would be welcome