lemon24 / reader

A Python feed reader library.
https://reader.readthedocs.io
BSD 3-Clause "New" or "Revised" License
438 stars 36 forks source link

Consider merging tags and metadata #266

Closed lemon24 closed 2 years ago

lemon24 commented 2 years ago

Two main reasons:

First, there are 7 methods for metadata/tags. If we add entry and global metadata/tags, there may be 21 (if we actually want to add methods for them is debatable).

Second, there are two namespaces, and some functionality overlaps – to filter by metadata, you have to have an identical tag; for example, the dedupe plugin may need to do this:

for feed in feeds with the .dedupe tag:
    metadata = get .dedupe metadata for feed
    dedupe based on options in metadata

... the issue is not necessarily that you have to maintain both, but that you have to do or not do something when one is missing (and document that etc.).

Third, we need to maintain two very similar implementations.

Finally, AWS does it, and they spent far more time thinking about it.


Here's an example of how it might work (omitting the reader object and the first feed argument).

For reference, tags are conceptually set[str], and metadata is dict[str, JSONType].

Currently:

>>> set_feed_metadata_item('one', {})
>>> add_feed_tag('two')
>>> get_feed_metadata()
{'one': {}}
>>> get_feed_tags()
{'two'}

With the same namespace (to do: is this backwards compatible?):

>>> set_feed_metadata_item('one', {})
>>> add_feed_tag('two')
>>> get_feed_metadata()
{'one': {}, 'two': None}
>>> get_feed_tags()
{'one', 'two'}

At this point we could either leave both sets of methods, or unify them (to do: what would it look like?).

FWIW, the boto3 EC2 client has create_tags(), delete_tags(), describe_tags() (which can filter).

lemon24 commented 2 years ago

OK, I decided going forward with this is the Right Thing™.

Unifying the namespaces should happen as soon as possible, so as few people as possible depend on them being separated. This does not break the API in any way, but does change behavior.

Unifying the methods we can postpone for a bit; we can only remove old methods in 3.0, after deprecating them in 2.x.

lemon24 commented 2 years ago

To do for unifying the namespaces:

lemon24 commented 2 years ago

Here are all the tag and metadata methods as of 2.6:

get_feed_metadata(feed[, key]) -> (key, value), ...
get_feed_metadata_item(feed, key[, default]) -> value # raises FeedMetadataNotFoundError
set_feed_metadata_item(feed, key, value) # raises FeedNotFoundError
delete_feed_metadata_item(feed, key) # raises FeedMetadataNotFoundError

get_feed_tags([feed]) -> tag, ...
add_feed_tag(feed, tag) # raises FeedNotFoundError
remove_feed_tag(feed, tag)

Note:

lemon24 commented 2 years ago

Here's what a unified feed tag API would look like:

get_feed_tags(feed[, key]) -> (key, value), ...
get_feed_tag_keys([feed]) -> key, ...
get_feed_tag(feed, key[, default]) -> value  # raises FeedTagNotFoundError
set_feed_tag(feed, key[, value])  # raises FeedNotFoundError
delete_feed_tag(feed, key)  # raises FeedTagNotFoundError

# optional, can be added later (or never)
describe_feed_tags([feed[, key]]) -> TagDescription(feed, key, value), ...

To do: What would this look like if we were to generalize it to different types of resources, per #228 (global, feed, entry)? The following would have to be generic:

lemon24 commented 2 years ago

This is what generic global/feed/entry tag methods would look like:

class ResourceNotFoundError(ReaderError):
    object_id: tuple[str, ...]
class FeedError(ReaderError):
    object_id: tuple[str]
class FeedNotFoundError(ResourceNotFoundError, FeedError):
    object_id: tuple[str]

# same as above for entries    
class EntryNotFoundError(EntryError, ResourceNotFoundError)

class TagError(ReaderError):
    tag: str
    # optional, can be added later (or never)
    object_id: tuple[str, ...]
class TagNotFoundError(TagError)

# optional, can be added later (or never)
class FeedTagNotFoundError(FeedError, TagNotFoundError)
class EntryTagNotFoundError(FeedError, TagNotFoundError)
class GlobalTagNotFoundError(FeedError, TagNotFoundError)

get_tags(resource, *, key=None) -> (key, value), ...
get_tag_keys(resource) -> key, ...
get_tag(resource, key, default=MISSING, /) -> value  # raises TagNotFoundError
set_tag(resource, key, value=None, /)  # raises specific ResourceNotFoundError
delete_tag(resource, key, /, missing_ok=False)  # raises TagNotFoundError

# types of resource arguments
EntryInput = EntryLike | tuple[str, str]
FeedInput = FeedLike | tuple[str] | str
GlobalInput = tuple[]
AnyEntryInput = EntryInput | tuple[str, None] | tuple[None, None]
AnyFeedInput = FeedInput | tuple[None]
AnyInput = AnyEntryInput | AnyFeedInput | None
lemon24 commented 2 years ago

To do: