lemon24 / reader

A Python feed reader library.
https://reader.readthedocs.io
BSD 3-Clause "New" or "Revised" License
456 stars 38 forks source link

Am I interacting with this feed? #254

Open lemon24 opened 3 years ago

lemon24 commented 3 years ago

I'd like to know if I'm interacting with a feed, i.e.:

Open issues:

lemon24 commented 3 years ago

Rough order of things, based on what blocks what:

lemon24 commented 3 years ago

I don't know what other combinations mean, but this is enough to show "not read, but I don't care" == read and not important and important_changed is not None.

Using a new flag or an entry tag makes things more complicated, and there's no other meaning to associate with "marked as not important by hand, not by default" anyway (I think).

lemon24 commented 3 years ago

Regarding how we store timestamps:

We could do it with entry metadata (#253), but that's not implemented yet, and using it in queries would be a pain. It makes more sense to do the simplest thing possible now, and reconsider re-unification later.

The "simplest possible thing" seems to be to have another Entry attribute (and table column), so we'll do that.

lemon24 commented 3 years ago

The minimal API:

# Reader
mark_as_read(...)  # use the real "now", backwards compatible
mark_as_read(..., now: Optional[datetime])  # use a custom "now"; mainly for plug-ins
# TODO: find a better argument name than now... timestamp? last_modified?

class Entry:
    ...
    read_last_modified: Optional[datetime]

# Storage
mark_as_read_unread(..., now: Optional[datetime])

# idem for important

Notes:

lemon24 commented 3 years ago

To do (for storing stuff):

lemon24 commented 3 years ago

We're mostly done with the "store modified" part.

The UX for "don't care" needs a bit of work, though... In 4278939 (and 59004b3) I had to make the "unimportant"/"unread" buttons button set important_modified=None, because otherwise making a read, important entry not important makes it "don't care", which is not what we want (not all the time, anyway – sometimes you just want to undo a "mark as important").

... this tri-state of important (true, false and modified, false and not modified) is a bit confusing (or at least, the way we use it to infer another flag is) ... I should likely make a diagram to make more sense of it.

lemon24 commented 3 years ago

Spent about 12 hours on this until now.

lemon24 commented 1 year ago

We have conflicting interests, so it's likely time to address https://github.com/lemon24/reader/issues/254#issuecomment-938146589:

... this tri-state of important (true, false and modified, false and not modified) is a bit confusing (or at least, the way we use it to infer another flag is) ... I should likely make a diagram to make more sense of it.

Currently, "don't care" == read and not important and important_modified is not None.

Here are all the possible combinations and their meanings:

read read_modified important important_modified # interpretation
unread none unimportant none 1 initial state
unread date unimportant none 1 read then unread
unread * unimportant date 2 important then unimportant
read date unimportant date 1 don't care
read none unimportant date 1 don't care (unreachable from the web app)
read date unimportant none 1 read
read none unimportant none 1 read (pre-#254)
* * important * 8 important

Importantly, the interpretation is done outside of Reader, in the web app; because of this, there are two places where the web app needs to do extra handling:

The web app allows the following transitions (source):

Screenshot 2023-01-24 at 17 52 52

The conflicting interests are as follows:

On top of this, if the "don't care" logic implemented in the web app is useful, it should end up in Reader (we can probably repurpose the mark_entryas... methods for this, as part of #291). But, if it becomes stable, it should be easy to explain.

lemon24 commented 1 year ago

The current state as described above impossible to satisfy all the interests.

Adding a third flag for "don't care" would even complicate things further. Even worse, its meaning overlaps with unimportant with important_modified set. But...

We can translate the interests in terms of actual requirements:

Proposal 1: unimportant with modified == don't care

We can express user "don't care" as just unimportant with important_modified set. This is simple to explain ("user explicitly set unimportant"), and is not coupled to read in any way.

important important_modified interpretation
unimportant none never set
unimportant date don't care
important * important

For plugins, we can free up read-none by backfilling pre-#254 read to entry.added; for consistency, we should do the same for important.

This can also be explained more easily ("plugin explicitly set as read"), and is not coupled to important in any way.

(The unimportant-with-modified entries already marked by mark_as_read as user "don't care" are an acceptable loss.)

The full table becomes:

read read_modified important important_modified # interpretation
unread none unimportant none 1 initial state
unread date unimportant none 1 read then unread
read none unimportant none 1 plugin don't care
read date unimportant none 1 read
read none unimportant date 1 user don't care and plugin don't care
* * unimportant date 3 user don't care
read none important * 2 important and plugin don't care
* * important * 6 important

The UI logic becomes (a bit) simpler too, the tri-state involves only important:

if not important:
    important_button()
if important or important_modified:
    unimportant_button()  # modified=none
if not important_modified or important:
    dont_care_button()  # modified=date

This doesn't really clear things up from a Reader perspective, though.

lemon24 commented 1 year ago

Proposal 2: important: bool|None

Change important to be bool|None. Users set modified, plugins don't.

Pros:

Cons:

Backwards-compatible proposal for get_entries(important=...):

value predicate notes
True important works like before
False not important works like before
None True "all"; works like before
'unset' important is None needs better name?
'explicitly false' important is False "don't care"; needs better name!
'set' important is not None needs better name?
'not explicitly false' important is not False needs better name!

The last two aren't required, but it's still a good idea to find names for them.

Update: Here's a full typed proposal for the get_entries() API – instead of mixing bool|None and string literals, we have a set of literals, and map bools and None to some of the values (note BoolFilter is subset of OptionalBoolFilter):

BoolFilter = Literal[
    'true',  # value is True (equivalent to filter=True)
    'false',  # value is False (equivalent to filter=False)
    'all',  # equivalent to filter=None
    'default',  # equivalent to filter='all'
]

OptionalBoolFilter = Literal[
    'true',  # value is True (equivalent to filter=True)
    'false',  # value is False
    'unset',  # value is None
    'not true',  # equivalent to filter=False
    'not false',
    'all',  # equivalent to filter=None
    'default',  # equivalent to filter='all'
]

def get_entries(
    read: BoolFilter | bool | None = None,
    important: OptionalBoolFilter | bool | None = None,
) -> None: ...

Of course, we don't have to go with the full thing; initially, we can go with the minimum needed, and add more values later:

OptionalBoolFilter = Literal[
    'true',  # ~= True
    'false',  
    'unset',
    'not true',  # ~= False
    'all',  # ~= None
]

def get_entries(
    read: bool | None = None,
    important: OptionalBoolFilter | bool | None = None,
) -> None: ...

We could also model BoolFilter and OptionalBoolFilter as enums, but:

Update: EntryCounts should also likely be updated to count "important == None".

Update: The UI logic (note "unimportant" becomes "clear important" so it's harder to confuse with "don't care"):

if not important:             # False, None
    important_button()        # -> True
if important is not None:     # True, False
    clear_important_button()  # -> None
if important is not False:    # True, None
    dont_care_button()        # -> False
lemon24 commented 1 year ago

Going with proposal 2.

To do:

lemon24 commented 1 year ago

After a lot of deliberation, here's a naming scheme for get_entries(important: bool|None|TristateFilter):

Entry.important optional bool enum str
True True IS_TRUE istrue
False IS_FALSE isfalse
None NOT_SET notset
False, None False NOT_TRUE nottrue
True, None NOT_FALSE notfalse
True, False IS_SET isset
True, False, None None ANY any
True, False, None None DEFAULT default

Requirements:

Notes:

lemon24 commented 1 year ago

A quick note on what additional stats should look like.

While EntryCounts.averages provides a decent answer for e.g. all the entries in a feed, it does not provide the same answer for all the read/important/etc. entries (not without calling get_entry_counts() multiple times).

Furthermore, there are additional questions averages cannot easily answer, e.g. read but not important, or unimportant but not set by the user (no modified).

For maximum flexibility, we could instead provide a dataframe-like collection containing the values of all interesting fields for each entry (published, updated, added, read, read_modified, ...).

lemon24 commented 1 year ago

https://gist.github.com/lemon24/93222ef4bc4a775092b56546a6e6cd0f

Feed scoring algorithm (and how I consume feeds)

This is an attempt to use the metrics added in lemon24/reader#254 "Am I interacting with this feed?" to determine a feed "usefulness" score based on how many entries I mark as read / important / don't care.