ViennaRSS / vienna-rss

Vienna is a free and open-source RSS/Atom newsreader for macOS.
https://www.vienna-rss.com
Apache License 2.0
1.85k stars 227 forks source link

Add "Date Published" sorting option and table column #1770

Closed Eitot closed 2 months ago

Eitot commented 4 months ago

As suggested in #1749 the createddate SQL field is repurposed to show a "Date Published" option in the Sort By menu and add a new column for the horizontal layout. The existing "Date" sort option and column is renamed to "Last Update" to reflect its current purpose better. createddate was previously set to a current date when Vienna added the entry to the database. To keep it simple, I have not added the "Date Published" field to the vertical layout, since that would require more elaborate changes to the table-view cell.

While working on this I also stumbled on two bugs that caused the sort indicator in the column header not to show on app launch and when the user changes the sorting via the main menu. Both are fixed. The former bug is fixed with a band aid however, since the underlying cause is that ArticleListView is initialised before ArticleController, the latter which holds the property that determines the sorting order.

Eitot commented 4 months ago

Now we have "date published" and "date added" which is really confusing in my opinion. Isn´t "publishing" the action of first releasing the original text of an article? I would rather call the two cases "Publication date" and "Last update".

Last update would be wrong though, because "createddate" in the database is created once and never changed.

The problem with the "date" field is that it means something different depending on the feed type and whether a date is provided by the feed. For RSS it means "date published", for Atom it means either "date created", "date modified" or "date updated" – whichever is latest – and for JSONFeed it currently means "date modified". If no date is given, Vienna will fall back to a date at the feed level (which also differs per feed type) or – if no date at all is provided – the date when the feed was fetched by Vienna.

I think another database column would give more flexibility, to distinguish publication date from modification date (both of which could still be filled in by Vienna). Also, the "createddate" column really ought to be updated too when an existing entry is modified.

TAKeanice commented 4 months ago

Last update would be wrong though, because "createddate" in the database is created once and never changed.

I meant createddate to be "publication date", which misses the point just a little (it is the date when Vienna first loaded the article), but offers the highest level of distinction to the other date value

The problem with the "date" field is that (...) For RSS it means "date published", for Atom it means either "date created", "date modified" or "date updated" – whichever is latest – and for JSONFeed it currently means "date modified"

Yes, for RSS there is no update date offering a distinction from the creation date. However, it rather reflects the last update than its publication date:

https://github.com/ViennaRSS/vienna-rss/blob/95f120b81e52e1082fababa6e8db6094896ff6de/Vienna/Sources/Parsing/RSSFeed.m#L136-L140

and

NSDate * articleDate = newsItem.modifiedDate;
// (...)             
article.date = articleDate;

in https://github.com/ViennaRSS/vienna-rss/blob/95f120b81e52e1082fababa6e8db6094896ff6de/Vienna/Sources/Fetching/RefreshManager.m#L767-L850

In conclusion, I am not 100% convinced of the naming either, but I would opt for the most distinguishable instead of the most accurate terms. The wording will be of no importance to the users who keep their settings as before, for them the change will be "Date" -> "Last update". The users who have problematic feeds that are updated in retrospective, they will have the option to change to "Publication date" which might not be totally accurate but have the desired effect, to keep their sorting in the order they originally received the articles.

One refinement may be to not set the article "createddate" to the date where the article was received, if the article itself contains a date, but to the first value we encounter for the date of the article, and only use the date of receiving the article as fallback. Alternatively, we could introduce an additional field in the feed item that we set to the actual publication date in the feed types that provide it, and to the current date for the feed types that don´t, subsequently setting article.publicationDate to that value, taking that date as createddate on the first insert into the database. That suggestion would result in code like this in -[Database addArticle: toFolder:]:

    // We set the publication date ourselves if it is not contained in the feed
    article.publicationDate = article.publicationDate ?: [NSDate date];
    NSTimeInterval publicationIntervalSince1970 = article.publicationDate.timeIntervalSince1970;

instead of the current way of setting the addedDate ourselves calculating the createdInterval from it.

TAKeanice commented 4 months ago

Please see https://github.com/TAKeanice/vienna-rss/tree/feature/date-added-field for a draft to implement my suggestion, for some reason I don´t seem to be able to open a PR to your repository @Eitot

Eitot commented 4 months ago

Yes, for RSS there is no update date offering a distinction from the creation date. However, it rather reflects the last update than its publication date:

That is for the date of the feed (channel). The feed item's date is set by pubDate or dc:date: https://github.com/ViennaRSS/vienna-rss/blob/95f120b81e52e1082fababa6e8db6094896ff6de/Vienna/Sources/Parsing/RSSFeed.m#L258-L263

In conclusion, I am not 100% convinced of the naming either, but I would opt for the most distinguishable instead of the most accurate terms. The wording will be of no importance to the users who keep their settings as before, for them the change will be "Date" -> "Last update". The users who have problematic feeds that are updated in retrospective, they will have the option to change to "Publication date" which might not be totally accurate but have the desired effect, to keep their sorting in the order they originally received the articles.

One refinement may be to not set the article "createddate" to the date where the article was received, if the article itself contains a date, but to the first value we encounter for the date of the article, and only use the date of receiving the article as fallback. Alternatively, we could introduce an additional field in the feed item that we set to the actual publication date in the feed types that provide it, and to the current date for the feed types that don´t, subsequently setting article.publicationDate to that value, taking that date as createddate on the first insert into the database. That suggestion would result in code like this in -[Database addArticle: toFolder:]

What about RSS feeds though? Would "last update" and "created date" not be the same, if in the case of RSS they are likely both based on pubDate?

TAKeanice commented 4 months ago

I think if there is no distinction like for RSS, either name for the date is fine. For all cases where there is a distinction, the "date" in article is the last updated date, so we should name it to reflect that. Please check the code in my implementation draft for reference

TAKeanice commented 4 months ago

I cleaned up some confusion that I had yesterday, now the code does what I intended it to do

barijaona commented 4 months ago

I am OK with "last update date". I find "publication date" quite weird, as in my mind, a publication date is decided by the author / publisher of the article, and not set by us. How about "delivery date" ?

TAKeanice commented 4 months ago

I am OK with "last update date".

I find "publication date" quite weird, as in my mind, a publication date is decided by the author / publisher of the article, and not set by us.

Agree with you there, but we get into a distinction problem again: for JSON Feeds the publication date may actually be set by the author, and for atom feeds we at least have a chance to catch the publication date. Only RSS Feeds don't have it. We may set the first encounter of the article date in an RSS feed to the publication date, though. @Eitot feel free to change that at the comment line I added in the RSS feed class.

How about "delivery date" ?

Too technical in my eyes. "Received date" or "Reception" maybe ("Empfangsdatum" in German) if you don't like "Publication date" at all

TAKeanice commented 4 months ago

@TAKeanice I have incorporated your changes, with some changes

Thank you, I appreciate that you added the thoroughness my original changes lacked

However, one thing remaining is the "last refresh" filter. This is probably broken now

Another thing I suspect to be broken now is the intelligent folder feature - I don't remember how much it relies on the names of fields and which constants it uses for that. The persisted filters might fail if they contain one of the changed fields. Additionally, I think an option for filtering the "Publication / Added / ... " date should be included.

TAKeanice commented 4 months ago

Another thing I suspect to be broken now is the intelligent folder feature - I don't remember how much it relies on the names of fields and which constants it uses for that.

The predicate editor seems fine at the first glance.

I found at least one place where there's now inconsistency though, in SmartFolder.m in some predicate templates the hardcoded String @"Date" is used, and in other places the constant for the Date field which is now renamed to MA_LastUpdatedField. We should probably use the constant for all predicate templates.

barijaona commented 4 months ago

“Reception date” sounds good to me.

TAKeanice commented 3 months ago

I added "Date published" to the intelligent folder predicates on my branch: https://github.com/TAKeanice/vienna-rss/tree/feature/date-added-field https://github.com/ViennaRSS/vienna-rss/pull/1770#issuecomment-2242018445 is not incorporated yet to keep it in sync with the current state of this PR.

Eitot commented 3 months ago

I am not sure about "reception date" (or "date received") as a substitute for "publication date" (or "date published"). Date received is the date when Vienna received the article from the source. That is not what this field will represent after this PR: it changes the implementation so that the value of that field will be the earliest date provided by the feed, if one is available (created, updated, modified for Atom , whichever is earliest, and date_published for JSONFeed). Also, date received implies a "last update" date too, which is not what this field is (even though the "last refresh" filter makes it seem that way).

I still see it as a challenge that we are attempting to reconcile different meanings of "date", that are inconsistent among the three feed types, in two fields. Recall that what gave rise to the PR is the problem that Vienna changes the visible date of an article even when the user knows that the article is older (#1749). To accommodate this, we sacrifice the "last refresh" filter.

TAKeanice commented 3 months ago

@barijaona I cleaned up the places you mentioned and made the logic as robust as I could. @Eitot i don´t know why I can´t push to your branch, this used to be possible when ticking "allow edits and access to secrets by maintainers" during the creation of a PR, but maybe I am doing something wrong. Instead, I opened a PR to your branch: https://github.com/Eitot/Vienna/pull/7

barijaona commented 3 months ago

… I don´t know why I can´t push to your branch, this used to be possible when ticking "allow edits and access to secrets by maintainers" during the creation of a PR

@TAKeanice , would you try again pushing to this branch ? For some reason, you were visible in the @ViennaRSS/maintainers group, without having the maintainer role…

TAKeanice commented 3 months ago

Thanks for pushing my commits @Eitot . I need to check and test the criteria implementation, if I add any additional commits for that I will try pushing directly again.

barijaona commented 2 months ago

I have :