scientist-softserv / adventist_knapsack

Apache License 2.0

Fix "Sort by: Published Date" #185

Open KatharineV opened 6 months ago

KatharineV commented 6 months ago

Summary

The "Sort by Published Date" (both ascending and descending) feature in the catalog search and in collections does not properly sort based on the "Publication date" metadata field. If this feature is sorting based on a different date field, perhaps that would explain the problem, but if so, then we still would request work on this ticket, as explained in the last paragraph.

If the feature is meant to sort based on the "Publication date" metadata field, then we'd like the functionality to work when dates are entered in that field according to the system's requirements. In this example collection on ADL staging, we used the UI to enter publication date metadata, so we assume the system entered the date in a format that it prefers to read. The back end gave us a calendar widget, which we used. The dates appear properly in each work's metadata. However, when we use the sort by published date feature in the collection, the works do not sort in date order. See screenshots below.

Acceptance Criteria

Catalog search

Image

Collection show pg (admin view)

Image

Screenshots or Video

Proper date in work metadata: https://adl.s2.adventistdigitallibrary.org/concern/published_works/22255888_the_minnesota_worker_december_21_1898?locale=en

Image

Automatic relevant sorting in the collection:

Image

Sort By: Published Date (Ascending):

Image

Sort By: Published Date (Descending):

Image

These examples clearly show that the dates (which also appear in the work titles) are not sorting in proper order. They don't even change between ascending and descending. The behavior is the same in the catalog search, which offers a publication date sort.

If the feature is actually pulling from a different date field, we'd like to know which one is being referenced, what date format the system will recognize and use to sort properly, and we'd like the Sort By label to be renamed to match the metadata field being referenced.

As a sidenote only tangentially related to this ticket (and perhaps headed for its own ticket someday, or could be rolled into general Hyku upgrades of the future), the Publication Date metadata field seems like a bad choice for the catalog and collection search sort by feature. Not all works even have this metadata field in their work type. For example, Images. The sort by date feature would be better if it pulled a metadata field that appears in all work types, and it would be nice if there was a note in the UI that mentions to users that this is the field for date sorting. With soooooo many date fields across work types, it's very confusing what is used for search facets and sort by features etc. Additionally, the UI could contain a note regarding what date formats the system will actually recognize and use to sort. Is this info already included in Hyku documentation? See this screenshot of the back end of the UI "add new work" for the kind of note I'm suggesting would include format info and info about sort and facet dependency. Hope this makes sense.

Image

Testing Instructions

Testing note: the staging site was partially reindexed resulting in most existing records being updated but not all. Our assumption is that the cut over will handle the update of all works in production. Most records will pass QA.

Ensure that works & collections have "Created Date" filled in. If there is no Created Date, they will sort to the end of the list, whether sorting by ascending or descending.

Notes

This will require an update of data to remove ["0~"] from blank author (creator) fields.

christhepianist commented 6 months ago

[like] Christine Peterson reacted to your message:



jillpe commented 5 months ago

This was done in PALs (palni-palci) for a different date field (date created):

https://github.com/scientist-softserv/palni-palci/pull/867

KatharineV commented 5 months ago

Folks, the Sort By feature isn't working at all on a collection right now. When I created this ticket, we didn't have Knapsack. We are post Knapsack now, so I can't be sure if the cut over is responsible for the lack of functionality or if I overlooked and failed to test the other sort by options. Maybe it was always broken.

Here's the collection where I'm having trouble: https://adl.b2.adventistdigitallibrary.org/collections/b1cb4449-c0ee-4499-82da-32c1246daeea

Automatic relevance sort (which isn't sorted by any metric that I can determine):

Image

I select "Sort By: Upload Date (Descending)" and hit Refresh and this is what I get:

Image

I select "Sort By: Title" and hit Refresh and this is what I get:

Image

orangewolf commented 1 month ago

Sorting does work (you have to hit Refresh after changing the sort option), but sort by title is backward and the date sorts could not be easily verified. We need to look at those.

jillpe commented 3 weeks ago

This was fixed in UTK

sjproctor commented 2 weeks ago

We identified that the sorting problem originates in Hyku. We are making the changes there first.

https://github.com/samvera/hyku/pull/2316/files

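sjproctor commented 5 days ago

# Collect every work resource class plus CollectionResource; compact drops
# any name that safe_constantize could not resolve.
concerns = Hyrax.config.curation_concerns.map { |c| "#{c}Resource".safe_constantize }.compact << CollectionResource

concerns.each do |concern|
  works = Hyrax.query_service.find_all_of_model(model: concern)
  works.each do |work|
    # Only touch records whose creator is the "0~" placeholder.
    next unless work['creator'] == ["0~"]
    # Replace the placeholder with an empty creator list and persist.
    work.creator = []
    work.save
  end
end
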
ShanaLMoore commented 3 days ago

QA Results: ❌ NEEDS REWORK

TLDR: sort by published date doesn't appear to be working for Collections. Kirk ran a staging reindex yesterday, and in the tested example I manually edited and saved each work to be sure it's not an indexing problem.

tested on STAGING

Testing note: the staging site was partially reindexed resulting in most existing records being updated but not all. Our assumption is that the cut over will handle the update of all works in production. Most records will pass QA.

CATALOG INDEX PAGE ✅

Image

COLLECTION PAGE ❌

tested on STAGING

Image

KatharineV commented 2 days ago

Team, as you're fixing this ticket, can you also educate me on which date field is used for the Sort By Published Date feature? There are so many date fields and they vary between work types. I have never been clear on which field maps to this feature, so your insight will be valuable.

ShanaLMoore commented 2 days ago

cc @sjproctor could you provide some insight here? ^^

sjproctor commented 2 days ago

Yeah, there are definitely a lot of date fields! We are using the date_created_ssi for publication date.

laritakr commented 2 days ago

This shows as "Date Created" on the work's metadata.

The input form uses a calendar entry that puts a date into mm/dd/yyyy format, but via bulkrax, I am seeing dates such as yyyy-mm-dd. Without a consistent format, we can't do a clean sort.
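To illustrate why mixed formats prevent a clean sort, here is a minimal sketch in plain Ruby. The sample values are hypothetical, and it assumes the sort field is compared as a string, character by character; it is not the Hyku or Solr code path itself.

```ruby
# Hypothetical sample values in the two formats reported above.
dates = ["12/21/1898", "1899-01-04", "01/15/1897", "1896-06-30"]

# A string sort compares character by character, so mm/dd/yyyy values
# are ordered by month first and end up ahead of the yyyy-mm-dd values.
lexical = dates.sort
# => ["01/15/1897", "12/21/1898", "1896-06-30", "1899-01-04"]

# The same four dates written consistently as yyyy-mm-dd sort
# chronologically, because the most significant part comes first.
iso_dates = ["1898-12-21", "1899-01-04", "1897-01-15", "1896-06-30"]
chronological = iso_dates.sort
# => ["1896-06-30", "1897-01-15", "1898-12-21", "1899-01-04"]
```

In the mixed list, the 1897 and 1898 works land ahead of the 1896 work, which is the kind of out-of-order result a string comparison would produce.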

KatharineV commented 2 days ago

@laritakr Thank you! These are the instructions I needed. I've tried to find out before which field and which format the system prefers. The answer wasn't available when I asked the last couple times, so this is really great. I understand why the sort won't work unless we make revisions. We can default to mm/dd/yyyy, but (in case the community ever sits down to think about this) I would love to see modifications in future versions that allow for ISO standard dates, since mm/dd/yyyy is confusing across international boundaries.

Well, forget that for now, and thank you again!

laritakr commented 2 days ago

> @laritakr Thank you! These are the instructions I needed. I've tried to find out before which field and which format the system prefers. The answer wasn't available when I asked the last couple times, so this is really great.

You're welcome! Glad I could add clarity. Based on what I have found, I am moving this ticket back to QA.

As an added note, I found that anything with no created date will sort at the end, whether using ascending OR descending.
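The "missing dates always sort last" behavior can be sketched in plain Ruby as follows. The `Work` struct and `sort_by_date` helper are hypothetical names for illustration only. The behavior resembles what Solr does when a field type sets `sortMissingLast`, though whether that setting is what applies here is an assumption.

```ruby
# Hypothetical record: date_created is an ISO 8601 string, or nil when missing.
Work = Struct.new(:title, :date_created)

# Sorts dated works in the requested direction and always appends
# undated works at the end, regardless of direction.
def sort_by_date(works, direction: :asc)
  dated, undated = works.partition { |work| work.date_created }
  sorted = dated.sort_by(&:date_created)
  sorted.reverse! if direction == :desc
  sorted + undated
end

works = [
  Work.new("B", "1898-12-21"),
  Work.new("No date", nil),
  Work.new("A", "1897-01-15")
]

ascending  = sort_by_date(works, direction: :asc).map(&:title)
descending = sort_by_date(works, direction: :desc).map(&:title)
```

Here `ascending` is `["A", "B", "No date"]` and `descending` is `["B", "A", "No date"]`: only the dated works change order.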

We could put in some date standardization as we index the date used for sorting. This would require a reindex of all works & collections. If this is a desired change, you would need to make a ticket for it.
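As a rough idea of what such standardization might look like, the sketch below converts the two formats seen in this thread to yyyy-mm-dd before the value is used for sorting. The helper name `normalize_sort_date` and the format list are assumptions for illustration; this is not existing Hyku or Bulkrax code, and where it would hook into indexing is left open.

```ruby
require "date"

# Formats reported in this thread; the list and its order are assumptions.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y"].freeze

# Hypothetical helper: returns an ISO 8601 string (yyyy-mm-dd), or nil
# when the value is blank or matches none of the known formats.
def normalize_sort_date(value)
  text = value.to_s.strip
  return nil if text.empty?

  KNOWN_FORMATS.each do |format|
    begin
      return Date.strptime(text, format).iso8601
    rescue ArgumentError
      # Not this format; try the next one.
      next
    end
  end
  nil
end

from_form    = normalize_sort_date("12/21/1898") # calendar-widget style
from_bulkrax = normalize_sort_date("1898-12-21") # import style
```

Both calls return the same string, so the two entry paths would sort together. Values that cannot be parsed come back as `nil` and would be treated like a missing date.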

laritakr commented 2 days ago

QA Results: ✅ Passed

(previously passed catalog view QA)

COLLECTION PAGE ✅

tested on STAGING

Created Dates were added to several works - I added the dates into the title so the results are obvious.

Screenshot 2024-09-26 at 2 10 30 PM

Screenshot 2024-09-26 at 2 10 43 PM