Open RayBB opened 6 months ago
Hi, I'd like to try and work on this issue, would it be possible to assign it to me?
@Jiatonglii22 this isn't marked as a good first issue because it requires a decent amount of work and isn't a high priority.
If you'd still like to do it you're welcome to but please start by writing some tests to get an idea of what would be required.
@RayBB thank you for letting me know, I'd still like to try, I'll be working on this with a team and we'll start with writing some tests first, thank you!
Alright you're assigned! Good luck @Jiatonglii22
Hi @RayBB, I've thought about the possible solution and have a implemented a hash function into the process_individual_sort function so that any sort without a specified custom seed will have the random seed generated from the carousel params.
I'm currently thinking about how to test this and was wondering if you have any insights? I'm thinking focusing on the outputs of the hash itself -- making sure it is fast, no collisions etc, but I think I need a better understanding of how carousel params might look like as a string. Looking at the past issue this issue is drawn from, is 'subject:('Reading Level-Grade 11' OR 'Reading Level-Grade 12') first_publish_year:[2000 TO *]' like an example of carousel params?
@Jiatonglii22 this is a good question and I think we'll need a staff member to chime in on performance testing. Generally, we aren't too worried about hash collisions because this is on a per page basis.
One way of testing it is to try to call the render function a few hundred times with and without the random seed to see if there is any discernible performance difference. Though, I suspect as long as the hash algo isn't a very intense one (you should justify which you use) it may not be noticeable.
In any case, if you have something that's already working I recommend putting up a draft PR so we can discuss based on that.
Describe the problem that you'd like solved
This issue is a follow-up to #8966 and addresses the need for unique randomness in carousel displays when multiple carousels are present on the same page. The current implementation, which uses
sort=random.hourly
, results in a shared random order across all carousels. This can lead to repetitive content display, especially when carousels share overlapping content.Scenario
Consider a scenario where two carousels on the same page display books with some overlap:
Using
sort=random.hourly
results in a shared random order across both carousels, which can lead to identical content appearing in the same order in both carousels. For example:This issue arises because the random seed is shared across all carousels, leading to shared content order.
Current Behavior
The current behavior does not allow for unique randomness within the same hour for each carousel. A workaround involves manually appending a custom seed to the sort parameter (e.g.,
sort=random.hourly_1
for Carousel 1 andsort=random.hourly_2
for Carousel 2), but this approach is counter-intuitive and requires manual intervention for each carousel.Desired Behavior
The desired behavior is to have a default randomness that is unique to each carousel on the same page. This means that when
sort=random.hourly
is used, each carousel should display its content in a different order, ensuring unique content display across carousels. If a specific seed is desired for shared randomness, it should be explicitly specified (e.g.,sort=random.hourly_1
for shared randomness).Proposal & Constraints
A proposed solution involves modifying the sorting logic to automatically generate a unique seed for each carousel when
sort=random.hourly
is used. This can be achieved by appending a hash of the carousel parameters to the sort parameter, ensuring that each carousel has a unique random order. The code snippet below illustrates this approach:This solution aims to simplify the process for librarians and patrons by eliminating the need for manual seed specification for unique randomness. This should lead to a more intuitive experience for librarians and as such better patron experience.
Constraints
When implementing this, we should prioritize keeping the code as simple as possible. We might need to refactor a few methods (words/editions/authors) to ensure that they can all accept the params to use as a seed.
Additional requirements:
Additional context
For further context see: https://github.com/internetarchive/openlibrary/pull/8966#issuecomment-2023233475
Stakeholders
@cdrini