opensupplyhub / open-apparel-registry

An application for searching, matching, uploading factories.
MIT License

Logic for ordering of data points #2129

Status: Closed (mariel-oar closed 1 year ago)

mariel-oar commented 2 years ago

Overview

Data points displayed on facility profiles are currently ordered based on: verified status, claimed status, frequency of data point, and chronology. Data points that are inactive are included in the count for frequency. This ticket specifies requirements to update the logic that determines the order of data points so that the data displayed on facility pages is representative of what active contributors have listed.

Describe the solution you'd like

Must have:

  1. Ranking logic only considers data points that are part of active lists.

Nice to have: The logic to determine the order of all data fields will be prioritized according to:

  1. Claimed status - if a facility is claimed, the data points submitted by the facility owners (the claimers) are prioritized first.
  2. Verified status - if a verified contributor has contributed data points on a given facility, those data points should appear below any claimed data points.
  3. (NOTE: a modification of the existing logic) Frequency of data point value - if a given data point is submitted numerous times, that data point should be ranked after claimed and verified data. Frequency must be determined only by ACTIVE data points. E.g. if "shirts" has been submitted 2x and both lists are active, and "pants" has been submitted 2x but only one is from an active list, then "shirts" should be ranked above "pants."
  4. Chronological - below data points that are ranked according to the logic above, data points should be ranked according to when they were submitted, with more recent data points ranked above older data points.
  5. (NOTE: a new criterion) Active / Inactive - for inactive data points: (1) they should be displayed below all active data points and (2) they should be ranked according to the logic above within the 'inactive' section.
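
The five criteria above can be read as a single composite sort key. The following is a minimal sketch of that reading, not the code in `serializers.py`: the `DataPoint` class, its field names, and `rank_data_points` are all hypothetical, and it assumes that an inactive data point's frequency is still taken from the active count of its value.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataPoint:
    # Hypothetical shape; the real serializer fields may differ.
    value: str
    is_active: bool
    is_claimed: bool
    is_verified: bool
    created_at: datetime


def rank_data_points(points):
    """Order data points per the proposed criteria (illustrative only)."""
    # Frequency counts only values that come from active lists.
    active_counts = Counter(p.value for p in points if p.is_active)

    # Pass 1: newest first. Python's sort is stable, so this order is
    # preserved among points that tie on the key used in pass 2.
    newest_first = sorted(points, key=lambda p: p.created_at, reverse=True)

    # Pass 2: False sorts before True, so each flag is negated to put
    # active, claimed, and verified points first, in that priority.
    def sort_key(p):
        return (
            not p.is_active,          # inactive section goes last
            not p.is_claimed,         # claimed data first
            not p.is_verified,        # then verified data
            -active_counts[p.value],  # then higher active frequency
        )

    return sorted(newest_first, key=sort_key)
```

With the "shirts" and "pants" example from item 3, both "shirts" entries rank above the active "pants" entry, and the inactive "pants" entry falls into the inactive section at the bottom.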

Additional Context

You can see on this facility CN20213488JQS3J that "Raw Material Apparel Inputs" has been submitted more recently than Accessories|Footwear|Apparel for Product Type; however, the latter has been submitted more frequently. Because inactive data points are included in the ranking, the inactive data points all appear above the recent and active one. The proposed logic in this ticket will address the issue with how active status should be taken into account for ranking.

TaiWilkin commented 2 years ago

Sort ordering: https://github.com/open-apparel-registry/open-apparel-registry/blob/f12efeadb6aafa05d56ea3f2ba257188fa71b566/src/django/api/serializers.py#L736
Serializing of data for sorting: https://github.com/open-apparel-registry/open-apparel-registry/blob/f12efeadb6aafa05d56ea3f2ba257188fa71b566/src/django/api/serializers.py#L1831

mariel-oar commented 1 year ago

Would there be a way to make this a 1 pt ticket if we just asked that data from non-active lists not factor into the ranking? @TaiWilkin @jwalgran

jwalgran commented 1 year ago

Tai and I agreed that the reduced scope would reduce this to a 1-point issue.

mariel-oar commented 1 year ago

OK, great. I updated the requirement above (the must-have part; I want to keep the nice-to-have here so we can refer to this ticket in the future if we make additional changes).

I'm going to add this to PB @obrienad for next sprint. I have some tickets in mind that it could replace, so overall remaining points should stay the same.

mariel-oar commented 1 year ago

It looks like this is not working. For example, this facility: CN20213488JQS3J has the value "Accessories|Footwear|Apparel" contributed as product type 9x on the OAR; however, all 9 of the lists that that value was contributed on have since been replaced (made inactive). On their embedded map, that value is still showing up first instead of the data point that is from their active list "Raw Material Apparel Inputs".

The intention of this ticket was to make it so that inactive data points are ignored in the ranking logic (i.e. only data points from active lists should contribute to the frequency count ranking logic)
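
For illustration only (this is not code from the repository or from the linked pull request), the reduced-scope intent can be sketched as a frequency count that skips inactive submissions. The function name and the `(value, is_active)` pair shape are assumptions made for this example.

```python
from collections import Counter


def order_values_by_active_frequency(submissions):
    """Order distinct values by how often ACTIVE lists submitted them.

    `submissions` is an iterable of (value, is_active) pairs; this
    shape is hypothetical and chosen only for the example.
    """
    submissions = list(submissions)
    # Only active submissions contribute to the frequency count.
    active_counts = Counter(
        value for value, is_active in submissions if is_active
    )
    # dict.fromkeys removes duplicates while keeping first-seen order.
    distinct_values = list(dict.fromkeys(value for value, _ in submissions))
    # Values seen only on inactive lists have a count of 0 and sort last.
    return sorted(distinct_values, key=lambda value: -active_counts[value])


# The product type case described above: nine replaced (inactive)
# lists versus one active list.
submissions = [("Accessories|Footwear|Apparel", False)] * 9 + [
    ("Raw Material Apparel Inputs", True)
]
ordered = order_values_by_active_frequency(submissions)
```

Under this sketch, "Raw Material Apparel Inputs" comes first because the nine inactive submissions contribute nothing to the count.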

TaiWilkin commented 1 year ago

@mariel-oar When I click through from the top extended field to its source (on OSHub / the linked embedded map), it looks like it's coming from https://9f692df0338dcbc9848646c6.openapparel.org/admin/api/source/329863/change/, which is marked as active. That's from this list. It seems that the "Raw Material Apparel Inputs" hasn't been submitted to OSHub, so that's why it's not showing up there. @maurizi Was this applied to OAR, or just OSHub?

jwalgran commented 1 year ago

Thanks, Tai. Looking through the history of the issue I only see the single connected pull request (#2180) and that was merged into the OS Hub branch only.

@mariel-oar If it is important that this change is available on OAR as well, we could make and review an additional PR to port this work to the OAR branch.