Codycody31 / discuit-search

Community Searching for Discuit
https://discuit-search.codycody31.dev/

[Bug] Search Completeness #12

Open MarkMoretto opened 1 month ago

MarkMoretto commented 1 month ago

First, this is a lovely tool. Thank you for putting it together!

Issue

Search results are incomplete. For example, querying the term "music" returns many results, but the Disc ObscureMusic is missing from the list.

Expected result

To see ObscureMusic in the results list when querying "music."

Suggested fix

Using the q=music param with the /communities endpoint returns only 9 results:

music
BlackMusic
IdentifyMusic
ObscureMusic
VideoGameMusic
ElectronicMusic
ClassicalMusic
MusicFestivals
ProgMusic
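
A minimal sketch of the q=music query, for reproducing the list above. It uses only the standard library; the helper names build_query_url and fetch_names are hypothetical, and the code assumes the endpoint returns a JSON array of objects with a "name" key, which this thread does not confirm.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

COMMUNITY_URL = "https://discuit.net/api/communities"


def build_query_url(query: str, base_url: str = COMMUNITY_URL) -> str:
    """Return the communities URL with the q parameter URL-encoded."""
    return base_url + "?" + urlencode({"q": query})


def fetch_names(query: str) -> List[str]:
    """Fetch communities matching the query and return their names.

    Assumption: the response body is a JSON array of community objects.
    """
    with urlopen(build_query_url(query), timeout=10) as resp:
        communities = json.load(resp)
    return [c.get("name") or "" for c in communities]


if __name__ == "__main__":
    print("\n".join(fetch_names("music")))
```

The network call is kept separate from URL construction so the query string can be checked without contacting the server.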

Below is a solution in Python that returns 37 results (in case-insensitive alphabetical order).

from typing import Any, Dict, Iterator, List

import requests
from requests import Response

# Build the communities endpoint URL.
BASE_URL = "https://discuit.net/api/"
COMMUNITY_URL = BASE_URL + "communities"

def check_get(url: str, *args, **kwargs) -> Response:
    """Return the requests.Response object or raise an HTTP error."""
    response = requests.get(url, *args, **kwargs)
    # Raise requests.HTTPError on 4xx/5xx responses.
    response.raise_for_status()
    return response

def get_all_discs(url: str = COMMUNITY_URL) -> List[Dict[str, Any]]:
    """Return the list of all Community objects."""
    resp = check_get(url, params={"set": "all"})
    return resp.json()

def generate_matches(keyword: str = "music", **kwargs: str) -> Iterator[Dict[str, Any]]:
    """Yield Discs whose name or about section contains the keyword.

    Matches are case insensitive.

    Parameters
    ----------
    keyword
        Substring to look for.
    kwargs
        Keyword arguments forwarded to get_all_discs (for example, url).
    """
    keyword = keyword.lower()
    for el in get_all_discs(**kwargs):
        # Handle cases where "name" or "about" may be missing or null.
        disc_name = el.get("name") or ""
        disc_about = el.get("about") or ""
        # Concatenate and lowercase, since Python string matching is case sensitive.
        disc_txt = (disc_name + disc_about).lower()
        # Check for presence of keyword.
        if keyword in disc_txt:
            yield el

if __name__ == "__main__":
    # Print the name of each matching Disc, one per line.
    names = "\n".join(el.get("name") or "" for el in generate_matches())
    print(names)

Result output:

AiMusic
ASOT
blackmetal
BlackMusic
ClassicalMusic
DarkMusic
ElectronicMusic
GameDesign
Hamilton
IdentifyMusic
IndustrialMusic
JapaneseMusic
LiveStreams
Metal
Metalcore
Modular
music
MusicFestivals
Musician
ObscureMusic
piano
popheads
ProgMusic
Punk
RunningMusic
SilentHill
Sleepmusic
Songwriting
statsfm
swipefy
synthwave
TaiwaneseMusic
Treemusic
twinpeaks
Ukulele
vibes
VideoGameMusic

noClaps commented 1 month ago

We use Meilisearch for searching, so the returned results come from it. I'd be tempted to rewrite the search logic to include this, but there's already a PR for searching in the main Discuit repo, so search will probably be added there soon, and I'm a little less willing to put effort into this.

@Codycody31 Please test this with the current Discuit searching branch. If the same issue occurs then you might need to change something.

Codycody31 commented 1 month ago

I should be able to test this tomorrow. Spitballing here, but most likely it needs to be tweaked so that the title has a higher priority than anything else, as most people are searching for a community's name rather than something in its description. If this is correct, I believe the PR for searching will also have this bug.
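
A minimal pure-Python sketch of the "title first" idea, for illustration only. It is not how discuit-search or Meilisearch rank results; the function names match_rank and rank_discs are hypothetical, and each Disc is assumed to be a dict with "name" and optional "about" keys, as in the script earlier in this thread.

```python
from typing import Any, Dict, List, Optional


def match_rank(disc: Dict[str, Any], keyword: str) -> Optional[int]:
    """Return 0 for a name match, 1 for an about-only match, None for no match."""
    kw = keyword.lower()
    if kw in (disc.get("name") or "").lower():
        return 0
    if kw in (disc.get("about") or "").lower():
        return 1
    return None


def rank_discs(discs: List[Dict[str, Any]], keyword: str) -> List[Dict[str, Any]]:
    """Return matching Discs, name matches first, then alphabetically by name."""
    ranked = []
    for disc in discs:
        rank = match_rank(disc, keyword)
        if rank is not None:
            ranked.append((rank, (disc.get("name") or "").lower(), disc))
    # Sort on (rank, lowercased name) only; the dicts themselves are not compared.
    ranked.sort(key=lambda item: item[:2])
    return [disc for _, _, disc in ranked]
```

With this ordering, a Disc such as ObscureMusic would sort ahead of any Disc that only mentions "music" in its description.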

Codycody31 commented 1 month ago

Just did some testing. It seems this bug is only in discuit-search; the integrated search doesn't have this issue.