feeluown / FeelUOwn

trying to be a robust, user-friendly and hackable music player
http://feeluown.readthedocs.io
GNU General Public License v3.0
3.54k stars 597 forks source link

功能请求:实现歌单内歌曲搜索 #865

Open mokurin000 opened 3 months ago

mokurin000 commented 3 months ago

简介与背景

如题,目前fuo未提供于ui或rpc接口( https://github.com/cosven/feeluownx 需要)搜索播放列表中特定歌曲的功能

鉴于直接进行关键词搜索较容易匹配失败,可能需要尝试使用模糊匹配算法

方案概述

经过一些尝试,推荐做法是以匹配结果的最多 k 个高可能结果按顺序返回(+ 数量可设置) 另外可以把字符串处理前统一为小写简体,后者依赖opencc

may block https://github.com/cosven/feeluownx/pull/4

另附:

from difflib import SequenceMatcher
from sys import argv
from typing import List
from collections.abc import Hashable

def std_distance(s1: str, s2: str):
    s = SequenceMatcher(isjunk=lambda c: c.isspace(), a=s1, b=s2)
    return s.ratio()

def distance(x: List[Hashable], y: List[Hashable]) -> float:
    wx = [tuple(x[i : i + 2]) for i in range(len(x) - 1)]
    wy = [tuple(y[i : i + 2]) for i in range(len(y) - 1)]

    nx = len(wx)
    ny = len(wy)

    hash_set = set(wx)

    length = 0

    for w in wy:
        if w in hash_set:
            length += 2

    if nx + ny == 0:
        return 0.0

    return length / (nx + ny)

if __name__ == "__main__":
    print(f'{"Std difflib":16}', std_distance(argv[1], argv[2]))
    print(f'{"Dice-Sorensen":16}', distance(argv[1], argv[2]))
cosven commented 3 months ago

好奇上述算法和标准库里面的这个是类似的么? https://docs.python.org/3/library/difflib.html#difflib.get_close_matches

mokurin000 commented 3 months ago

好奇上述算法和标准库里面的这个是类似的么? https://docs.python.org/3/library/difflib.html#difflib.get_close_matches

标准库的匹配算法看了下是: https://github.com/python/cpython/blob/05fc4d758aa9b731c722e6c1c18e1f5e54597393/Lib/difflib.py#L421-L490

看起来是不同算法,不过我还没测它的ratio,我试试

不过 difflib 里面是更复杂的匹配次数,最后还是 匹配次数*2/(两者长度和),Dice-Sorensen 里面的匹配就是2元素滑动窗口取集合,另一个滑动窗口迭代计数