beetbox / pyacoustid

Python bindings for Chromaprint acoustic fingerprinting and the Acoustid Web service
MIT License
325 stars 66 forks source link

Getting less data back when using multiple meta values #63

Closed lshw42 closed 3 years ago

lshw42 commented 3 years ago

Expected Behaviour

Calling acoustid.match() or acoustid.lookup() with 'meta': 'recordings+releasegroups+sources' leads to a response like the following:

{
    "results": [{
        "id": "376e5e6d-b175-4e52-a545-a8889c41bb9b",
        "recordings": [{
            "artists": [{
                "id": "5bae7081-64ef-4473-825a-38d310deb14c",
                "name": "Will Smith"
            }],
            "duration": 224,
            "id": "28c09ca6-1f7b-4861-b205-0b41abcb1722",
            "releasegroups": [{
                "artists": [{
                    "id": "89ad4ac3-39f7-470e-963a-56509c546377",
                    "name": "Various Artists"
                }],
                "id": "f5f5145f-9008-48b4-9a74-584a4f80c433",
                "secondarytypes": ["Compilation"],
                "title": "90 Les Ann\u00e9es Cultes",
                "type": "Album"
            }],
            "sources": 1,
            "title": "Men In Black"
        }],
        "score": 0.853853
    }],
    "status": "ok"
}

Current Behaviour

Calling acoustid.match() or acoustid.lookup() with 'meta': 'recordings+releasegroups+sources' leads to a response like the following:

{
    "results": [{
        "id": "376e5e6d-b175-4e52-a545-a8889c41bb9b",
        "score": 0.853853
    }],
    "status": "ok"
}

Steps to Reproduce

  1. Install current version of pyacoustid
  2. Start Python Interpreter
  3. Do the following:
    >>> import acoustid
    >>> fingerprint = acoustid.fingerprint_file(<path>)
    >>> acoustid.lookup(<apikey>, fingerprint[1], fingerprint[0], "recordings+releasegroups+sources")

    Please replace <path> and <apikey> accordingly.

    Context (Environment

    Python 3.8.10 (default, Jun 2 2021, 10:49:15) pyacoustid 1.2.0

    Detailed Description

    response = session.post(url, data=params, headers=headers, timeout=timeout) does url-encode params which leads to the problem, that 'meta': 'recordings+releasegroups+sources' becomes meta=recordings%2Breleasegroups%2Bsources instead of meta=recordings+releasegroups+sources. This breaks the AcoustID Web Service and it ignores the meta parameter.

    Possible Implementation

    To solve this safe the + by using e.g. urllib.parse.urlencode(params, safe='+').

    response = session.post(url,
    data=urllib.parse.urlencode(params, safe='+'),
    headers=headers,
    timeout=timeout)
sampsyo commented 3 years ago

Nice! Thanks for the detailed diagnosis. Marking + as safe for the encoding seems OK to me, or perhaps an even cleaner way would be to allow meta to be a list (and then build the query string ourselves)? In any case, I'd definitely merge a PR along these lines!

lshw42 commented 3 years ago

Thank you for your responds. My given solution is something quick-and-dirty, which worked while researching the problem. Making meta to be a list seem pretty good to me. That way, the representation of the meta parameter in the later URL is decoupled from input. Also, this would prevent any confusion with the delimiter.

sampsyo commented 3 years ago

Fixed in #64.