Closed x1ao4 closed 7 months ago
Regarding the modification to the quote, I think we'd need something more robust than that. It's unlikely that Chinese characters are the only ones with issues.
This is the function we're currently using. https://github.com/squaresmile/Plex-Plug-Ins/blob/fc4ab34d4cb995668abd84b304b57c5bf13cb69d/Framework.bundle/Contents/Resources/Versions/2/Python/Framework/api/utilkit.py#L229
If we're going to move to a non-framework method, we should probably use urllib3 instead of urllib.
Regarding why it doesn't work after you adjust the quoting, the reason is the results are returning in English, but your query is in Chinese.
for result in tmdb_data['results']:
    if result['name'].lower() == search_query.lower() or \
            '{} {}'.format(search_query.lower(), end_string).lower() == result['name'].lower():
        collection_id = int(result['id'])
Your result['name'].lower() is james bond collection, but your search_query is 詹姆斯·邦德 (not sure what the lowercase version of that is, or if you even have lowercase letters in Chinese). So ultimately, the collection_id is not being set.
@zdimension didn't you confirm that collections are working for you in French? Is Plex returning the TMDB results in French for you in this case?
we should probably use urllib3 instead of urllib
Does Plex's Python version support urllib3?
I tried adding a language parameter to the query_url, such as &language=zh-CN, but the TMDB data returned is still in English, which confuses me. If we can set the search language, we should be able to retrieve the language setting from Plex's library and then add it to the query_url.
Does Plex's Python version support urllib3?
We can add it as a dependency. It's already a sub-dependency of requests, so it's technically already included in the plugin.
I tried adding a language parameter to the query_url
This would be nice. I don't know if the service Plex provides for the TMDB lookup supports a language query. As far as I know, it's completely undocumented.
I have a note in the code:
# /search/collection?query=James%20Bond%20Collection&include_adult=false&language=en-US&page=1"
This is from the official TMDB API. I don't know whether the Plex service passes every parameter through or sanitizes the query somehow. http://127.0.0.1:32400/services/tmdb?uri= is the base URL.
TMDB language reference: https://developer.themoviedb.org/docs/languages
Plex staff mentioned that it's not possible to add a language option for query_url = 'search/collection?query={}', which means it's not possible to retrieve collection information in languages other than English through Plex's built-in API. If that's the case, it seems that non-English users will have to use their own TMDB API to obtain collection information in their preferred language.
Otherwise, the only option is to use the ID of the collection information that ranks first in the returned results as the collection ID. This way, language issues can be ignored, without using precise matching, but it may lead to matching errors in certain cases.
Unless I'm missing something, I think the service just passes through the entire query.
It would seem odd that Plex accepts language for this code, but parses it out for collections.
And with that, I got it to work.
You have to URL encode the & in the TMDB query, otherwise Plex receives that as part of its query.
http://127.0.0.1:32400/services/tmdb?uri=/search/collection?query=%E8%A9%B9%E5%A7%86%E6%96%AF%C2%B7%E9%82%A6%E5%BE%B7%EF%BC%88%E7%B3%BB%E5%88%97%EF%BC%89%26language=zh
or locally in a browser
http://127.0.0.1:32400/services/tmdb?X-Plex-Token=<your_token>&uri=/search/collection?query=%E8%A9%B9%E5%A7%86%E6%96%AF%C2%B7%E9%82%A6%E5%BE%B7%EF%BC%88%E7%B3%BB%E5%88%97%EF%BC%89%26language=zh
should both work. I tested in my browser and get the following:
{
  "page": 1,
  "results": [
    {
      "adult": false,
      "backdrop_path": "/dOSECZImeyZldoq0ObieBE0lwie.jpg",
      "id": 645,
      "name": "詹姆斯·邦德(系列)",
      "original_language": "en",
      "original_name": "James Bond Collection",
      "overview": "007是风靡全球的一系列谍战电影,007不仅是影片的名称,更是主人公特工詹姆斯·邦德的代号。詹姆斯·邦德(英语:James Bond)是一套小说和系列电影的主角名称。小说原作者是英国作家伊恩·佛莱明。在故事里,邦德是英国情报机构军情六处的间谍,代号007,被授予可以除去任何妨碍行动的人的权力,此外,詹姆斯·邦德总是有美女相伴,那些女士被称为\"邦女郎\"。 他冷酷但多情,机智且勇敢,总能在最危难时化险为夷,也总能邂逅一段浪漫的爱情。历任007都是大帅哥,再加上性感漂亮的邦女郎,以及扣人心弦的精彩剧情,让这部影片直至今天仍被广大影迷所热爱。 第一部007电影于1962年10月5日公映后,007电影系列风靡全球,到今天历经五十余年长盛不衰。",
      "poster_path": "/oKDxj9E15x3DjSjl4TnSWVUaVSw.jpg"
    }
  ],
  "total_pages": 1,
  "total_results": 1
}
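As a rough sketch of the encoding described above (not the plugin's actual code), the request URL can be assembled by percent-encoding the title as UTF-8 and writing the & before language as %26, so that it stays inside the uri value forwarded to TMDB. The name build_collection_search_url is hypothetical; the base URL and the /search/collection path are the ones quoted in this thread. The import fallback is only there because quote lives in urllib on Python 2 and in urllib.parse on Python 3.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

PLEX_TMDB_BASE = 'http://127.0.0.1:32400/services/tmdb?uri='


def build_collection_search_url(title, language):
    """Hypothetical helper: build a Plex TMDB-service URL for a collection search."""
    # Percent-encode the UTF-8 bytes of the title; safe='' also encodes '/'.
    encoded_title = quote(title.encode('utf-8'), safe='')
    # '%26' is an encoded '&', so it remains part of the uri value sent to TMDB.
    return '{}/search/collection?query={}%26language={}'.format(
        PLEX_TMDB_BASE, encoded_title, quote(language, safe=''))


print(build_collection_search_url(u'詹姆斯·邦德(系列)', 'zh'))
```

Whether the service forwards other TMDB parameters the same way is not confirmed here; only the language case was tested in this thread.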
Finally, I'd probably try to avoid talking about this plugin on the Plex forum. For sure we're doing stuff that they won't appreciate, by hacking their TMDB service among other things. They aren't too friendly to third-party plugins these days.
You have to URL encode the & in the TMDB query, otherwise Plex receives that as part of its query.
I changed the query_url in tmdb_helper.py to search/collection?query={}%26language=zh, and it indeed returned Chinese data. However, the Chinese characters appear in the log as Unicode escape sequences (\uXXXX). I suspect this format might still prevent the retrieval of IDs, and it may be necessary to convert the retrieved data to UTF-8 encoding to match non-English text.
2024-03-17 23:26:28,902 (700004eca000) : DEBUG (tmdb_helper:123) - TMDB data: {'total_results': 1, 'total_pages': 1, 'page': 1, 'results': [{'poster_path': '/d83LVydlQonKdshwQyLYx48D3LH.jpg', 'name': u'\u7231\u5ba0\u5927\u673a\u5bc6\uff08\u7cfb\u5217\uff09', 'overview': u'\u8bb2\u8ff0\u4e86\u5728\u7ebd\u7ea6\u4e00\u5e62\u70ed\u95f9\u7684\u516c\u5bd3\u5927\u697c\u91cc\uff0c\u6709\u4e00\u7fa4\u5ba0\u7269\uff0c\u6bcf\u5929\u4e3b\u4eba\u51fa\u95e8\u540e\u3001\u56de\u5bb6\u524d\u8fd9\u91cc\u5c31\u53d8\u6210\u4e86\u5b83\u4eec\u7684\u4e50\u56ed\uff1a\u6709\u7684\u548c\u5176\u4ed6\u5ba0\u7269\u4e00\u8d77\u51fa\u53bb\u73a9\uff1b\u6709\u7684\u805a\u5728\u4e00\u8d77\u4ea4\u6d41\u4e3b\u4eba\u7684\u7cd7\u4e8b\uff1b\u8fd8\u6709\u7684\u5728\u4e0d\u505c\u636f\u996c\u81ea\u5df1\u7684\u5916\u8c8c\uff0c\u4f7f\u81ea\u5df1\u770b\u4e0a\u53bb\u66f4\u53ef\u7231\u4ee5\u4fbf\u4ece\u4e3b\u4eba\u90a3\u91cc\u8981\u6765\u66f4\u591a\u7684\u96f6\u98df\u2026\u2026\u603b\u4e4b\uff0c\u5ba0\u7269\u4eec\u6bcf\u5929\u7684\u201c\u671d\u4e5d\u665a\u4e94\u201d\u662f\u4ed6\u4eec\u4e00\u5929\u4e2d\u6700\u81ea\u7531\u3001\u6700\u60ec\u610f\u7684\u65f6\u5149\u3002 \u3000\u3000\u5728\u8fd9\u7fa4\u5ba0\u7269\u4e2d\uff0c\u6709\u4e00\u53ea\u5c0f\u730e\u72ac\u662f\u5f53\u4ec1\u4e0d\u8ba9\u7684\u9886\u8896\uff0c\u4ed6\u53eb\u9ea6\u514b\u65af\uff08Max\uff09\uff0c\u673a\u667a\u53ef\u7231\uff0c\u81ea\u8ba4\u4e3a\u662f\u5973\u4e3b\u4eba\u751f\u6d3b\u7684\u4e2d\u5fc3\u2014\u2014\u76f4\u5230\u5979\u4ece\u5916\u5e26\u56de\u5bb6\u4e00\u53ea\u61d2\u6563\u3001\u6ca1\u6709\u5bb6\u6559\u7684\u6742\u79cd\u72d7\u201c\u516c\u7235\u201d\uff08Duke\uff09\u3002 
\u3000\u3000\u9ea6\u514b\u65af\u548c\u516c\u7235\u4eba\u751f\u89c2\u4ef7\u503c\u89c2\u90fd\u4e0d\u4e00\u6837\uff0c\u81ea\u7136\u5f88\u96be\u548c\u5e73\u5171\u5904\u3002\u4f46\u5f53\u5b83\u4eec\u4e00\u8d77\u6d41\u843d\u7ebd\u7ea6\u8857\u5934\u540e\uff0c\u4e24\u4eba\u53c8\u5fc5\u987b\u629b\u5f03\u5206\u6b67\u3001\u5171\u540c\u963b\u6b62\u4e00\u53ea\u88ab\u4e3b\u4eba\u629b\u5f03\u7684\u5ba0\u7269\u5154\u201c\u96ea\u7403\u201d\uff08Snowball\uff09\u2014\u2014\u540e\u8005\u4e3a\u4e86\u62a5\u590d\u4eba\u7c7b\uff0c\u51c6\u5907\u7ec4\u7ec7\u4e00\u652f\u906d\u5f03\u5ba0\u7269\u5927\u519b\u5728\u665a\u996d\u524d\u5411\u4eba\u7c7b\u53d1\u8d77\u603b\u653b\u2026\u2026', 'original_name': 'The Secret Life of Pets Collection', 'backdrop_path': '/fAibj0DIT8gk5jQtsEor66QKCsR.jpg', 'adult': False, 'id': 427084, 'original_language': 'en'}]}
I'd probably try to avoid talking about this plugin on the Plex forum
Understood, I got it.
Could you try this build? https://github.com/LizardByte/Themerr-plex/actions/runs/8316419576?pr=395
So far, I didn't do anything special to handle the unicode... but I suspect the framework may handle that automatically.
I tested it, and the returned data is still in Unicode encoding. It seems like I didn't retrieve the theme song for the collection. If it were successful, what message should appear in the log?
2024-03-18 00:19:10,080 (700010ffe000) : DEBUG (tmdb_helper:117) - TMDB data: {'total_results': 1, 'total_pages': 1, 'page': 1, 'results': [{'poster_path': '/r6ujhctKtNVfxdj8DNs0gDdMkjN.jpg', 'name': u'\u8d85\u51e1\u8718\u86db\u4fa0\uff08\u7cfb\u5217\uff09', 'overview': u'\u300a\u8d85\u51e1\u8718\u86db\u4fa0\u300b\uff08\u7cfb\u5217\uff09\u6539\u7f16\u81ea\u6f2b\u5a01\u8d85\u7ea7\u82f1\u96c4\u6f2b\u753b\uff0c\u7531\u9a6c\u514b\xb7\u97e6\u5e03\u6267\u5bfc\uff0c\u5b89\u5fb7\u9c81\xb7\u52a0\u83f2\u5c14\u5fb7\uff0c\u827e\u739b\xb7\u65af\u901a\uff0c\u745e\u65af\xb7\u4f0a\u51e1\u65af\uff0c\u9a6c\u4e01\xb7\u8f9b\uff0c\u838e\u8389\xb7\u83f2\u5c14\u5fb7\u7b49\u4e3b\u6f14\u3002\u300a\u8d85\u51e1\u8718\u86db\u4fa0\u300b\uff08\u7cfb\u5217\uff09\u4e0d\u540c\u4e8e\u6b64\u524d\u5c71\u59c6\xb7\u96f7\u7c73\u6267\u5bfc\u7684\u300a\u8718\u86db\u4fa0\u300b\u4e09\u90e8\u66f2\uff0c\u6b64\u90e8\u5c06\u89c6\u89d2\u62c9\u56de\u5230\u5f7c\u5f97\xb7\u5e15\u514b\u7684\u9ad8\u4e2d\u65f6\u4ee3\uff0c\u5e74\u8f7b\u7684\u4ed6\u4e00\u65b9\u9762\u8981\u540c\u81ea\u5df1\u7684\u521d\u604b\u683c\u6e29\u5171\u540c\u7ecf\u5386\u7231\u60c5\u627f\u8bfa\u7684\u8003\u9a8c\uff0c\u53e6\u4e00\u65b9\u9762\u8fd8\u8981\u63ed\u5f00\u53cc\u4eb2\u795e\u79d8\u5931\u8e2a\u7684\u771f\u76f8\uff0c\u5728\u4eba\u751f\u6700\u5927\u7684\u6311\u6218\u4e2d\u5b8c\u6210\u4ece\u5e38\u4eba\u5230\u82f1\u96c4\u7684\u547d\u8fd0\u8f6c\u53d8\u3002', 'original_name': 'The Amazing Spider-Man Collection', 'backdrop_path': '/yFGBYtzbvSKKI5qSvyUBWeq1uiJ.jpg', 'adult': False, 'id': 125574, 'original_language': 'en'}]}
Okay, I think we just need to modify this part of the code now.
if result['name'].lower() == search_query.lower() or \
        '{} {}'.format(search_query.lower(), end_string).lower() == result['name'].lower():
    collection_id = int(result['id'])
I don't know how to get them to match though.
I added some code to print the comparison content between the search query and the returned collection name on the console. The code is as follows:
for result in tmdb_data['results']:
    comparison1 = result['name'].lower()
    comparison2 = '{} {}'.format(search_query.lower(), end_string).lower()
    Log.Debug('Comparing: {} and {}'.format(comparison1, comparison2))  # added log output
    if comparison1 == comparison2:
        collection_id = int(result['id'])
I found that one of the compared values includes the language suffix and the word collection, which caused the matching to fail. The log is as follows:
2024-03-18 01:53:55,667 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 安娜贝尔(系列) and 安娜贝尔(系列)&language=zh-cn collection
2024-03-18 01:53:55,681 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 宝贝老板(系列) and 宝贝老板(系列)&language=zh-cn collection
2024-03-18 01:53:55,693 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 比得兔(系列) and 比得兔(系列)&language=zh-cn collection
2024-03-18 01:53:55,706 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠:黑暗骑士(系列) and 蝙蝠侠:黑暗骑士(系列)&language=zh-cn collection
2024-03-18 01:53:55,706 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠:黑暗骑士归来(系列) and 蝙蝠侠:黑暗骑士(系列)&language=zh-cn collection
2024-03-18 01:53:55,718 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,718 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 新蝙蝠侠(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,718 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 未来蝙蝠侠(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,718 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠之子(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,718 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠无极限(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,718 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠:黑暗骑士(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,719 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠动画宇宙(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,719 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠:漫长的万圣节(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,719 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 超人与蝙蝠侠动画(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,719 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠:黑暗骑士归来(系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,719 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 蝙蝠侠(亚当·韦斯特动画系列) and 蝙蝠侠(系列)&language=zh-cn collection
2024-03-18 01:53:55,731 (700006dd8000) : DEBUG (tmdb_helper:124) - Comparing: 边境杀手(系列) and 边境杀手(系列)&language=zh-cn collection
I'm not sure at which step the &language=zh-cn collection part was added. It needs to be removed from the comparison value before comparing; then the matching should succeed.
I'm not sure why you defined end_string and added it to the comparison. It seems like we don't need end_string.
end_string = 'Collection' # collection names on themoviedb end with 'Collection'
After removing end_string, only &language=zh-cn remains, which is included in the search_query. It needs to be removed from the search_query before comparison.
You might be adding "Collection" to Plex collection titles that don't already contain it to match the collection titles on TMDB. However, besides English, I'm not sure if other languages also use the "Collection" suffix. In my case, my collection titles already include the Chinese version of "Collection" (系列), so there's no need to add "Collection" as an end_string. You may need to consider the situation in other languages.
>>> get_tmdb_id_from_collection(search_query='James Bond')
645
I'm not sure why you defined end_string and added it to the comparison. It seems like we don't need end_string.
It's because in English the collections end with that on TMDB. They may or may not do that on Plex, depending on the agent that is used. The new movie agent will just use "James Bond" for example, but the legacy agents will use "James Bond Collection".
They need to be removed from the comparison value before being compared
Good catch. I made a small adjustment to strip the query back to the search term, and use that for the comparison.
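A minimal sketch of that comparison, assuming the bare search term is kept separate from the URL-encoded query so that a suffix such as &language=zh-cn never reaches the match. match_collection_id is a hypothetical name, not the function in tmdb_helper.py; end_string follows the snippet quoted earlier in the thread.

```python
def match_collection_id(results, search_term, end_string='Collection'):
    """Hypothetical helper: return the TMDB id whose name matches the term."""
    # Accept the title as-is, or with the English suffix appended.
    candidates = {
        search_term.lower(),
        '{} {}'.format(search_term, end_string).lower(),
    }
    for result in results:
        if result['name'].lower() in candidates:
            return int(result['id'])
    return None  # no exact match found


english = [{'name': 'James Bond Collection', 'id': 645}]
chinese = [{'name': u'詹姆斯·邦德(系列)', 'id': 645}]
print(match_collection_id(english, 'James Bond'))
print(match_collection_id(chinese, u'詹姆斯·邦德(系列)'))
```

Titles that already carry a localized suffix, such as 系列, match through the first candidate, so appending the English end_string as a second candidate does no harm there.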
Describe the Bug
When using the original version of tmdb_helper.py, collections with Chinese titles cannot retrieve the corresponding TMDB data. Here's an example of the log:
After testing, I found that the issue lies with URL encoding. Since collections are searched and matched on TMDB using their titles, the original script uses the String.Quote function, which utilizes Unicode escape sequence URL encoding. For example, "黑夜传说(系列)" would be encoded as:
I made some modifications to the script. Now, when the title is in Chinese, it uses the urllib.quote function with UTF-8 encoding for URL encoding. For titles in other languages, the script still uses the String.Quote function. For example, "黑夜传说(系列)" would now be encoded as:
The modified tmdb_helper.py is as follows:
After the modification, TMDB data for collections with Chinese titles can now be retrieved. For example:
However, collections with Chinese titles still cannot fetch theme songs, while collections with English titles can. For example, I have one collection titled "James Bond Collection" and another titled "詹姆斯·邦德(系列)" in two separate libraries. The collection with the English title successfully fetched the theme song, but the one with the Chinese title did not.
I'm not sure if there are notifications in the logs for fetching theme songs for collections because I haven't seen any "data found for collection" notifications while monitoring the logs. It might be challenging to filter logs based on this. However, in the WebUI, I noticed that only collections in the English library fetched theme songs, while collections in the Chinese library did not. I hope we can find the reason for this discrepancy.
Furthermore, even though we've retrieved data from TMDB, the collection IDs in the logs still show as None. Is this normal? Also, in the WebUI, it displays as "No known ID," despite some collections having successfully matched data from TMDB.
I'm puzzled as well. Both collections with Chinese and English titles have successfully matched with TMDB. So, it's unclear why only collections with English titles are fetching theme songs. There might be a specific issue or limitation in the retrieval process that's causing this discrepancy. It could be worth investigating further to understand the root cause.
How does Themerr search and match collections in ThemerrDB? Is the issue possibly occurring here?
Expected Behavior
No response
Additional Context
No response