Open grqz opened 1 month ago
Unfortunately, the http://antiserver.kuwo.cn/anti.s API can no longer extract the lossless format.
By capturing packets, I found an API that takes no params related to the music's format; it just returns the 128kbps MP3 and requires a signature.
The site says that the lossless format is only playable in their client. I don't plan to capture packets from the client.
After some testing, it turns out to be possible to bypass the geo-block: even though the webpage returns HTTP/1.1 500 OK, it still sends the Set-Cookie header, which is enough.
None of the APIs are geo-blocked; it's just that we can't extract metadata from the webpage.
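As a rough illustration of why the Set-Cookie header alone is enough, the sketch below pulls the cookie value out of a raw Set-Cookie header string using only the standard library. The helper name `cookie_from_set_cookie` is hypothetical and not part of yt-dlp; how the header is obtained from the failing response is left out.

```python
from http.cookies import SimpleCookie
from typing import Optional

# Cookie name quoted from this issue.
COOKIE_NAME = 'Hm_Iuvt_cdb524f42f23cer9b268564v7y735ewrq2324'


def cookie_from_set_cookie(header_value: str, name: str = COOKIE_NAME) -> Optional[str]:
    """Return the value of `name` from a Set-Cookie header, or None if absent.

    Hypothetical helper for illustration only.
    """
    jar = SimpleCookie()
    jar.load(header_value)  # parses "name=value; Path=/; ..." style strings
    morsel = jar.get(name)
    return morsel.value if morsel is not None else None
```

The HTTP status of the response does not matter here: as long as the header is present, the value can be read and reused for the API requests.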
EDIT: metadata extraction from the webpage may be removed. In that case the Secret header calculation is unavoidable, since the metadata API needs it.
After some work:
explanation:
]+id="lrcName">([^<]+)', webpage, 'song name')
- singer_name = remove_start(self._html_search_regex(
-     r']+href="http://www\.kuwo\.cn/artist/content\?name=([^"]+)">',
-     webpage, 'singer name', fatal=False), '歌手')
- lrc_content = clean_html(get_element_by_id('lrcContent', webpage))
- if lrc_content == '暂无':  # indicates no lyrics
-     lrc_content = None
-
- formats = self._get_formats(song_id)
-
- album_id = self._html_search_regex(
-     r']+href="http://www\.kuwo\.cn/album/(\d+)/"',
-     webpage, 'album id', fatal=False)
- publish_time = None
- if album_id is not None:
-     album_info_page = self._download_webpage(
-         f'http://www.kuwo.cn/album/{album_id}/', song_id,
-         note='Download album detail info',
-         errnote='Unable to get album detail info')
+ _ = self._download_webpage(url, song_id, headers=headers, fatal=False)  # get cookies
+ if not self._get_cookies('http://www.kuwo.cn/play_detail/').get('Hm_Iuvt_cdb524f42f23cer9b268564v7y735ewrq2324'):
+     raise ExtractorError('Failed to get cookies from the webpage!', video_id=song_id)
+ subtitles = self._get_subtitles(song_id)
+ metadata = self._get_metadata(song_id)
+ # comments = self._get_comments(song_id)
+ # if metadata.get('msg') != 'success' and webpage:
+ #     self.report_warning('metadata API failed, falling back to webpage', song_id)
+ #     # window.__NUXT__.data[0].songinfo
+ #     self._search_nextjs_data()
+ #     self._search_json(
+ #         r'')
+ #     song_name = self._html_search_regex(
+ #         r''',\s*?name\s*?:\s*?(['"])(?P
I haven't tested whether several other IEs are working. I'll probably open a PR later.
Region
China
Provide a description that is worded well enough to be understood
Example URL
http://kuwo.cn/play_detail/28115171
Description
The current code is 8 years old and needs an update.
I'm not in China, but I have a VPN for debugging. I have no Kuwo account, so please ask someone else if one is needed.
The verbose output below from the latest nightly shows that:
Related info
Related to PR #7470
EDIT: there may be an easier way to extract than changing the API; the current code fails on metadata extraction.
http://antiserver.kuwo.cn/anti.s is still usable with some param changes.
Since the URL path has changed, _VALID_URL should be updated.
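For illustration, a pattern matching the new path could look like the sketch below. The regex is an assumption based only on the example URL in this issue (http://kuwo.cn/play_detail/28115171), not the pattern from any actual patch, and `extract_song_id` is a hypothetical helper.

```python
import re

# Assumed pattern, derived from the example URL in this issue.
VALID_URL = r'https?://(?:www\.)?kuwo\.cn/play_detail/(?P<id>\d+)'


def extract_song_id(url):
    """Return the numeric song id as a string, or None if the URL does not match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None
```

URLs using any other path simply return None here, so they would need their own handling if they are still in use.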
playurl API URL: http://www.kuwo.cn/api/v1/www/music/playUrl
query: {'mid': song_id}
Not sure how to extract a specific format; the API just gives 128k.
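A minimal sketch of how the request URL might be built from the endpoint and query above. `build_play_url` is a hypothetical helper name; only the endpoint and the `mid` parameter come from this issue, and the required headers are not handled here.

```python
from urllib.parse import urlencode

# Endpoint quoted from this issue.
PLAY_URL_API = 'http://www.kuwo.cn/api/v1/www/music/playUrl'


def build_play_url(song_id):
    """Build the playUrl request URL for a song id (hypothetical helper)."""
    query = urlencode({'mid': song_id})
    return PLAY_URL_API + '?' + query
```

No format-related parameter is included because, as noted, none is known to exist for this endpoint.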
required headers in API request:
Hm_Iuvt_cdb524f42f23cer9b268564v7y735ewrq2324 in Cookies
Secret
The Secret header seems to be related to the Hm_Iuvt_cdb524f42f23cer9b268564v7y735ewrq2324 value, but it has nothing to do with the query param (mid). I haven't tested whether they expire yet.
EDIT: the Secret header can be obtained by the func below with: f(<unescaped Hm_Iuvt_cdb524f42f23cer9b268564v7y735ewrq2324 value>, "Hm_Iuvt_cdb524f42f23cer9b268564v7y735ewrq2324")
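To make the call shape concrete, here is a sketch of how the two headers might be assembled. It does not implement the Secret calculation itself: `compute_secret` is a hypothetical parameter standing in for f. It also assumes that "unescaped" means URL-unquoting and that the cookie is sent back in its raw form; both are guesses.

```python
from typing import Callable, Dict
from urllib.parse import unquote

# Cookie name quoted from this issue.
COOKIE_NAME = 'Hm_Iuvt_cdb524f42f23cer9b268564v7y735ewrq2324'


def build_api_headers(raw_cookie_value: str,
                      compute_secret: Callable[[str, str], str]) -> Dict[str, str]:
    """Assemble the Cookie and Secret headers for an API request.

    `compute_secret` is a placeholder for f(value, name); its algorithm
    is not implemented here.
    """
    unescaped = unquote(raw_cookie_value)  # assumption: "unescaped" == URL-unquoted
    return {
        # assumption: the cookie is sent back exactly as it was received
        'Cookie': f'{COOKIE_NAME}={raw_cookie_value}',
        'Secret': compute_secret(unescaped, COOKIE_NAME),
    }
```

Passing the function in as an argument keeps the sketch independent of however the Secret value is actually computed.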
python algorithm:
Provide verbose output that clearly demonstrates the problem
Complete Verbose Output