ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

[myspace] Add support for artist pages #11854

Open erikxcore opened 7 years ago

erikxcore commented 7 years ago

What is the purpose of your issue?


youtube-dl --extract-audio --audio-format mp3 -v --batch-file=batch.txt
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--extract-audio', u'--audio-format', u'mp3', u'-v', u'--batch-file=batch.txt']
[debug] Batch file urls: [u'https://myspace.com/redlightgreenlight123/music/songs/', u'https://myspace.com/loveyoumaidthebutcher/', u'https://myspace.com/knifetheglitter/', u'https://myspace.com/myonlyescapex/']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.01.25
[debug] Python version 2.7.9 - Linux-4.0.0-040000-generic-x86_64-with-Ubuntu-15.04-vivid
[debug] exe versions: avconv 11.2-6, avprobe 11.2-6, rtmpdump 2.4
[debug] Proxy map: {}
[generic] songs: Requesting header
WARNING: Falling back on generic information extractor.
[generic] songs: Downloading webpage
[generic] songs: Extracting information
ERROR: Unsupported URL: https://myspace.com/redlightgreenlight123/music/songs/
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1711, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2526, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2515, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 71, column 2
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 694, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 359, in extract
    return self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2551, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://myspace.com/redlightgreenlight123/music/songs/


Just trying to pull down some songs from local bands that no longer have their music published. This is with the 'new' MySpace, and it happens with the bare profile URL, /music, and /music/songs alike. Basically, no MySpace URL seems to work. I'm using the following URLs:
https://myspace.com/redlightgreenlight123/music/songs/
https://myspace.com/loveyoumaidthebutcher/
https://myspace.com/knifetheglitter/
https://myspace.com/myonlyescapex/

Edit: I do want to point out that when youtube-dl requests the headers, it doesn't seem to recognize that this is MySpace and instead runs the URLs through the generic extractor; I'm assuming MySpace handles its media differently from most providers, hence the issue.
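To illustrate the fallback described above: youtube-dl chooses an extractor by matching the URL against each extractor's URL pattern, and a URL that matches none of them ends up in the generic extractor. The sketch below mimics that behaviour. The two regexes, the extractor names and the `pick_extractor` helper are made up for illustration and are not youtube-dl's actual patterns or code.

```python
import re

# Hypothetical patterns for illustration only; NOT youtube-dl's real ones.
# Each pattern covers a single track or album page, not an artist page.
EXTRACTORS = [
    ("myspace:track", re.compile(
        r"https?://(?:www\.)?myspace\.com/[^/]+/music/song/[^/?#]+-(?P<id>\d+)")),
    ("myspace:album", re.compile(
        r"https?://(?:www\.)?myspace\.com/[^/]+/music/album/[^/?#]+-(?P<id>\d+)")),
]


def pick_extractor(url):
    """Return the name of the first extractor whose pattern matches the URL."""
    for name, pattern in EXTRACTORS:
        if pattern.match(url):
            return name
    # Nothing matched, so the URL is left to the catch-all extractor.
    return "generic"


for url in (
    "https://myspace.com/redlightgreenlight123/music/songs/",
    "https://myspace.com/loveyoumaidthebutcher/",
):
    print(url, "->", pick_extractor(url))
```

Under these assumed patterns both artist-page URLs fall through to "generic", which is consistent with the `[generic]` prefix in the log. Supporting artist pages would presumably mean adding a pattern, and an extractor, for profile and /music/songs URLs.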

erikxcore commented 7 years ago

Using -v --dump-pages --print-traffic, the following extra data was obtained:
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--extract-audio', u'--audio-format', u'mp3', u'-v', u'--dump-pages', u'--print-traffic', u'--batch-file=batch.txt']
[debug] Batch file urls: [u'https://www.myspace.com/redlightgreenlight123/']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.01.25
[debug] Python version 2.7.9 - Linux-4.0.0-040000-generic-x86_64-with-Ubuntu-15.04-vivid
[debug] exe versions: avconv 11.2-6, avprobe 11.2-6, rtmpdump 2.4
[debug] Proxy map: {}
[generic] redlightgreenlight123: Requesting header
send: u'HEAD /redlightgreenlight123/ HTTP/1.1\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Encoding: gzip, deflate\r\nConnection: close\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/47.0 (Chrome)\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nHost: www.myspace.com\r\n\r\n'
reply: 'HTTP/1.1 301 Moved Permanently\r\n'
header: Location: https://myspace.com/redlightgreenlight123/
header: Connection: close
header: Cache-Control: no-cache
header: Pragma: no-cache
send: u'GET /redlightgreenlight123/ HTTP/1.1\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Encoding: gzip, deflate\r\nConnection: close\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/47.0 (Chrome)\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nHost: myspace.com\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Vary: Accept-Encoding
header: Set-Cookie: persistent_id=pid%3D594c10bc-12db-4c3e-a19b-904328f9aa46%26llid%3D%26lprid%3D%26lltime%3D; domain=.myspace.com; path=/; expires=Thu, 22 Jan 2037 16:46:51 GMT; httpOnly
header: Set-Cookie: visit_id=976ef5b2-fe6c-48f1-9c90-716856c76b74; domain=.myspace.com; path=/; expires=Fri, 27 Jan 2017 17:16:51 GMT; httpOnly
header: Set-Cookie: beacons_enabled=true; domain=.myspace.com; path=/; expires=Fri, 27 Jan 2017 17:16:51 GMT
header: Set-Cookie: ads=adInitVisit%3D; domain=.myspace.com; path=/; expires=Sun, 26 Feb 2017 16:46:51 GMT
header: Set-Cookie: player=sequenceId%3D-1%26paused%3Dtrue%26currentTime%3D0%26volume%3D0.5%26mute%3Dfalse%26shuffled%3Dfalse%26repeat%3Doff%26mode%3Dqueue%26radioEntity%3D%26radioMediaType%3D%26radioMediaId%3D%26radioCurrentTime%3D0%26pinned%3Dfalse%26streamStartDateTime%3D%26radioStreamStartDateTime%3D%26at%3D360%26incognito%3Dfalse%26allowSkips%3Dtrue%26ccOn%3Dfalse; domain=.myspace.com; path=/; expires=Sun, 26 Feb 2017 16:46:51 GMT
header: X-TrackingId: e6f84445-5da5-4171-9e88-a67953bb1769
header: Cache-Control: no-cache
header: Strict-Transport-Security: max-age=31536000
header: X-Frame-Options: SAMEORIGIN
header: Content-Type: text/html; charset=utf-8
header: X-Response-Time: 149ms
header: Content-Encoding: gzip
header: Date: Fri, 27 Jan 2017 16:46:51 GMT
header: Connection: keep-alive
header: Transfer-Encoding: chunked
[redirect] Following redirect to https://myspace.com/redlightgreenlight123/
[generic] redlightgreenlight123: Requesting header
send: u'HEAD /redlightgreenlight123/ HTTP/1.1\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Encoding: gzip, deflate\r\nConnection: close\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/47.0 (Chrome)\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nHost: myspace.com\r\nCookie: player=sequenceId%3D-1%26paused%3Dtrue%26currentTime%3D0%26volume%3D0.5%26mute%3Dfalse%26shuffled%3Dfalse%26repeat%3Doff%26mode%3Dqueue%26radioEntity%3D%26radioMediaType%3D%26radioMediaId%3D%26radioCurrentTime%3D0%26pinned%3Dfalse%26streamStartDateTime%3D%26radioStreamStartDateTime%3D%26at%3D360%26incognito%3Dfalse%26allowSkips%3Dtrue%26ccOn%3Dfalse; visit_id=976ef5b2-fe6c-48f1-9c90-716856c76b74; persistent_id=pid%3D594c10bc-12db-4c3e-a19b-904328f9aa46%26llid%3D%26lprid%3D%26lltime%3D; ads=adInitVisit%3D; beacons_enabled=true\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Vary: Accept-Encoding
header: Set-Cookie: visit_id=976ef5b2-fe6c-48f1-9c90-716856c76b74; domain=.myspace.com; path=/; expires=Fri, 27 Jan 2017 17:16:52 GMT; httpOnly
header: Set-Cookie: ads=adInitVisit%3D; domain=.myspace.com; path=/; expires=Sun, 26 Feb 2017 16:46:52 GMT
header: Set-Cookie: player=sequenceId%3D-1%26paused%3Dtrue%26currentTime%3D0%26volume%3D0.5%26mute%3Dfalse%26shuffled%3Dfalse%26repeat%3Doff%26mode%3Dqueue%26radioEntity%3D%26radioMediaType%3D%26radioMediaId%3D%26radioCurrentTime%3D0%26pinned%3Dfalse%26streamStartDateTime%3D%26radioStreamStartDateTime%3D%26at%3D360%26incognito%3Dfalse%26allowSkips%3Dtrue%26ccOn%3Dfalse; domain=.myspace.com; path=/; expires=Sun, 26 Feb 2017 16:46:52 GMT
header: X-TrackingId: 9994e58a-66af-4f70-9f24-2b04e91d6d3e
header: Cache-Control: no-cache
header: Strict-Transport-Security: max-age=31536000
header: X-Frame-Options: SAMEORIGIN
header: Content-Type: text/html; charset=utf-8
header: Content-Length: 132896
header: X-Response-Time: 182ms
header: Date: Fri, 27 Jan 2017 16:46:52 GMT
header: Connection: keep-alive
WARNING: Falling back on generic information extractor.
[generic] redlightgreenlight123: Downloading webpage
send: u'GET /redlightgreenlight123/ HTTP/1.1\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Encoding: *\r\nConnection: close\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/47.0 (Chrome)\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nHost: myspace.com\r\nCookie: player=sequenceId%3D-1%26paused%3Dtrue%26currentTime%3D0%26volume%3D0.5%26mute%3Dfalse%26shuffled%3Dfalse%26repeat%3Doff%26mode%3Dqueue%26radioEntity%3D%26radioMediaType%3D%26radioMediaId%3D%26radioCurrentTime%3D0%26pinned%3Dfalse%26streamStartDateTime%3D%26radioStreamStartDateTime%3D%26at%3D360%26incognito%3Dfalse%26allowSkips%3Dtrue%26ccOn%3Dfalse; visit_id=976ef5b2-fe6c-48f1-9c90-716856c76b74; persistent_id=pid%3D594c10bc-12db-4c3e-a19b-904328f9aa46%26llid%3D%26lprid%3D%26lltime%3D; ads=adInitVisit%3D; beacons_enabled=true\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Vary: Accept-Encoding
header: Set-Cookie: visit_id=976ef5b2-fe6c-48f1-9c90-716856c76b74; domain=.myspace.com; path=/; expires=Fri, 27 Jan 2017 17:16:52 GMT; httpOnly
header: Set-Cookie: ads=adInitVisit%3D; domain=.myspace.com; path=/; expires=Sun, 26 Feb 2017 16:46:52 GMT
header: Set-Cookie: player=sequenceId%3D-1%26paused%3Dtrue%26currentTime%3D0%26volume%3D0.5%26mute%3Dfalse%26shuffled%3Dfalse%26repeat%3Doff%26mode%3Dqueue%26radioEntity%3D%26radioMediaType%3D%26radioMediaId%3D%26radioCurrentTime%3D0%26pinned%3Dfalse%26streamStartDateTime%3D%26radioStreamStartDateTime%3D%26at%3D360%26incognito%3Dfalse%26allowSkips%3Dtrue%26ccOn%3Dfalse; domain=.myspace.com; path=/; expires=Sun, 26 Feb 2017 16:46:52 GMT
header: X-TrackingId: daa2d932-bc55-4fa0-add0-81bedda6a3aa
header: Cache-Control: no-cache
header: Strict-Transport-Security: max-age=31536000
header: X-Frame-Options: SAMEORIGIN
header: Content-Type: text/html; charset=utf-8
header: X-Response-Time: 121ms
header: Content-Encoding: gzip
header: Date: Fri, 27 Jan 2017 16:46:52 GMT
header: Connection: keep-alive
header: Transfer-Encoding: chunked
[generic] Dumping request to https://myspace.com/redlightgreenlight123/
PCFET0NUWVBFI...48L3NjcmlwdD4KICAgIAogICAgPG5vc2NyaXB0PgogICAgICAgIDxpbWcgc3JjPSJodHRwczovL3NiLnNjb3JlY2FyZHJlc2VhcmNoLmNvbS9wP2MxPTImYzI9NDAwMDAwMiZjdj0yLjAmY2o9MSIgLz4KICAgIDwvbm9zY3JpcHQ+CiAgICAKICAgIAo8L2JvZHk+CjwvaHRtbD4K
(this part was gigantic, guessing some hashed information or something along those lines)
[generic] redlightgreenlight123: Extracting information
ERROR: Unsupported URL: https://myspace.com/redlightgreenlight123/
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1711, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2526, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2515, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 69, column 2
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 694, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 359, in extract
    return self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2551, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://myspace.com/redlightgreenlight123/
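A note on the "gigantic" block printed after "Dumping request to": it looks like the downloaded page itself, base64-encoded, which is how --dump-pages is documented to print pages, rather than a hash. A minimal sketch that decodes only the first and last twelve characters quoted in the log (the elided middle is not reconstructed):

```python
import base64

# First and last 12 characters of the dump exactly as quoted in the log;
# the poster elided everything in between, so only these ends are decoded.
head = "PCFET0NUWVBF"
tail = "CjwvaHRtbD4K"

decoded_head = base64.b64decode(head)
decoded_tail = base64.b64decode(tail)

print(decoded_head)  # the start of an HTML document
print(decoded_tail)  # the closing html tag
```

So the dump is simply the HTML of the artist page, which could be decoded in full and inspected to see where the track information lives.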

erikxcore commented 7 years ago

Just updated (2017.01.29), same issue:
youtube-dl --extract-audio --audio-format mp3 -v --batch-file=batch.txt
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--extract-audio', u'--audio-format', u'mp3', u'-v', u'--batch-file=batch.txt']
[debug] Batch file urls: [u'https://www.myspace.com/redlightgreenlight123/', u'https://myspace.com/loveyoumaidthebutcher/', u'https://myspace.com/knifetheglitter/', u'https://myspace.com/myonlyescapex/']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.01.29
[debug] Python version 2.7.9 - Linux-4.0.0-040000-generic-x86_64-with-Ubuntu-15.04-vivid
[debug] exe versions: avconv 11.2-6, avprobe 11.2-6, rtmpdump 2.4
[debug] Proxy map: {}
[generic] redlightgreenlight123: Requesting header
[redirect] Following redirect to https://myspace.com/redlightgreenlight123/
[generic] redlightgreenlight123: Requesting header
WARNING: Falling back on generic information extractor.
[generic] redlightgreenlight123: Downloading webpage
[generic] redlightgreenlight123: Extracting information
ERROR: Unsupported URL: https://myspace.com/redlightgreenlight123/
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1711, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2526, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2515, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 69, column 2
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 694, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 369, in extract
    return self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2551, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://myspace.com/redlightgreenlight123/
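On the first traceback in these logs: going by the frames shown, the generic extractor appears to try parsing the page as XML, and the ParseError only means the page is ordinary HTML rather than well-formed XML. The UnsupportedError that follows is the actual failure. A small sketch reproducing the same kind of error with the standard library; the HTML fragment is made up and is not taken from the MySpace page:

```python
import xml.etree.ElementTree as ET

# Made-up fragment: <meta> is never closed, which browsers accept in HTML
# but which a strict XML parser rejects once it reaches </head>.
html = "<html><head><meta charset='utf-8'></head><body></body></html>"

try:
    ET.fromstring(html)
    parse_error = None
except ET.ParseError as err:
    parse_error = err

print(parse_error)
```

The message printed here starts with "mismatched tag", the same wording as in the log, so that part of the output is probably not specific to MySpace.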

yan12125 commented 7 years ago

Related: #9295