Closed marc-weber1 closed 1 month ago
I can repro. Seems like the decrepit extractor has finally been fully broken.
Started running into this issue today as well, using version 2024.9.27.0.
In an effort to download a reel manually (https://www.instagram.com/foocey/reel/DAaER-1Oriq/) I dug into the HTTP requests on the page and found this:
A GET request to https://www.instagram.com/api/v1/media/3466101691097200810/info/
returned a JSON response (irrelevant fields omitted):
{
items: [
code: "DAaER-1Oriq",
pk: "3466101691097200810",
video_dash_manifest: "<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd" profiles="urn:mpeg:dash:profile:isoff-on-demand:2011" minBufferTime="PT2S" type="static" mediaPresentationDuration="PT9.473742S" FBManifestIdentifier="FgAYEGlnX2Rhc2hfYmFzZWxpbmUZNs64peCcmqQDxviG3se/pQT09Oy6m9KvBCIYGGRhc2hfbG5faGVhYWNfdmJyM19hdWRpbwA="><Period id="0" duration="PT9.473742S"><AdaptationSet id="0" contentType="video" frameRate="15360/512" subsegmentAlignment="true" par="9:16" FBUnifiedUploadResolutionMos="360:75.5"><SupplementalProperty schemeIdUri="urn:mpeg:mpegB:cicp:TransferCharacteristics" value="6"/><Representation id="924040302997031vd" bandwidth="578362" codecs="avc1.64001f" mimeType="video/mp4" sar="1:1" FBEncodingTag="dash_baseline_1_v1" FBContentLength="684396" FBPlaybackResolutionMos="0:100,360:94.7,480:91.2,720:86.6,1080:81.9" FBPlaybackResolutionMosConfidenceLevel="high" FBPlaybackResolutionCsvqm="0:100,360:98.19,480:96.8,720:95.4,1080:93.9" FBAbrPolicyTags="" width="720" height="1280" FBDefaultQuality="1" FBQualityClass="hd" FBQualityLabel="720p"><BaseURL>https://scontent-atl3-2.cdninstagram.com/o1/v/t16/f1/m86/83498DA848AB9E46281A9A432E450DA8_video_dashinit.mp4?efg=eyJ2aWRlb19pZCI6bnVsbCwidmVuY29kZV90YWciOiJpZy14cHZkcy5jbGlwcy5jMi1DMy5kYXNoX2Jhc2VsaW5lXzFfdjEifQ&_nc_ht=scontent-atl3-2.cdninstagram.com&_nc_cat=105&ccb=9-4&oh=00_AYB3GPMDTmF7vd_5mr5lbWMxlULgw_hfc1kCVGW7hpTjLw&oe=6700A82D&_nc_sid=f1f4f2</BaseURL><SegmentBase indexRange="892-947" timescale="15360" FBMinimumPrefetchRange="948-32377" FBFirstSegmentRange="948-431024" FBFirstSegmentDuration="5000" FBSecondSegmentRange="431025-684395" FBPrefetchSegmentRange="948-431024" FBPrefetchSegmentDuration="5000"><Initialization range="0-891"/></SegmentBase></Representation><Representation id="1230666434714938v" bandwidth="170919" codecs="avc1.4d001e" mimeType="video/mp4" sar="1:1" FBEncodingTag="dash_baseline_3_v1" 
FBContentLength="202255" FBPlaybackResolutionMos="0:100,360:71.4,480:64.7,720:57.8,1080:54.2" FBPlaybackResolutionMosConfidenceLevel="high" FBPlaybackResolutionCsvqm="0:100,360:86.1,480:79.8,720:73.2,1080:69.1" FBAbrPolicyTags="" width="360" height="640" FBQualityClass="sd" FBQualityLabel="360p"><BaseURL>https://scontent-atl3-2.cdninstagram.com/o1/v/t16/f1/m86/38454FFAF1B63022840A87BDC1DD5681_video_dashinit.mp4?efg=eyJ2aWRlb19pZCI6bnVsbCwidmVuY29kZV90YWciOiJpZy14cHZkcy5jbGlwcy5jMi1DMy5kYXNoX2Jhc2VsaW5lXzNfdjEifQ&_nc_ht=scontent-atl3-2.cdninstagram.com&_nc_cat=103&ccb=9-4&oh=00_AYAtWSQKMsJnx5RfpaD4UBwHQVp32cyvCnpxSL4OaWUVZA&oe=6700A53B&_nc_sid=f1f4f2</BaseURL><SegmentBase indexRange="887-942" timescale="15360" FBMinimumPrefetchRange="943-14044" FBFirstSegmentRange="943-126717" FBFirstSegmentDuration="5000" FBSecondSegmentRange="126718-202254" FBPrefetchSegmentRange="943-126717" FBPrefetchSegmentDuration="5000"><Initialization range="0-886"/></SegmentBase></Representation></AdaptationSet><AdaptationSet id="1" contentType="audio" subsegmentStartsWithSAP="1" subsegmentAlignment="true"><Representation id="1208355727138339ad" bandwidth="76469" codecs="mp4a.40.5" mimeType="audio/mp4" FBAvgBitrate="76469" audioSamplingRate="44100" FBEncodingTag="dash_ln_heaac_vbr3_audio" FBContentLength="91471" FBPaqMos="83.80" FBAbrPolicyTags="" FBDefaultQuality="1"><AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/><BaseURL>https://scontent-atl3-1.cdninstagram.com/v/t50.33967-16/461311387_502055819321624_5847132663744181196_n.mp4?_nc_cat=109&ccb=1-7&_nc_sid=9a5d50&efg=eyJ2ZW5jb2RlX3RhZyI6ImlnLXhwdmRzLmNsaXBzLmMyLUMzLmRhc2hfbG5faGVhYWNfdmJyM19hdWRpbyIsInZpZGVvX2lkIjpudWxsfQ%3D%3D&_nc_ohc=yR_MuIZaayoQ7kNvgGD-DTc&_nc_ht=scontent-atl3-1.cdninstagram.com&_nc_gid=AL6eZp4GBz9V1ZoIf08mSWd&oh=00_AYA75_hrxi1U_XeE4DoLsSLXsSNnN7i2eNFcYb9FnlUAbA&oe=67049FE6</BaseURL><SegmentBase indexRange="824-915" timescale="44100" 
FBMinimumPrefetchRange="916-1259" FBFirstSegmentRange="916-21518" FBFirstSegmentDuration="2021" FBSecondSegmentRange="21519-40307" FBPrefetchSegmentRange="916-40307" FBPrefetchSegmentDuration="4017"><Initialization range="0-823"/></SegmentBase></Representation></AdaptationSet></Period></MPD>"
]
}
The format URLs can be extracted from this manifest and HTML-decoded. I was able to download video and audio from those.
I don't know if this helps, but I hope so!
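A minimal sketch of that extraction step, assuming the video_dash_manifest string has already been pulled out of the JSON and still has its XML entities (such as &amp;) intact. The function name extract_formats and the sample manifest with placeholder URLs are illustrative and not taken from the extractor:

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <MPD> root element of the manifest.
DASH_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def extract_formats(manifest_xml: str) -> list[dict]:
    """Return one dict per <Representation>: content type, quality label, URL."""
    root = ET.fromstring(manifest_xml.encode("utf-8"))
    formats = []
    for adaptation in root.iterfind(".//mpd:AdaptationSet", DASH_NS):
        for rep in adaptation.iterfind("mpd:Representation", DASH_NS):
            formats.append({
                "content_type": adaptation.get("contentType"),
                "quality": rep.get("FBQualityLabel"),  # absent on audio tracks
                "bandwidth": int(rep.get("bandwidth", "0")),
                # The XML parser unescapes entities such as &amp; in the URL.
                "url": rep.findtext("mpd:BaseURL", namespaces=DASH_NS),
            })
    return formats


# Trimmed stand-in manifest; the URLs are placeholders, not real CDN links.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"><Period id="0">
<AdaptationSet id="0" contentType="video">
<Representation id="v1" bandwidth="578362" FBQualityLabel="720p">
<BaseURL>https://cdn.example.invalid/video.mp4?a=1&amp;b=2</BaseURL>
</Representation></AdaptationSet>
<AdaptationSet id="1" contentType="audio">
<Representation id="a1" bandwidth="76469">
<BaseURL>https://cdn.example.invalid/audio.mp4?a=1&amp;b=2</BaseURL>
</Representation></AdaptationSet>
</Period></MPD>"""

for fmt in extract_formats(SAMPLE):
    print(fmt["content_type"], fmt["quality"], fmt["url"])
```

Video and audio come back as separate entries, so they would still need to be downloaded separately and merged.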
@adanvdo That is how the extractor currently works when you pass logged-in cookies (so at least the extractor is not broken when logged-in?). Does it work w/o cookies? IIRC it used to work w/o cookies a few times and then you'd be blocked for 24+ hours
(Perennial warning that passing logged-in cookies to yt-dlp for this site can get your account permanently banned)
@bashonly When I am logged out, there is no request to api/v1/media/3466101691097200810/info/.
Instead, there is an XHR "query" POST request
that returns JSON in this format:
{
data: {
xdt_shortcode_media: {
id: "3466101691097200810",
shortcode: "DAaER-1Oriq",
video_url: "https://scontent-atl3-2.cdninstagram.com/o1/v/t16/f1/m86/83498DA848AB9E46281A9A432E450DA8_video_dashinit.mp4?stp=dst-mp4&efg=eyJxZV9ncm91cHMiOiJbXCJpZ193ZWJfZGVsaXZlcnlfdnRzX290ZlwiXSIsInZlbmNvZGVfdGFnIjoidnRzX3ZvZF91cmxnZW4uY2xpcHMuYzIuNzIwLmJhc2VsaW5lIn0&_nc_cat=105&vs=924040302997031_1936272661&_nc_vs=HBksFQIYUmlnX3hwdl9yZWVsc19wZXJtYW5lbnRfc3JfcHJvZC84MzQ5OERBODQ4QUI5RTQ2MjgxQTlBNDMyRTQ1MERBOF92aWRlb19kYXNoaW5pdC5tcDQVAALIAQAVAhg6cGFzc3Rocm91Z2hfZXZlcnN0b3JlL0dKc05meHNZb2NUNm5jZ0JBTXd6RHFFdE1DVlJicV9FQUFBRhUCAsgBACgAGAAbABUAACbk7YSS3MWcQBUCKAJDMywXQCLul41P3zsYEmRhc2hfYmFzZWxpbmVfMV92MREAdf4HAA%3D%3D&_nc_rid=ff6ff092d5&ccb=9-4&oh=00_AYCW3JUQJZ6vXrCQzBl1nCEKLzGQCtd7eMaBu2-GcC1nYQ&oe=6700A82D&_nc_sid=d885a2",
dash_info: {
video_dash_manifest: "<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd" profiles="urn:mpeg:dash:profile:isoff-on-demand:2011" minBufferTime="PT2S" type="static" mediaPresentationDuration="PT9.473742S" FBManifestIdentifier="FgAYEGlnX2Rhc2hfYmFzZWxpbmUZNs64peCcmqQDxviG3se/pQT09Oy6m9KvBCIYGGRhc2hfbG5faGVhYWNfdmJyM19hdWRpbwA="><Period id="0" duration="PT9.473742S"><AdaptationSet id="0" contentType="video" frameRate="15360/512" subsegmentAlignment="true" par="9:16" FBUnifiedUploadResolutionMos="360:75.5"><SupplementalProperty schemeIdUri="urn:mpeg:mpegB:cicp:TransferCharacteristics" value="6"/><Representation id="924040302997031vd" bandwidth="578362" codecs="avc1.64001f" mimeType="video/mp4" sar="1:1" FBEncodingTag="dash_baseline_1_v1" FBContentLength="684396" FBPlaybackResolutionMos="0:100,360:94.7,480:91.2,720:86.6,1080:81.9" FBPlaybackResolutionMosConfidenceLevel="high" FBPlaybackResolutionCsvqm="0:100,360:98.19,480:96.8,720:95.4,1080:93.9" FBAbrPolicyTags="" width="720" height="1280" FBDefaultQuality="1" FBQualityClass="hd" FBQualityLabel="720p"><BaseURL>https://scontent-atl3-2.cdninstagram.com/o1/v/t16/f1/m86/83498DA848AB9E46281A9A432E450DA8_video_dashinit.mp4?efg=eyJ2aWRlb19pZCI6bnVsbCwidmVuY29kZV90YWciOiJpZy14cHZkcy5jbGlwcy5jMi1DMy5kYXNoX2Jhc2VsaW5lXzFfdjEifQ&_nc_ht=scontent-atl3-2.cdninstagram.com&_nc_cat=105&ccb=9-4&oh=00_AYB3GPMDTmF7vd_5mr5lbWMxlULgw_hfc1kCVGW7hpTjLw&oe=6700A82D&_nc_sid=f1f4f2</BaseURL><SegmentBase indexRange="892-947" timescale="15360" FBMinimumPrefetchRange="948-32377" FBFirstSegmentRange="948-431024" FBFirstSegmentDuration="5000" FBSecondSegmentRange="431025-684395" FBPrefetchSegmentRange="948-431024" FBPrefetchSegmentDuration="5000"><Initialization range="0-891"/></SegmentBase></Representation><Representation id="1230666434714938v" bandwidth="170919" codecs="avc1.4d001e" mimeType="video/mp4" sar="1:1" FBEncodingTag="dash_baseline_3_v1" 
FBContentLength="202255" FBPlaybackResolutionMos="0:100,360:71.4,480:64.7,720:57.8,1080:54.2" FBPlaybackResolutionMosConfidenceLevel="high" FBPlaybackResolutionCsvqm="0:100,360:86.1,480:79.8,720:73.2,1080:69.1" FBAbrPolicyTags="" width="360" height="640" FBQualityClass="sd" FBQualityLabel="360p"><BaseURL>https://scontent-atl3-2.cdninstagram.com/o1/v/t16/f1/m86/38454FFAF1B63022840A87BDC1DD5681_video_dashinit.mp4?efg=eyJ2aWRlb19pZCI6bnVsbCwidmVuY29kZV90YWciOiJpZy14cHZkcy5jbGlwcy5jMi1DMy5kYXNoX2Jhc2VsaW5lXzNfdjEifQ&_nc_ht=scontent-atl3-2.cdninstagram.com&_nc_cat=103&ccb=9-4&oh=00_AYAtWSQKMsJnx5RfpaD4UBwHQVp32cyvCnpxSL4OaWUVZA&oe=6700A53B&_nc_sid=f1f4f2</BaseURL><SegmentBase indexRange="887-942" timescale="15360" FBMinimumPrefetchRange="943-14044" FBFirstSegmentRange="943-126717" FBFirstSegmentDuration="5000" FBSecondSegmentRange="126718-202254" FBPrefetchSegmentRange="943-126717" FBPrefetchSegmentDuration="5000"><Initialization range="0-886"/></SegmentBase></Representation></AdaptationSet><AdaptationSet id="1" contentType="audio" subsegmentStartsWithSAP="1" subsegmentAlignment="true"><Representation id="1208355727138339ad" bandwidth="76469" codecs="mp4a.40.5" mimeType="audio/mp4" FBAvgBitrate="76469" audioSamplingRate="44100" FBEncodingTag="dash_ln_heaac_vbr3_audio" FBContentLength="91471" FBPaqMos="83.80" FBAbrPolicyTags="" FBDefaultQuality="1"><AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/><BaseURL>https://scontent-atl3-1.cdninstagram.com/v/t50.33967-16/461311387_502055819321624_5847132663744181196_n.mp4?_nc_cat=109&ccb=1-7&_nc_sid=9a5d50&efg=eyJ2ZW5jb2RlX3RhZyI6ImlnLXhwdmRzLmNsaXBzLmMyLUMzLmRhc2hfbG5faGVhYWNfdmJyM19hdWRpbyIsInZpZGVvX2lkIjpudWxsfQ%3D%3D&_nc_ohc=yR_MuIZaayoQ7kNvgGD-DTc&_nc_ht=scontent-atl3-1.cdninstagram.com&_nc_gid=A9OFFypqKwVyNI_wmS055Nw&oh=00_AYAqsOfU7JlCi04cMgqMHK_6Jj7JNhFvapGlnF5TpShqIg&oe=67049FE6</BaseURL><SegmentBase indexRange="824-915" timescale="44100" 
FBMinimumPrefetchRange="916-1259" FBFirstSegmentRange="916-21518" FBFirstSegmentDuration="2021" FBSecondSegmentRange="21519-40307" FBPrefetchSegmentRange="916-40307" FBPrefetchSegmentDuration="4017"><Initialization range="0-823"/></SegmentBase></Representation></AdaptationSet></Period></MPD>
"
}
}
}
}
I can use the video_url value with yt-dlp fine.
Looks like a fetch command like this works for getting the url (try it yourself):
fetch("https://www.instagram.com/graphql/query", {
"credentials": "include",
"headers": {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:130.0) Gecko/20100101 Firefox/130.0",
"Accept": "*/*",
"Accept-Language": "en-CA,en-US;q=0.7,en;q=0.3",
"Content-Type": "application/x-www-form-urlencoded",
"X-FB-Friendly-Name": "PolarisPostActionLoadPostQueryQuery",
"Sec-Fetch-Dest": "empty",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Site": "same-origin"
},
"referrer": "https://www.instagram.com/reel/DAO49r3SsBR/",
"body": "variables=%7B%22shortcode%22%3A%22DAO49r3SsBR%22%2C%22fetch_tagged_user_count%22%3Anull%2C%22hoisted_comment_id%22%3Anull%2C%22hoisted_reply_id%22%3Anull%7D&server_timestamps=true&doc_id=8845758582119845",
"method": "POST",
"mode": "cors"
}).then(resp => resp.json())
.then(resp => console.log(resp.data.xdt_shortcode_media.video_url));
the variables param in the body is just a url-encoded version of {"shortcode":"DAO49r3SsBR","fetch_tagged_user_count":null,"hoisted_comment_id":null,"hoisted_reply_id":null}
so this returns a URL directly from shortcode which is cool
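For anyone who wants to build that body for a different shortcode, here is a rough Python sketch. The helper name build_body is made up for illustration, and the doc_id is simply the value quoted in this thread, which may stop working if it changes server-side:

```python
import json
from urllib.parse import urlencode

# doc_id quoted in this thread; treat it as an assumption, not a stable constant.
DOC_ID = "8845758582119845"


def build_body(shortcode: str) -> str:
    """Build the form-encoded POST body used by the fetch() call above."""
    variables = {
        "shortcode": shortcode,
        "fetch_tagged_user_count": None,
        "hoisted_comment_id": None,
        "hoisted_reply_id": None,
    }
    return urlencode({
        # Compact separators reproduce the JSON exactly as the browser sends it.
        "variables": json.dumps(variables, separators=(",", ":")),
        "server_timestamps": "true",
        "doc_id": DOC_ID,
    })


print(build_body("DAO49r3SsBR"))
```

For the shortcode DAO49r3SsBR this produces the same string as the "body" value in the fetch call.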
The only question now is how to get the document ID 8845758582119845, but it seems to be the same for me and a friend, and does not seem to change. Not sure if it's a good idea to hardcode it 🥴
I think it may be okay to hardcode the doc_id. It seems that is the same approach currently used by the extractor:
https://github.com/yt-dlp/yt-dlp/blob/e59c82a74cda5139eb3928c75b0bd45484dbe7f0/yt_dlp/extractor/instagram.py#L438
Meta seems to have moved Instagram over to their Relay client for making GraphQL queries. There is a set of doc_ids which correspond to what are essentially GQL query 'presets', saved on the server side, to reduce the amount of data the client needs to send. This also prevents arbitrary queries from working. 8845758582119845 seems to be the doc_id of interest, as it provides the video URL at data.xdt_shortcode_media.video_url in the JSON response.
The bare minimum for making a successful request is as follows, updating the shortcode as needed:
curl --request POST \
--url https://www.instagram.com/graphql/query \
--data 'variables={"shortcode":"DAJYHpwCjMP"}' \
--data doc_id=8845758582119845
See Persisted Queries in the Relay documentation.
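The same minimal request can be sketched in Python with only the standard library. This assumes the endpoint still accepts a request carrying nothing but variables and doc_id, as the curl command above suggests. The helper names build_request, extract_video_url and fetch_video_url are hypothetical, not part of yt-dlp:

```python
import json
import urllib.request
from urllib.parse import urlencode

GRAPHQL_URL = "https://www.instagram.com/graphql/query"
DOC_ID = "8845758582119845"  # value quoted in this thread; may change


def build_request(shortcode: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    body = urlencode({
        "variables": json.dumps({"shortcode": shortcode}, separators=(",", ":")),
        "doc_id": DOC_ID,
    }).encode("ascii")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def extract_video_url(payload: dict):
    """Read data.xdt_shortcode_media.video_url, returning None if any level is missing."""
    media = (payload.get("data") or {}).get("xdt_shortcode_media") or {}
    return media.get("video_url")


def fetch_video_url(shortcode: str):
    """Send the request and return the video URL, or None if it is not present."""
    with urllib.request.urlopen(build_request(shortcode), timeout=30) as resp:
        return extract_video_url(json.load(resp))
```

Calling fetch_video_url("DAJYHpwCjMP") would perform the network request. The returned URL, if any, can then be passed to yt-dlp directly.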
Any update on this case?
@bashonly When will it be fixed?
This is a community-maintained project. All the devs that work on this have lives and don't owe us anything. Just be patient.
In the meantime, use your browser's dev tools to get the JSON responses that contain the video_url and use that URL with yt-dlp.
Hey, can you please explain how to do this? I have no idea. We open the Instagram video, then open the dev tools, and after that go to the Network tab?
Quick one-liner to find the URLs, based on @tetra-fox's comment:
curl --request POST \
--url https://www.instagram.com/graphql/query \
--data 'variables={"shortcode":"CHANGE THIS TO THE VIDEO ID"}' \
--data doc_id=8845758582119845 | jq | grep video_url
Open the dev tools, go to the Network tab, and then refresh the reel page. Switch back to the dev tools and view the Network tab results. Look for the entry with the name "query". Click the Response tab to view the JSON response. The video_url is in that.
thank you so much
Region
Canada
Provide a description that is worded well enough to be understood
I thought it was a rate limit, but I tried from multiple different IPs and all of them failed, while they all worked from the Firefox browser.
example reel (loud): https://www.instagram.com/reel/DAgxVRCsDgA