Miserlou opened this issue 6 years ago
The 429 seems to be because Soundscrape is getting too popular and is hitting some rate limits (see https://developers.soundcloud.com/docs/api/rate-limits#global-limit)
If I manually request the URL that's throwing an error, I get something like this:
```json
{
  "errors": [
    {
      "meta": {
        "rate_limit": {
          "bucket": "by-client",
          "max_nr_of_requests": 15000,
          "time_window": "PT24H",
          "name": "plays"
        },
        "remaining_requests": 0,
        "reset_time": "2018/01/06 13:38:36 +0000"
      }
    }
  ]
}
```
I'm uncertain how that can be easily resolved though :(
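One partial mitigation would be to honor the `reset_time` in that error body and back off until the bucket refills. A minimal sketch, assuming the error shape above is stable (`get_with_backoff` is a hypothetical helper, not part of SoundScrape):

```python
import time
from datetime import datetime, timezone

import requests

RESET_FMT = "%Y/%m/%d %H:%M:%S %z"  # matches the reset_time format above

def get_with_backoff(url, params=None):
    """GET a SoundCloud API URL, sleeping until reset_time on a 429."""
    while True:
        resp = requests.get(url, params=params)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Parse the 429 body shown above to find when the bucket resets.
        meta = resp.json()["errors"][0]["meta"]
        reset = datetime.strptime(meta["reset_time"], RESET_FMT)
        wait = (reset - datetime.now(timezone.utc)).total_seconds()
        time.sleep(max(wait, 60))  # floor the sleep in case of clock skew
```

This doesn't raise the limit, of course; it only stops a cron job from burning retries against an exhausted bucket.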
Ah, dang it. I need more keys and to rotate them.
I have the same issue. Although I have been using soundscrape for some years now, I am not very experienced with soundcloud.com. I just have a cronjob on my Linux server that automatically downloads a daily news show from soundcloud.
For the past few days I have been experiencing the same issue with HTTP error 429.
> I need more keys and to rotate them.

Hope my question is not too noobish, but what does that mean? It would be nice to get a hint on how to solve this issue.
Thank you!
@ontheair81 Miserlou means that he needs more API keys to circumvent the 15K request limit that SoundCloud imposes per client. However, as SoundCloud is not allowing new developers to sign up for API keys, this is hard or even impossible to fix.
If you happen to have an API key already, you could replace the one that's built into SoundScrape, or help all users by passing the key on to Miserlou.
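For reference, rotation could look something like this rough sketch. `CLIENT_IDS` and `fetch` are hypothetical names, and it assumes you actually have a pool of valid keys, which is the hard part:

```python
import itertools

import requests

# Hypothetical pool; SoundScrape currently ships a single built-in key.
CLIENT_IDS = ["key_one", "key_two", "key_three"]
_pool = itertools.cycle(CLIENT_IDS)

def fetch(url, params=None):
    """Try each client_id in turn, skipping any that are rate-limited."""
    params = dict(params or {})
    for _ in range(len(CLIENT_IDS)):
        params["client_id"] = next(_pool)
        resp = requests.get(url, params=params)
        if resp.status_code != 429:  # this key still has quota left
            return resp
    raise RuntimeError("All client_ids are rate-limited; try again later.")
```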
Thank you for clarifying! Now I understand the issue. Unfortunately I don't have an API key, so I cannot help myself or other users by sharing one.
So I think we can just hope that SoundCloud will increase the limits. Anyway, thank you for the information!
Is there anything we can do to help? E.g., search for more API keys, add per-client rate limits, etc.?
I think that if someone were able to bring in more keys, @Miserlou would appreciate it.
Today the 15K downloads were used up in about 3 hours...
Hey there.
Issues #206 & #204 are related.
I think Soundscrape cannot keep going with this client_id & secret key, or even with many of them, as it will always hit SoundCloud's API limits at some point.
As SoundCloud has closed new app registrations, I think it would be better to just ask the user to log in and then scrape the DOM (with Selenium + Chrome/Firefox in --headless) to get the token, and then download the tracks. If you go to URLs like:
https://api.soundcloud.com/i1/tracks/387417257/streams?client_id=MgT8dvRJVcFR9fI5Szar82usLfSQdg3n
You then get a response like:
```json
{
  "http_mp3_128_url": "https://cf-media.sndcdn.com/XFlrBjPMUKHI.128.mp3?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLW1lZGlhLnNuZGNkbi5jb20vWEZsckJqUE1VS0hJLjEyOC5tcDMiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE1MTc3MzY3NTB9fX1dfQ__&Signature=wfLUQ5-w9NxFP7EOWqB0LN9junfC-DDb4ZNJ8rRJ0MNI0YorEGiCy13V4-nwatJ9G1TX8osBMtfzD~UfEyC-oRifpYWT~0sEnRQ19S9QQpYVg8QoDPCaCrfxMRxNHGpH1WQvGCdgYR5mI6mdj9gwj10ML~hTBbt7AE0~2jOKKy1nvZftydMjTt3cYGdR1gtUP2-J741be4TGzO~pSonV~rVgqbhntatlyTTo9uWj9CCwvGvX4sexZBXS3KPA-76XbqW1wXLbZoDKqtrLk2I9rQnWHyK~OvqUfoJE53HOE6eSS4Ql4JwutQ59sX6w8gao~yqwJFW988Y-MtEtS7zb4A__&Key-Pair-Id=APKAJAGZ7VMH2PFPW6UQ",
  "hls_mp3_128_url": "https://cf-hls-media.sndcdn.com/playlist/XFlrBjPMUKHI.128.mp3/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLWhscy1tZWRpYS5zbmRjZG4uY29tL3BsYXlsaXN0L1hGbHJCalBNVUtISS4xMjgubXAzL3BsYXlsaXN0Lm0zdTgiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE1MTc3MzY3NTB9fX1dfQ__&Signature=lpHe8hejzHGlLtOiuF1b2esSGUu8mgSCa1Y6wAHb0fioBJV5DLzWy~7XGvaSsxxzlJVSu~X2bGCmmQ0kdU0xwP7dLQX9enl2QJwhm3kkggfAfsCFtFToMmA6BxEBaeMtwwC0ePLRzvSaw7mTLBV2vURUxky7P2RpJD87MURx0n8-mGpsaf1rwMKM9dRLKW6kMFqbkppjl4~geuA1SRC12lWHRV8socCEwfu-evCU~Ds~pa8aX2bSj~BK1Erai0E7ht7~jQImxqVae2gyiqU60QofsYIjyWLbyEcLmdElGtdw3NUEP1TtEnAfTK8-zW6z0DifKmoLV-~jn8QstiCgQg__&Key-Pair-Id=APKAJAGZ7VMH2PFPW6UQ",
  "hls_opus_64_url": "https://cf-hls-opus-media.sndcdn.com/playlist/XFlrBjPMUKHI.64.opus/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLWhscy1vcHVzLW1lZGlhLnNuZGNkbi5jb20vcGxheWxpc3QvWEZsckJqUE1VS0hJLjY0Lm9wdXMvcGxheWxpc3QubTN1OCIsIkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTUxNzczNjc1MH19fV19&Signature=nlLbtT5xpScnlENCznAyPlX0aRvMHA-Y1AXQieVjQg~dWWskwO7b2AB1LDydy7~fzmOkdu6GLoQyK174GLD1fcjy02FD4UQql799CEBtQ4Ker7YzNy4l78F3kbrU03KqcULWot2DvZpuUvNV3nGfUDobwCkC6JLPsx0dkmek8XyigeEemAsbQbHNWPissM10C4LgGzbekQhLwRrOVEQp9ixV7y8z6DghuOcrRg0RTbz~R~NKKdLP3A5tEnLcPPjv1dsyfK~B0dq~ddWFEbH7bPlcB0qLM7TsmGCEHtyTjfeFiYKtKpYZrXegKyUg-nTZcdenIHKsLXAELy5HiUXuLw__&Key-Pair-Id=APKAJAGZ7VMH2PFPW6UQ",
  "preview_mp3_128_url": "https://cf-preview-media.sndcdn.com/preview/0/30/XFlrBjPMUKHI.128.mp3?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLXByZXZpZXctbWVkaWEuc25kY2RuLmNvbS9wcmV2aWV3LzAvMzAvWEZsckJqUE1VS0hJLjEyOC5tcDMiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE1MTc3MjkzMzF9fX1dfQ__&Signature=GiyBE4RM-ed6uOHn283mYHiRZVh-4v9vJ3KRjFX2pHZE-G~eC5CZBWN4nlqc18E5KJKpVI3UInRTnloIscatUuAtRKtKjiDR0kn5MxhQA7k2dLGq-2V0KvVCIm1eoSXRDkwFOomg15l62d5b7wWoL-1XJomC7JiEb8ayxPPEr5FRmip9cP05dk57OvqziIwjMIfCv7ubbkSxJ7s-lh9nUvojagQWQ2H~GT-50R0yYoYcFLvG~QpW8HiT2SBIOPT07M9wbavRbF7dqcW1xyStL2QHSWMcESBZBG-ea47oEVuJaYP57FTVCCSGjzbjgKpbWwNup1OSJ53vro50PnKoZw__&Key-Pair-Id=APKAJAGZ7VMH2PFPW6UQ"
}
```
You can then simply download the file from "http_mp3_128_url".
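For illustration, a rough sketch of that flow in Python. The `i1/tracks/<id>/streams` endpoint and the `http_mp3_128_url` field come from the example above; `download_track` and everything else is an assumption:

```python
import requests

STREAMS_URL = "https://api.soundcloud.com/i1/tracks/{track_id}/streams"

def download_track(track_id, client_id, out_path):
    """Resolve the stream URLs for a track, then save the 128 kbps MP3."""
    resp = requests.get(
        STREAMS_URL.format(track_id=track_id),
        params={"client_id": client_id},
    )
    resp.raise_for_status()
    mp3_url = resp.json()["http_mp3_128_url"]  # signed URL, expires quickly
    with requests.get(mp3_url, stream=True) as audio:
        audio.raise_for_status()
        with open(out_path, "wb") as fh:
            for chunk in audio.iter_content(chunk_size=1 << 16):
                fh.write(chunk)
```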
The idea would be:

1. Ask the user to log in (via Selenium + a headless browser).
2. Extract the token/client_id from the DOM.
3. Request the streams endpooint for each track, as shown above.
4. Download the file from http_mp3_128_url.

What do you think?
I can do it if you like, but as I don't know the Soundscrape codebase, I have no idea how much refactoring would be needed to make that fit.
As a side note, it is possible to find client_ids by searching GitHub for client_id.
I'll try reverse engineering their client code
For most cases it would be enough to scrape anonymously, without logging in. Of course, the problem is that SoundCloud can change their site to break this, over and over again.
Or they might not bother. FYI, there is an unofficial Play Store client that works great by scraping regular HTML (no Play API use), and Google has never broken it.
In the interest of reuse, please consider implementing the scraper as a separate scraping lib.
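If anonymous scraping turns out to be enough, one way to avoid a full browser would be to pull a client_id out of the site's JS bundles. A best-effort sketch; the regexes are guesses at the current bundle format and will break whenever SoundCloud changes the site:

```python
import re

import requests

def scrape_client_id():
    """Best-effort: pull a client_id out of soundcloud.com's JS bundles."""
    home = requests.get("https://soundcloud.com/").text
    # Script URLs referenced by the page; the layout can change at any time.
    for script_url in re.findall(r'<script[^>]+src="([^"]+\.js)"', home):
        js = requests.get(script_url).text
        match = re.search(r'client_id\s*[:=]\s*"?([a-zA-Z0-9]{32})', js)
        if match:
            return match.group(1)
    raise RuntimeError("No client_id found; the page may have changed.")
```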
Would it help if users could provide their own keys?
+1 more for this issue. Is there a configurable option to supply one's own API key?
+1, same problem today. Perhaps implement configuration options to provide our own keys, as @erezsh suggested?
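Something like this could work as an opt-in override; SOUNDSCRAPE_CLIENT_ID is a hypothetical variable name, not an existing option:

```python
import os

# Hypothetical: let a user-supplied key override the bundled one.
DEFAULT_CLIENT_ID = "the-key-currently-baked-into-soundscrape"

def get_client_id():
    """Prefer a key from the environment, falling back to the bundled one."""
    return os.environ.get("SOUNDSCRAPE_CLIENT_ID", DEFAULT_CLIENT_ID)
```

That way users who do have their own key aren't sharing the built-in one's quota.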
My cron job running at 00:30 has been consistently failing for the last few days; the keys are already used up in the first half hour of the day (EST). Any solutions or workarounds?
You could also automate something like this, which seems to be able to download without too much grief.
> My cron job running at 00:30 has been consistently failing for the last few days; the keys are already used up in the first half hour of the day (EST). Any solutions or workarounds?
AFAIK it's a rolling 24-hour window, not calendar days.
So surely it must be simple to generate and plug in our own API keys? I am new to this but will have a play.