Benjamin-Loison opened this issue 2 years ago
As for the last two days the no-key service has been using more than the quota of all keys combined, this issue is prioritized.
Giving tools and tips for searching for keys on the web may be useful.
All YouTube Data API v3 keys start with `AIzaSyA`, `AIzaSyB`, `AIzaSyC` or `AIzaSyD`, more details here.
Searched for instance `"AIzaSyA" YouTube` on Google (did `"AIzaSy" YouTube`, `"AIzaSyA" YouTube`, `"AIzaSyB" YouTube`, `"AIzaSyC" YouTube` and `"AIzaSyD" YouTube`).
Searched `AIzaSy` and `AIzaSyB` (haven't done Code for both) and still have to do Issues for `AIzaSyD` on GitHub (searching `AIzaSyA` gives other results).
Could also use other search engines, Stack Overflow (the keys I have encountered on it were also treated; I also treated `AIzaSyA`, `AIzaSyB`, `AIzaSyC` and `AIzaSyD`; I also used an algorithm, but I don't know whether it handles edits on posts, as most keys are removed after edits), GitLab (doesn't seem to find anything with `AIzaSy*`)...
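Since all the keys above share the `AIzaSy` prefix, candidate keys can be extracted from any dump of text with a regular expression. A minimal sketch; the 39-character total length is an assumption based on publicly documented Google API key formats, not on anything stated in this thread:

```python
import re

# Assumed key shape: "AIzaSy" + one of A-D + 32 URL-safe characters
# (39 characters total); adjust if Google's key format differs.
KEY_PATTERN = re.compile(r'AIzaSy[A-D][0-9A-Za-z_-]{32}')

def extract_candidate_keys(text):
    # Return unique candidate keys in order of first appearance.
    seen = []
    for match in KEY_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen
```

Each candidate still has to be verified against the API, since the pattern also matches other Google API keys.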
https://archive.ph/www.googleapis.com was also exploited. Specifying `/youtube/v3/` doesn't seem to work.
What about Web Archive? https://web.archive.org/web/*/https://www.googleapis.com/youtube/v3/*
https://archive.org/developers/
Could show how to contribute with a short video with a Google account.
Such a tool would also be useful for the instance host, as it would let him cleanly add a YouTube Data API v3 key; currently, doing it by hand, he has to pay attention not to mess up `keys.txt`.
Once done, change this Stack Overflow answer to propose my no-key service; but as it is currently running out of quota, I am not advertising it. Done.
Note that a fresh instance will display for the no-key service: `Currently this service is powered by 1 keys.` and potentially a PHP warning #23 (screenshot). Could display a custom error message when someone tries to use the no-key service while it isn't powered by any YouTube Data API v3 key.
Related to #19.
Could make metrics, such as `checkQuotaLogs.txt` and `checkUnusualLogs.txt`, public. Adding metrics for how much quota we consume per day would be interesting too.
Have added for the moment https://yt.lemnoslife.com/metrics/. Note that as the no-key service requires collecting multiple YouTube Data API v3 keys, I assume that sharing some WIP details on it isn't a high priority.
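A per-day quota metric could be estimated from the Apache access logs. A rough sketch, assuming Apache's default timestamp format and a hypothetical cost table (Google documents `search.list` at 100 units and most other `list` calls at 1 unit; the endpoint extraction below is an assumption about the log layout):

```python
import re
from collections import Counter

# Assumed per-request quota costs: search.list is documented at 100 units,
# most other list endpoints at 1 unit.
QUOTA_COSTS = {'search': 100}
DEFAULT_COST = 1

# Matches the day part of an Apache timestamp and the no-key endpoint name.
LOG_LINE = re.compile(r'\[(\d{2}/\w{3}/\d{4}):[^\]]*\] "GET /noKey/(\w+)')

def quota_per_day(log_lines):
    # Sum estimated quota units per day from access-log lines.
    totals = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match:
            day, endpoint = match.groups()
            totals[day] += QUOTA_COSTS.get(endpoint, DEFAULT_COST)
    return dict(totals)
```

This only estimates consumption, since retries and non-`GET` requests aren't accounted for.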
Can test many keys with this Python script:
I set up a test at 9:01 AM UTC+2 (as at 9:00 AM we aren't running out of quota anymore) to test all YouTube Data API v3 keys that have currently exceeded their quota. If all keys pass this test, then maybe keys having exceeded their quota could be allowed to be added. However, someone could fill keys with a manually set 0 quota limit...
The tests at 9:01 AM UTC+2 only returned exceeded quota. Will give it a try at 10:01 AM UTC+2; otherwise should try every minute and, if a key never passes the test once, then that key is definitely useless. Started the every-minute test for all keys at Sat Oct 22 17:34:23 CEST 2022. None of the keys were usable for a single request during 24 hours.
Could advertise the possibility to share a YouTube Data API v3 key when the no-key service is running out of quota. This should be done at this line of code.
Setting up for myself a notification system in case one or multiple check failures happen may make sense. Adding to metrics the delta of logs since the last retrieval (requiring authentication). Could specify the error on `False` in order not to be notified every time it happens. Or couldn't we just download the last part of the file?
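Downloading only the last part of the file is possible with an HTTP Range request, assuming the log file is served statically and the server honors `Range` headers (Apache does for static files). A sketch, where the URL and chunk size are placeholders:

```python
import urllib.request

def tail_request(url, n_bytes=65536):
    # Fetch only the last n_bytes of a remote file via an HTTP Range
    # request ("bytes=-N" means the final N bytes).
    request = urllib.request.Request(url, headers={'Range': f'bytes=-{n_bytes}'})
    with urllib.request.urlopen(request) as response:
        return response.read()

def complete_lines(chunk):
    # Drop the first line of a tail chunk, since it may be truncated
    # mid-line by the byte-range boundary.
    lines = chunk.decode(errors='replace').splitlines()
    return lines[1:] if lines else []
```

The delta since the last retrieval can then be computed by comparing the tail lines against the last line seen previously.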
I added a notification system for each failure for the moment. However, if for some reason, such as not enough disk space, the system becomes unable to write any more logs, my check doesn't take such an absence of additional logs into account.
Check Apache 2 logs to see if some people shared their API keys by mistake.
Note that `gunzip` doesn't output anything to stdout and instead decompresses and deletes the `.gz` compressed file; if you want the output on stdout without decompressing, use `-c`.
```sh
find -name 'yt.lemnoslife.com-ssl--access.log*'
(gunzip -c 'yt.lemnoslife.com-ssl--access.log.*.gz' && cat yt.lemnoslife.com-ssl--access.log{,.1}) | grep AIzaSy | grep -v addKey
```
It is safe to add non-existing files to the command above, as there is a warning on stderr which isn't `grep`ed, so we just get `cat: FILE: No such file or directory`. As I execute the above command every time I archive the logs, at least filtering out the keys already used for the no-key service would make this process faster. This is the aim of the following algorithm:
`searchKeysInLogs.py`: Found this way 22 keys with quota (no others) by checking the latest website logs, and checked the same way my old VAIO laptop, my ASUS, my computer (including my 2, 3 and 6 TB hard disks), OC3K and the VPS itself. Maybe I haven't checked `yt.lemnoslife.com-ssl--access.log` everywhere, but hey, I searched enough.
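`searchKeysInLogs.py` isn't reproduced here; conceptually it could boil down to extracting key candidates from the logs and subtracting the ones already in `keys.txt`. A hypothetical sketch (the key-shape regex is an assumption based on publicly documented Google API key formats):

```python
import re

# Assumed key shape: "AIzaSy" + one of A-D + 32 URL-safe characters.
KEY_PATTERN = re.compile(r'AIzaSy[A-D][0-9A-Za-z_-]{32}')

def new_keys_in_logs(log_text, known_keys):
    # Return keys found in the logs that are not already known
    # (e.g. not already listed in keys.txt), sorted for stable output.
    found = set(KEY_PATTERN.findall(log_text))
    return sorted(found - set(known_keys))
```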
When adding a new key, make sure to make a backup, as if there isn't any space left on the device, we lose them all. It just happened... Adding a tool to monitor disk space usage would make sense.
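Such a disk-space monitor can be a few lines of Python around `shutil.disk_usage`; a sketch with a hypothetical 90 % alert threshold:

```python
import shutil

def disk_usage_percent(path='/'):
    # Used-space percentage of the filesystem containing path.
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total

def disk_ok(path='/', threshold=90):
    # True while usage stays below the alert threshold (in percent);
    # a cron job could notify when this turns False.
    return disk_usage_percent(path) < threshold
```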
https://yt.lemnoslife.com/noKey/videos?part=snippet&id=B-gHb2gPGIs returns for instance:
Incident temporarily resolved, as brought back a set of keys, but haven't restored yet all keys.
As I found on my 6 TB hard disk my IP making 60 calls to `addKey.php` between `20/Oct/2022:23:52:51 +0200` and `21/Oct/2022:00:09:34 +0200`, I guess I found the set of keys that was deleted, as I claimed on Discord to have added 29 keys on 21 Oct at 00:50 AM. Note that the last time I modified this post to add information about progress was on `Oct 21, 2022, 12:49 AM GMT+2`. In addition, after running the following algorithm for these calls to `addKey.php`, I added 21 keys (+ 3 manually added due to quota consumption).
```python
import requests

def getURLContent(url):
    return requests.get(url).text

keys = []  # fill with the keys extracted from the addKey.php calls found in the logs

for key in keys:
    print(key)
    url = f'https://yt.lemnoslife.com/addKey.php?key={key}'
    result = getURLContent(url)
    print(result)
```
Isn't there a way in PHP to keep a variable around across user HTTPS requests? That way we wouldn't read and write a file every time we switch from one key to the other, and so we wouldn't have faced this problem.
Note that the disk space seems mostly used by errors in `yt.lemnoslife.com-ssl--error.log`, which weighs more than 8 times as much as `yt.lemnoslife.com-ssl--access.log`, related to #23.
Example of filled logs (decreasing file size order):

| File | Size (MB) | Lines |
|---|---|---|
| yt.lemnoslife.com-ssl--error.log.1 | 1,500 | 8,319,186 |
| yt.lemnoslife.com-ssl--access.log.1 | 131.8 | 509,956 |
| yt.lemnoslife.com-ssl--error.log.2.gz | 86.9 | 6,513,427 |
| yt.lemnoslife.com-ssl--access.log.2.gz | 12.1 | 398,512 |
`yt.lemnoslife.com-ssl--*.1` were filled from `09/Nov/2022:00:01:05 +0100` to `10/Nov/2022:00:44:31 +0100` (~24 hours).
`yt.lemnoslife.com-ssl--*.log.2.gz` were filled from `08/Nov/2022:00:38:57 +0100` to `09/Nov/2022:00:01:02 +0100` (~24 hours).
Moved from `LogLevel debug` to `LogLevel info ssl:warn` in `/etc/apache2/sites-available/ssl.yt.lemnoslife.com.conf`. See the LogLevel documentation.
After a `service apache2 restart`, it seems that nothing is written to `yt.lemnoslife.com-ssl--error.log` anymore. I guess that means there isn't any error with the many requests that I still see in `yt.lemnoslife.com-ssl--access.log`.
Have to wait for the logs to be rotated, to download them and use fresh empty files to see if my modification was a good change.
From Google account credentials, can one generate a YouTube Data API v3 key from a random project just by using curl? I think that due to 2FA (enabled by default with Google) etc. it isn't worth it.
May think about recoding some YouTube Data API v3 features by reverse-engineering the YouTube UI, if we aren't able to cope with the many requests consuming the no-key service's quota.
Could add an email linked to the key added, if need to contact the key holder for future modification in the policy.
Could use a supervariable shared from one HTTPS request to the other, or something like that, to avoid reading a file on each request (for counting no-key service keys or the git commit version used, for instance), or could at least simplify the file content down to what we really need, like:

```php
$keysCountFile = '/var/www/ytPrivate/keysCount.txt';
$keysCount = file_get_contents($keysCountFile);
```
As described in #48, proceeded at 11:40 PM UTC+1 to `logrotate --force /etc/logrotate.d/apache2`.
Next time we are really running out of quota, advertise with an `@everyone` on both Matrix and Discord to empower the no-key service.
Should add a mechanism to `addKey.php` to add the keys on all controlled instances.
Maybe just having the instance the end-user is interacting with call `addKey.php` on the other controlled instances would do the job.
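Such a propagation mechanism could be sketched as follows; the `INSTANCES` list is a placeholder (only yt.lemnoslife.com is named in this thread), and calling `addKey.php` remotely with a GET parameter is an assumption based on the calls seen in the logs above:

```python
import urllib.parse
import urllib.request

# Hypothetical list of controlled instances; replace with the real ones.
INSTANCES = ['https://yt.lemnoslife.com']

def build_add_key_url(instance, key):
    # addKey.php appears to take the key as a GET parameter.
    return f'{instance}/addKey.php?' + urllib.parse.urlencode({'key': key})

def add_key_everywhere(key):
    # Forward a newly shared key to every controlled instance and
    # collect each instance's response body.
    results = {}
    for instance in INSTANCES:
        with urllib.request.urlopen(build_add_key_url(instance, key)) as response:
            results[instance] = response.read().decode()
    return results
```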
At 20:43 I got:
The YouTube operational API no-key service is detected as not working!
Just following this event I tested the no-key endpoint on the three instances and everything was working fine. Logging what's wrong could be interesting in case it happens again.
Once I have the moderator tools privilege on Stack Overflow, I could run the above algorithms again to search for additional leaked YouTube Data API v3 keys.
Could also make the web server logs search for YouTube Data API v3 keys be executed on private instances, as not all of their users seem to be comfortable with this subject.
Should clean up inter-instance key and other instances synchronization; otherwise disabling the ability for anyone to provide a key seems to make sense.
Projects that enable the YouTube Data API have a default quota allocation of 1 million units per day.
Note that projects that had enabled the YouTube Data API before April 20, 2016, have a different default quota for that API.
https://web.archive.org/web/20160828004328/https://developers.google.com/youtube/v3/getting-started
https://web.archive.org/web/20160404033352/https://developers.google.com/youtube/v3/getting-started is the most recent snapshot prior to April 20, 2016 but does not mention how much quota is provided by default.
Does the API Explorer provide unlimited quota?
```sh
curl -s "https://content-youtube.googleapis.com/youtube/v3/search?part=snippet&q=test&key=AIzaSyBXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
minimizeCURL curl.sh 'youtube#searchResult'
```
https://console.cloud.google.com/apis/api/youtube.googleapis.com/quotas?project=my-project-XXXXXXXXXXXXX is not up-to-date in realtime, so let us make as many requests as possible and count them.
Maybe the key expires quickly, but thanks to web scraping it can easily be recreated.
```sh
counter=0
while true
do
    echo "counter: $counter"
    curl -s "https://content-youtube.googleapis.com/youtube/v3/search?part=snippet&q=$counter&key=AIzaSyBXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" -H 'X-Origin: https://explorer.apis.google.com' | jq '.items | length'
    ((counter++))
    #break
done
```
leads to a counter beyond several hundred while the returned length is still the default 5.
Same with https://www.googleapis.com/youtube/v3/search.
If necessary, could also investigate OAuth, and maybe use an account for each of these 4 cases (OAuth/key and URL) because of the quota display delay.
As I may add a form in the future to let people share their YouTube Data API v3 developer keys, this webpage could be used for this, even if a short advertisement for it could be added to `index.php`. Should proceed to #17 before proceeding to this issue, as adding keys may not be necessary with current quota usage.