Benjamin-Loison / YouTube-operational-API

YouTube operational API works when YouTube Data API v3 fails.
401 stars 52 forks

`Method doesn't allow unregistered callers` with the no-key endpoint due to empty line #240

Open Benjamin-Loison opened 9 months ago

Benjamin-Loison commented 9 months ago

This time the 5th line was empty on the official instance.

I believe this issue arises because the web server handles multiple requests at the same time. One option would be a blocking mutex around the file read and write, although avoiding the file read altogether, by keeping the key set in memory shared across requests, would be nicer.
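A minimal sketch of the mutex idea, assuming a Unix host and a one-key-per-line file. It is written in Python for illustration only (the instance itself runs PHP), and `rotate_first_key` and its rotation policy are hypothetical, not the project's code. The exclusive lock is held for the whole read-modify-write so that concurrent requests cannot interleave:

```python
import fcntl


def rotate_first_key(path):
    """Move the first key to the end of the file while holding an exclusive lock.

    Hypothetical helper: the rotation policy is an assumption for illustration.
    Returns the key that was first, or None if the file contains no key.
    """
    with open(path, 'r+') as f:
        # Blocks until no other process holds a lock on this file.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            # Ignore empty or whitespace-only lines.
            keys = [line for line in f.read().splitlines() if line.strip()]
            if not keys:
                return None
            rotated = keys[1:] + keys[:1]
            # Rewrite in place, then cut off any leftover bytes of the old content.
            f.seek(0)
            f.write('\n'.join(rotated))
            f.truncate()
            f.flush()
            return keys[0]
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Such locks are advisory: they only help if every reader and writer of the file takes the lock as well.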

Related to #237 and more precisely this comment.

Benjamin-Loison commented 9 months ago

It seems that a YouTube Data API v3 error that I faced, and that was possibly not taken into account, may break the no-key endpoint. As a reminder, the well-known YouTube Data API v3 quota exceeded error is:

{
    "error": {
        "code": 403,
        "message": "The request cannot be completed because you have exceeded your <a href=\"/youtube/v3/getting-started#quota\">quota</a>.",
        "errors": [
            {
                "message": "The request cannot be completed because you have exceeded your <a href=\"/youtube/v3/getting-started#quota\">quota</a>.",
                "domain": "youtube.quota",
                "reason": "quotaExceeded"
            }
        ]
    }
}

https://developers.google.com/youtube/v3/docs/errors https://developers.google.com/youtube/v3/docs/core_errors

It seems that checking for the 403 code instead of the actual reason would handle both cases.

At least another temporary error that I sometimes faced would not be a problem, as it is a 503: https://developers.google.com/youtube/v3/docs/core_errors#SERVICE_UNAVAILABLE

Could maybe distinguish more precisely among the 403 errors, as some are temporary in theory, like: concurrentLimitExceeded, dailyLimitExceeded (should verify that we are not in the case "The daily quota limit has been reached, and the project has been blocked due to abuse. See the [Google APIs compliance support form](http://support.google.com/code/go/developer_compliance) to help resolve the issue."), rateLimitExceeded, servingLimitExceeded, userRateLimitExceeded, variableTermExpiredDailyExceeded and variableTermLimitExceeded.

Should check whether a limit of 0 can be set for the above limits and, if so, should verify that a key worked at least once before adding it.

As most people do not use such features, and assuming that the no-key endpoint is not used heavily enough to trigger one of these limits, let us drop, at least temporarily, the forbidden keys, i.e. the keys returning a 403 other than quota exceeded.

limitExceeded seems ambiguous.

dailyLimitExceededUnreg, rateLimitExceededUnreg and userRateLimitExceededUnreg seem to require signing.
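The policy discussed above can be sketched as follows. This is only an illustration under assumptions: `classify_key_response` and its return labels are hypothetical names, and it relies solely on the `code` and `reason` fields shown in the error responses of this thread.

```python
def classify_key_response(data):
    """Classify a decoded YouTube Data API v3 response for a given key.

    Hypothetical helper returning one of:
    'ok', 'quota', 'drop', 'temporary' or 'unknown'.
    """
    error = data.get('error')
    if error is None:
        return 'ok'
    code = error.get('code')
    errors = error.get('errors') or [{}]
    reason = errors[0].get('reason')
    if code == 403:
        # Quota exceeded keys recover later, so they are kept.
        if reason == 'quotaExceeded':
            return 'quota'
        # Any other 403 is treated as a forbidden key to drop,
        # at least temporarily.
        return 'drop'
    if code == 503:
        # Service unavailable is temporary and says nothing about the key.
        return 'temporary'
    return 'unknown'
```

Returning `'unknown'` for anything else leaves the decision to the caller, for instance to log the response for manual inspection.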

Also just faced:

{
    "error": {
        "code": 400,
        "message": "API key expired. Please renew the API key.",
        "errors": [
            {
                "message": "API key expired. Please renew the API key.",
                "domain": "global",
                "reason": "badRequest"
            }
        ],
        "status": "INVALID_ARGUMENT",
        "details": [
            {
                "@type": "type.googleapis.com/google.rpc.ErrorInfo",
                "reason": "API_KEY_INVALID",
                "domain": "googleapis.com",
                "metadata": {
                    "service": "youtube.googleapis.com"
                }
            }
        ]
    }
}

I faced the error of the original post on https://yt.lemnoslife.com and afterwards there seemed to be fewer YouTube Data API v3 keys than expected, as a significant part was missing:

wc -l ytPrivate/keys.txt
1023 /var/www/yt/ytPrivate/keys.txt
sed -i '/^\s*$/d' ytPrivate/keys.txt
wc -l /var/www/yt/ytPrivate/keys.txt
1023 /var/www/yt/ytPrivate/keys.txt
python3 test.py
correct=0 quotaExceeded=1024: 100%|█| 1024/1024 [01:39<0
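
The `sed` command above deletes blank lines in place. A minimal sketch to first report which lines are blank, for instance to confirm an empty 5th line, assuming a one-key-per-line file; `blank_line_numbers` is a hypothetical helper:

```python
def blank_line_numbers(path):
    """Return the 1-based numbers of the lines that are empty or whitespace-only."""
    with open(path) as f:
        return [number for number, line in enumerate(f, start=1) if not line.strip()]
```

Note that `wc -l` counts newline characters, so a file whose last line has no trailing newline reports one line fewer than `splitlines()` returns, which is consistent with 1023 versus 1024 above.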

Afterwards got:

./test.py
correct=2319 quotaExceeded=1306 toRemove=2: 100%|██████████████████████████████████████████████████████████████████████████| 3627/3627 [08:15<00:00,  7.32it/s]

#!/usr/bin/python3

import requests
import json
from tqdm import tqdm

keysFilePath = 'ytPrivate/keys.txt'

with open(keysFilePath) as f:
    lines = f.read().splitlines()

# Follow a `/`-separated path in nested dicts and lists, returning `None` if any part is missing.
def getValue(obj, path):
    if path == '':
        return obj
    pathParts = path.split('/')
    key = pathParts[0]
    key = int(key) if key.isdigit() else key
    try:
        newObj = obj[key]
    except (KeyError, IndexError, TypeError):
        return None
    return getValue(newObj, '/'.join(pathParts[1:]))

# Whether the error message means that the key should be removed from the key set.
def isKeyToRemove(message):
    return (
        message in (
            'API key expired. Please renew the API key.',
            'API key not valid. Please pass a valid API key.',
            'API Key not found. Please pass a valid API key.',
        )
        # Matched anywhere in the message, as this text may follow a key-specific prefix.
        or 'has been suspended.' in message
        or message.startswith('YouTube Data API v3 has not been used in project ')
        or message.endswith('are blocked.')
    )

correctKeys = set()
quotaExceededKeys = set()
toRemove = 0
progressBar = tqdm(lines)
for line in progressBar:
    url = 'https://www.googleapis.com/youtube/v3/videos'
    params = {
        'part': 'snippet',
        'id': '_ZPpU7774DQ',
        'key': line,
    }
    data = requests.get(url, params = params).json()
    if 'error' in data:
        error = data['error']
        if error['errors'][0]['domain'] == 'youtube.quota':
            quotaExceededKeys.add(line)
        elif isKeyToRemove(error['message']):
            toRemove += 1
        else:
            # Unexpected error: show it and stop.
            print(line)
            print(json.dumps(data, indent = 4))
            break
    elif getValue(data, 'items/0/snippet/channelId') == 'UCWeg2Pkate69NFdBeuRFTAw':
        correctKeys.add(line)
    else:
        # Unexpected response: show it and stop.
        print(line)
        print(json.dumps(data, indent = 4))
        break
    progressBar.set_description(f'{len(correctKeys)=} {len(quotaExceededKeys)=} {toRemove=}')
else:
    # Only rewrite the key file if every key was checked, as otherwise the unchecked keys would be lost.
    # Manually sort keys like PHP script would also be nice but then use the PHP script.
    # In theory could better sort keys based on original order; Python sets do not preserve it.
    with open(keysFilePath, 'w') as f:
        f.write('\n'.join(list(correctKeys) + list(quotaExceededKeys)))

Should make the no-key endpoint use a function, and write a PHP script relying on this same function, to make sure that the script matches the production no-key endpoint behavior. Could use the Stack Overflow answer 27147177 to have a progress bar.

Hence, I solved this issue by uploading the complete key set.

Benjamin-Loison commented 9 months ago

Previously used:

#!/usr/bin/python3

import requests
import json

def curl(url):
    return requests.get(url).json()

with open('keys.txt') as f:
    lines = f.read().splitlines()

for line in lines:
    print(line)
    url = f'https://www.googleapis.com/youtube/v3/videos?part=snippet&id=_-MNbRpjSa0&key={line}'
    contentJSON = curl(url)
    contentStr = json.dumps(contentJSON, indent = 4)
    #print(contentStr)
    if 'error' in contentJSON:
        code = contentJSON['error']['code']
        # Only print the errors that are not quota related.
        if 'quota' not in contentJSON['error']['message']:
            print(contentStr)
        #if code != 403: # quota it seems - only ?
        #   print(contentStr)

Benjamin-Loison commented 9 months ago

Related to #209.

Benjamin-Loison commented 9 months ago

The YouTube operational API instance `yt` `noKey/videos` is not working correctly!

./test.py
correct=1 quotaExceeded=1432 toRemove=0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1433/1433 [02:20<00:00, 10.21it/s]
cp ~/keys.txt ~/yt/ytPrivate/keys.txt && wc -l ~/yt/ytPrivate/keys.txt && python3 test.py && wc -l ~/yt/ytPrivate/keys.txt
3627 /var/www/yt/ytPrivate/keys.txt
correct=1672 quotaExceeded=1945 toRemove=10: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3627/3627 [06:20<00:00,  9.54it/s]
3616 /var/www/yt/ytPrivate/keys.txt

After a few minutes:

The YouTube operational API instance `yt` `noKey/videos` is not working correctly!
wc -l ~/yt/ytPrivate/keys.txt
1431 /var/www/yt/ytPrivate/keys.txt
cp ~/keys.txt ~/yt/ytPrivate/keys.txt && ./test.py
# forgot the `cp`, hence biased the results

Benjamin-Loison commented 9 months ago

wc -l ~/yt/ytPrivate/keys.txt
1431 /var/www/yt/ytPrivate/keys.txt
cp ~/keys.txt ~/yt/ytPrivate/keys.txt && wc -l ~/yt/ytPrivate/keys.txt && python3 test.py && wc -l ~/yt/ytPrivate/keys.txt 
3616 /var/www/yt/ytPrivate/keys.txt
len(correctKeys)=1416 len(quotaExceededKeys)=2201 toRemove=0: 100%|████████████████████████████████████| 3617/3617 [07:07<00:00,  8.46it/s]
3616 /var/www/yt/ytPrivate/keys.txt
wc -l ~/yt/ytPrivate/keys.txt
1431 /var/www/yt/ytPrivate/keys.txt

`flock` does not seem to help, so maybe the issue is the new key set management.

https://codeberg.org/Benjamin_Loison/YouTube_captions_search_engine/src/branch/master/website/websocket.php https://www.php.net/manual/en/function.flock.php
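
Independently of locking, one alternative technique (not something this thread shows the instance doing) is to never rewrite the key file in place: write the new content to a temporary file, then atomically rename it over the old one, so that a concurrent reader sees either the old or the new complete file. A minimal Python sketch, where `write_keys_atomically` is a hypothetical helper:

```python
import os
import tempfile


def write_keys_atomically(path, keys):
    """Replace the key file in one step so that readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created in the same directory, hence on the same
    # filesystem, which is required for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as f:
            f.write('\n'.join(keys))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave the temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

This only prevents readers from seeing a truncated file; two concurrent read-modify-write cycles can still overwrite each other's result unless they also take a lock. Also note that `mkstemp` creates the file readable by its owner only, so permissions may need adjusting for a web server.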

Benjamin-Loison commented 9 months ago

Well, I currently do not notice any issue with the no-key endpoint. I restored my notification system to be notified when an issue occurs.

Source: Discord

1125 /var/www/yt/ytPrivate/keys.txt

I now have an issue and get:

len(correctKeys)=1 len(quotaExceededKeys)=1069 toRemove=0: 100%|█| 1126/1126
cp ~/keys.txt ~/yt/ytPrivate/keys.txt && wc -l ~/yt/ytPrivate/keys.txt && python3 test.py && wc -l ~/yt/ytPrivate/keys.txt 
3616 /var/www/yt/ytPrivate/keys.txt
len(correctKeys)=2002 len(quotaExceededKeys)=1614 toRemove=1: 100%|████████████████████████████████████| 3617/3617 [06:45<00:00,  8.93it/s]
3615 /var/www/yt/ytPrivate/keys.txt
wc -l /var/www/yt/ytPrivate/keys.txt
1125 /var/www/yt/ytPrivate/keys.txt
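<?php

// `test.txt` has to exist, as the `r+` mode does not create the file.
$fp = fopen('test.txt', 'r+');
echo 'Waiting lock...';
// `LOCK_EX` blocks by default; the optional third argument of `flock` is only an output telling whether the lock would block.
flock($fp, LOCK_EX);
echo 'Have lock, now sleeping...';
sleep(10);
echo 'Slept.';
flock($fp, LOCK_UN);
fclose($fp);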

gives the expected behavior, so let us try such file locking to see whether a similar behavior happens.

I stopped apache2 during:

cp ~/keys.txt ~/yt/ytPrivate/keys.txt && wc -l ~/yt/ytPrivate/keys.txt && python3 test.py && wc -l ~/yt/ytPrivate/keys.txt
3616 /var/www/yt/ytPrivate/keys.txt
len(correctKeys)=1992 len(quotaExceededKeys)=1624 toRemove=1: 100%|████████████████████████████████████| 3617/3617 [06:42<00:00,  9.00it/s]
3615 /var/www/yt/ytPrivate/keys.txt

Related to #165.