Open Benjamin-Loison opened 1 year ago
First, figure out how to make this work without removing `-H 'Accept-Encoding: gzip, deflate, br'` from a cURL request, and why `gunzip` sometimes fails (when?). If only `gzip` is provided as `Accept-Encoding`, it always correctly returns data compatible with `gunzip`. It's a bit weird (maybe due to their relative load) that the YouTube API does not have a preferred compression method. See the `Accept-Encoding` documentation.
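One plausible explanation, not confirmed in this issue, is that when several encodings are offered the server sometimes answers with `br` or `deflate`, which `gunzip` cannot read. The `Content-Encoding` response header (visible with `curl -v`) would tell which one was used. A minimal Python sketch of decoding according to that header, where `decode_body` is a hypothetical helper and not part of the project:

```python
import gzip
import zlib


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Decode an HTTP response body according to its Content-Encoding header."""
    encoding = content_encoding.strip().lower()
    if encoding in ("", "identity"):
        # No compression was applied by the server.
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # "deflate" is normally zlib-wrapped; fall back to raw deflate if that fails.
        try:
            return zlib.decompress(body)
        except zlib.error:
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # "br" (Brotli) has no decoder in the Python standard library,
    # just as gunzip has none on the command line.
    raise ValueError(f"unsupported Content-Encoding: {encoding!r}")
```

With such a dispatch, a `br` answer fails loudly instead of producing the silent `gunzip` error described above.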
curl -v 'https://www.googleapis.com/youtube/v3/playlistItems?part=snippet,contentDetails,status&playlistId=UUAcAnMF0OrCtUep3Y4M-ZPw&maxResults=50&key=AIzaSy...'
The response has 166,527 bytes of content according to `ls -l`.
If I add `-H 'Accept-Encoding: gzip, deflate, br' > a && gunzip -c a`:
Total packets length according to Wireshark for the Google API instance IP: 46640
Otherwise, if I add `> a && cat a`:
Total packets length according to Wireshark for the Google API instance IP: 187614
Total packets length according to Wireshark for the Google API instance IP: 187713
Executed twice to verify the order of magnitude.
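As a quick check of the order of magnitude, the Wireshark figures above can be turned into a ratio. This is only arithmetic on the numbers already quoted. Note that packet lengths include TCP/TLS overhead, so the ratio is approximate:

```python
# Total packet lengths reported by Wireshark (from the measurements above).
compressed = 46640
uncompressed_runs = [187614, 187713]

uncompressed_mean = sum(uncompressed_runs) / len(uncompressed_runs)
ratio = compressed / uncompressed_mean
saving = 1 - ratio

print(f"compressed/uncompressed: {ratio:.3f}")
print(f"bandwidth saved: {saving:.1%}")
```

So the compressed transfer is roughly a quarter of the uncompressed one, i.e. about 75% less traffic for this request.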
The question is: does the API's `file_get_contents` use compression? What is the CPU load of my instances, to verify that enabling compression wouldn't cause an unacceptable CPU overload?
I added to the crontab of each official instance:
* * * * * (date && cat /proc/loadavg && cat /proc/meminfo | head -n 3) > health.txt
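To read the load figures back from `health.txt`, the `/proc/loadavg` line can be split into its five fields. A small sketch, where `parse_loadavg` is a hypothetical helper and the sample line holds illustrative values, not measurements from the instances:

```python
def parse_loadavg(line: str) -> dict:
    """Parse one /proc/loadavg line into named fields."""
    fields = line.split()
    # Fourth field is "runnable/total" scheduling entities.
    runnable, total = fields[3].split("/")
    return {
        "load_1m": float(fields[0]),
        "load_5m": float(fields[1]),
        "load_15m": float(fields[2]),
        "runnable_entities": int(runnable),
        "total_entities": int(total),
        "last_pid": int(fields[4]),
    }


# Illustrative sample only, not taken from a real instance.
sample = "0.42 0.36 0.30 1/123 4567"
print(parse_loadavg(sample)["load_1m"])
```

Comparing `load_1m` to the number of CPU cores gives a rough idea of whether there is headroom for compression.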
Could also investigate the HTTP `Range` header.
Could also propose `maxResults` and `fields` parameters, as requested on Discord for `maxResults` and for `fields`. Here is another Discord user expecting `maxResults` to work.
Note that `channels?part=community` sometimes returns empty pages when using `nextPageToken`, as the YouTube UI does. However, according to amatis on Matrix, this may happen when there is no more data afterwards, so we could try to find an optimization to avoid making a few empty requests at the end.
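One possible heuristic, shown only as a sketch, is to stop paginating after a few consecutive empty pages. Both `fetch_page` and `max_empty_pages` are hypothetical names, and the cutoff is an assumption that trades a small risk of missing later data for fewer requests:

```python
from typing import Callable, Iterator, Optional


def iterate_items(
    fetch_page: Callable[[Optional[str]], dict],
    max_empty_pages: int = 3,
) -> Iterator[dict]:
    """Yield items page by page, giving up after too many empty pages in a row."""
    token: Optional[str] = None
    empty_streak = 0
    while True:
        page = fetch_page(token)
        items = page.get("items", [])
        if items:
            empty_streak = 0
            yield from items
        else:
            empty_streak += 1
            if empty_streak >= max_empty_pages:
                return  # assume nothing more follows
        token = page.get("nextPageToken")
        if token is None:
            return


# Fake pages standing in for real responses (illustrative data only).
FAKE_PAGES = {
    None: {"items": [{"id": "a"}], "nextPageToken": "t1"},
    "t1": {"items": [], "nextPageToken": "t2"},
    "t2": {"items": [{"id": "b"}], "nextPageToken": "t3"},
    "t3": {"items": [], "nextPageToken": "t4"},
    "t4": {"items": [], "nextPageToken": "t5"},
    "t5": {"items": [], "nextPageToken": "t6"},
    "t6": {"items": [], "nextPageToken": None},
}

print([item["id"] for item in iterate_items(FAKE_PAGES.__getitem__)])
```

A single empty page in the middle is tolerated, while a trailing run of empty pages is cut short.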
Increase priority following this Discord message. Depending on the endpoint you are using, there may be alternative webpages that consume less bandwidth to retrieve.
By the way, for `membership: true`: I adapted it to my site and it was much faster than yours. If you want it to be faster, instead of this URL:
`if ($options['membership']) { $result = getJSONFromHTML("https://www.youtube.com/channel/$id");`
use this URL:
`if ($options['membership']) { $result = getJSONFromHTML("https://www.youtube.com/channel/$id/search");`
Because your URL: 880kb. My URL: 400kb. It can pull and query faster. 40% speed difference.
Source: private Discord message from (788496476187263026)
To avoid making a first request just to get a continuation token, being able to reverse-engineer this continuation token would improve performance, cf. #190. This would possibly reduce the flow to a single case, instead of a first HTML web-scraping request followed by JSON continuation requests.
Currently tests are parsed during production delivery...
Like in #258, can use the `browse` YouTube UI endpoint to only retrieve JSON and not HTML containing JSON.
Someone told me on Discord to only receive JSON thanks to the YouTube UI `browse` endpoint; this seems related to #252.
See YouTube Data API v3 optimizing performance documentation.
Does adding compression make sense? It is included in Apache and curl by default, isn't it?
May think about using the `compressed` parameter to decrease server workload, but I don't think that it is worth it.
Related to #27 and #35.