Closed in4sec-org closed 3 months ago
Same:
Traceback (most recent call last):
...
File "/function/code/apps/bot_tg_potyk.py", line 112, in download_yt_as_mp3
yt.streams.filter(only_audio=True).order_by("abr").first().stream_to_buffer(buffer)
^^^^^^^^^^
File "/function/code/pytubefix/__main__.py", line 564, in streams
self.check_availability()
File "/function/code/pytubefix/__main__.py", line 324, in check_availability
raise exceptions.VideoPrivate(video_id=self.video_id)
pytubefix.exceptions.VideoPrivate: [91mSpH83KzVKDc is a private video [0m
Btw original pytube doesn't work as well https://github.com/pytube/pytube/issues/1973
try this version to see what error appears pytubefix==6.3.4rc1
Traceback (most recent call last):
...
File "/function/code/apps/bot_tg_potyk.py", line 112, in download_yt_as_mp3
yt.streams.filter(only_audio=True).order_by("abr").first().stream_to_buffer(buffer)
^^^^^^^^^^
File "/function/code/pytubefix/__main__.py", line 534, in streams
self.check_availability()
File "/function/code/pytubefix/__main__.py", line 305, in check_availability
raise exceptions.LoginRequired(video_id=self.video_id)
pytubefix.exceptions.LoginRequired: [91mSpH83KzVKDc requires login to view [0m
From my tests, it doesn't seem to be a problem with the library, it seems to be something with the video itself
Well it happens with any video like for that video https://www.youtube.com/watch?v=K4TOrB7at0Y:
Traceback (most recent call last):
...
File "/function/code/apps/bot_tg_potyk.py", line 112, in download_yt_as_mp3
yt.streams.filter(only_audio=True).order_by("abr").first().stream_to_buffer(buffer)
^^^^^^^^^^
File "/function/code/pytubefix/__main__.py", line 534, in streams
self.check_availability()
File "/function/code/pytubefix/__main__.py", line 305, in check_availability
raise exceptions.LoginRequired(video_id=self.video_id)
pytubefix.exceptions.LoginRequired: [91mK4TOrB7at0Y requires login to view [0m
Note that 91mK4TOrB7at0Y is not the actual video ID. The 91m prefix appears to be the same for every video.
Never mind, 91m is part of a reserved string: pytubefix.colors.Color.RED
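For anyone comparing or logging these messages, the colour codes can be removed before the text is inspected. This is a minimal sketch, not part of pytubefix: the helper name strip_ansi is hypothetical, and it assumes the message only contains ordinary SGR colour sequences such as the red/reset pair shown in the log further down.

```python
import re

# Matches SGR colour sequences such as "\x1b[91m" (bright red) and "\x1b[0m" (reset).
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(message: str) -> str:
    """Return the message with ANSI colour escape sequences removed."""
    return ANSI_SGR.sub("", message)


raw = "\x1b[91mK4TOrB7at0Y requires login to view\x1b[0m"
print(strip_ansi(raw))  # K4TOrB7at0Y requires login to view
```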
Could you provide me with the complete code you are testing?
from io import BytesIO
from pytubefix import YouTube

buffer = BytesIO()
(
    YouTube("https://www.youtube.com/watch?v=K4TOrB7at0Y")
    .streams
    .filter(only_audio=True)
    .order_by("abr")
    .first()
    .stream_to_buffer(buffer)
)
print(buffer.tell())
Well, the code works in the local environment, but it doesn't work in the cloud, such as Digital Ocean or Yandex Cloud
YouTube may be blocking your remote IP, try using use_oauth and let us know the results.
Note: Don't use your main account to authenticate, YouTube may ban it.
This may be the problem. With a cloud server there is no way of knowing what network infrastructure the provider has, and something in it could be causing the block.
req.txt:
git+https://github.com/JuanBindez/pytubefix
code:
...
from pytubefix import YouTube
...
@app.task(name='transcribe_youtube_link', bind=True)
def transcribe_youtube_link(self, user_id, youtube_link, language='', isSpeakerDetectionEnabled=False, speakerMode='auto', speakerRange=None):
    runpod.api_key = os.getenv('RUNPOD_KEY')
    endpoint = runpod.Endpoint(os.getenv('RUNPOD_ENDPOINT'))
    task_id = str(self.request.id)
    mp4_file = f"static/{task_id}.mp4"
    new_file = f"static/{task_id}.wav"
    local_url = f"{os.getenv('BACKEND_PUBLIC_URL')}/video/{task_id}.mp4"
    # id: str, user_id: str, youtube_url: str, language: str = "English", is_speaker_detection_enabled: bool = False, speaker_mode: str = "auto", speaker_range: dict = {"min": 2, "max": 5}
    crud_create_task(id=task_id, user_id=user_id, youtube_url=local_url, language=language, is_speaker_detection_enabled=isSpeakerDetectionEnabled, speaker_mode=speakerMode, speaker_range=speakerRange)
    self.update_state(state='Creating')
    print(f"Parameters received: language={language}, isSpeakerDetectionEnabled={isSpeakerDetectionEnabled}, speakerMode={speakerMode}, speakerRange={speakerRange}")
    youtube_name = ''
    try:
        yt = YouTube(youtube_link)
        #print(yt)
        #full_seconds = yt.length
        #youtube_name = yt.title
        audio = yt.streams.filter(file_extension='mp4').first()
        if not audio:
            raise Exception("No suitable audio stream found.")
        status = 'Downloading'
        result = ''
        is_premium = crud_is_user_premium(user_id)
        #seconds = full_seconds if is_premium else min(full_seconds, 300)
        # task_id: str, status: str, result: str, seconds: int, youtube_name: str, updated_at: datetime, chunks=None, language: str = "English", is_speaker_detection_enabled: bool = False, speaker_mode: str = "auto", speaker_range: dict = {"min": 2, "max": 5}
        crud_update_task(task_id=task_id, status=status, result=result, seconds=0, youtube_name=youtube_name, updated_at=datetime.utcnow(), chunks={})
        self.update_state(state=status)
        out_file = audio.download()
        full_seconds = ffmpeg_probe_length(out_file)
        seconds = full_seconds if is_premium else min(full_seconds, 300)
        youtube_name = yt.title
        crud_update_task(task_id=task_id, status='Converting', result=result, seconds=seconds, youtube_name=youtube_name, updated_at=datetime.utcnow(), chunks={})
        self.update_state(state='Converting')
        # ffmpeg command to convert to low-quality mp4
        command = [
            'ffmpeg', '-i', out_file,
            '-b:v', '500k', '-s', '640x360', '-preset', 'fast', '-threads', '0'
        ]
        if not is_premium and full_seconds > 300:
            command.extend(['-t', '300'])  # Limit the output duration to 300 seconds
        command.append(mp4_file)
        subprocess.run(command, check=True)
        os.remove(out_file)
        # ffmpeg command to convert mp4 to wav
        command = [
            'ffmpeg', '-i', mp4_file, '-vn', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '16000', '-threads', '0', new_file
        ]
        subprocess.run(command, check=True)
        file_url = f"{os.getenv('BACKEND_PUBLIC_URL')}/{new_file}"
        crud_update_task(task_id=task_id, status='Transcribing', result=result, seconds=seconds, youtube_name=youtube_name, updated_at=datetime.utcnow(), chunks={})
        self.update_state(state='Transcribing')
        batch_size = 3 if not isSpeakerDetectionEnabled else 3
        run_request = endpoint.run({
            "input": {
                "audio": file_url,
                "batch_size": batch_size,
                "chunk_length": 30,
                "language": language if language else '',
                "diarise_audio": isSpeakerDetectionEnabled,
                "speaker_mode": speakerMode,
                "speaker_range": speakerRange
            }
        })
        result = run_request.output(timeout=60*60*3)
        os.remove(new_file)
        chunks = convert_transcript_format(result)
        chunks = json.dumps(chunks)
        crud_update_task(task_id=task_id, status='Finished', result=result["text"], seconds=seconds, youtube_name=youtube_name, updated_at=datetime.utcnow(), chunks=chunks)
        self.update_state(state='Finished')
        crud_update_user_seconds(user_id=user_id, seconds=seconds, updated_at=datetime.utcnow())
        return result
    except Exception as e:
        #youtube_name = ''
        crud_update_task(task_id=task_id, status='Error', result=str(e), seconds=0, youtube_name=youtube_name, updated_at=datetime.utcnow(), chunks={})
        self.update_state(state='Error')
        raise e
log:
worker_1 | [2024-07-19 13:51:48,071: INFO/MainProcess] Task transcribe_youtube_link[6db8bd6b-8181-4fb4-a472-15f80beed1db] received
backend_1 | INFO: 172.26.0.8:59440 - "POST /api/task HTTP/1.1" 201 Created
worker_1 | [2024-07-19 13:51:48,155: WARNING/ForkPoolWorker-1] Parameters received: language=English, isSpeakerDetectionEnabled=False, speakerMode=auto, speakerRange={'min': 3, 'max': 5}
worker_1 | [2024-07-19 13:51:48,268: ERROR/ForkPoolWorker-1] Task transcribe_youtube_link[6db8bd6b-8181-4fb4-a472-15f80beed1db] raised unexpected: VideoPrivate('\x1b[91mbXzTXD_OJo0 is a private video\x1b[0m')
worker_1 | Traceback (most recent call last):
worker_1 | File "/usr/local/lib/python3.10/dist-packages/celery/app/trace.py", line 451, in trace_task
worker_1 | R = retval = fun(*args, **kwargs)
worker_1 | File "/usr/local/lib/python3.10/dist-packages/celery/app/trace.py", line 734, in __protected_call__
worker_1 | return self.run(*args, **kwargs)
worker_1 | File "/usr/src/app/tasks.py", line 190, in transcribe_youtube_link
worker_1 | raise e
worker_1 | File "/usr/src/app/tasks.py", line 120, in transcribe_youtube_link
worker_1 | audio = yt.streams.filter(file_extension='mp4').first()
worker_1 | File "/usr/local/lib/python3.10/dist-packages/pytubefix/__main__.py", line 564, in streams
worker_1 | self.check_availability()
worker_1 | File "/usr/local/lib/python3.10/dist-packages/pytubefix/__main__.py", line 324, in check_availability
worker_1 | raise exceptions.VideoPrivate(video_id=self.video_id)
worker_1 | pytubefix.exceptions.VideoPrivate: bXzTXD_OJo0 is a private video
I tried with this link https://www.youtube.com/watch?v=bXzTXD_OJo0 just now.
Yesterday, and for a couple of months before that, it worked like clockwork (today I already changed the code so that it does not read the title and duration from the library before the download). I also tried moving the server from the UK to the US and Singapore (and to other IPs), with the same result; it broke within the last 24 hours. I also tried other links, and they are not working today either. It didn't work with the previous release either.
Well, the code works in the local environment, but it doesn't work in the cloud, such as Digital Ocean or Yandex Cloud
Try what @felipeucelli suggested, but if it doesn't work I advise you to investigate their infrastructure from the inside to understand what is causing it.
I see pytubefix.exceptions.LoginRequired: bXzTXD_OJo0 requires login to view after trying with pytubefix==6.3.4rc1. Now I'll try the option with authorization, as advised above.
I tried the rc version and it says f14EJhG3X68 requires login to view, but when I check in incognito mode without any user, the video works.
I solved the problem with: yt = YouTube(youtube_link, use_oauth=True, allow_oauth_cache=True).
In my case, I had a docker-compose deployment with Celery inside it. Before starting, I had to manually go into the relevant container and execute this code, then link a throwaway Gmail account (pretending to be a smart TV). Only after that did it work.
The essence of the linking step is that the code stops and waits for Enter on standard input (inside a Celery worker this is a dead end). You need to open the URL from the console or log in any browser (another device and another country are fine), enter the code from the log, select the Google account to link, and only then press Enter. The code saves these credentials somewhere, so this is not necessary on the second attempt.
Thank you all for your help, I’m ready to tell you my experience in more detail if anyone has similar problems.
@felipeucelli @in4sec-org You were right. The problem was resolved when using oauth credentials. Thanks!
Can someone show all the code with the auth too please?
@celarain just pass the use_oauth=True option like so:
YouTube(
"https://www.youtube.com/watch?v=K4TOrB7at0Y",
use_oauth=True,
)
You will be asked to authorize with Google after running the code
I did that, then a URL appeared to authorize my device. I did that too, and then it gave me the error:
Error downloading video: EOF when reading a line
Well, check whether a tokens.json file was created in the pytubefix installation directory and whether it contains actual tokens, e.g. venv/Lib/site-packages/pytubefix/__cache__/tokens.json
Send bad error report:
Please open https://www.google.com/device and input code FPA-HRL-WTR
Press enter when you have completed this step.
2024-07-19 15:35:08,294 - app.video_downloader - ERROR - Error downloading video: EOF when reading a line
The problem is that I am connecting but if I try it again, it asks for the auth again
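The "EOF when reading a line" message is what Python's input() raises when standard input has nothing to read, which is typically the case for a systemd service or a worker process. The sketch below reproduces that behaviour without pytubefix; prompt_or_fail is a hypothetical helper used only for illustration, and an empty in-memory stream stands in for the service's stdin.

```python
import io
import sys


def prompt_or_fail(prompt: str) -> str:
    """Call input(), turning the bare EOFError into a clearer message."""
    try:
        return input(prompt)
    except EOFError as exc:
        raise RuntimeError(
            "stdin is closed; run the one-time OAuth step from an interactive shell"
        ) from exc


# Simulate a service or worker process whose stdin has nothing to read.
original_stdin = sys.stdin
sys.stdin = io.StringIO("")
try:
    prompt_or_fail("Press enter when you have completed this step.")
except RuntimeError as err:
    outcome = str(err)
finally:
    sys.stdin = original_stdin

print(outcome)
```

If this is the cause, the prompt cannot be answered from the service itself, so the authorization has to be completed once from a process that has a terminal attached.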
I saw this folder, but there is no tokens.json:
/var/www/html/socialmedia/venv/lib/python3.12/site-packages/pytubefix/__pycache__
And there is no __cache__ folder there
Maybe your directory is not writable, but it's definitely not a problem with the lib.
You can check the library code: it should write the tokens.json file to the pytubefix/__cache__ dir:
https://github.com/JuanBindez/pytubefix/blob/7e081b733074a7c030fdc18ea57ae5fb8c04ff17/pytubefix/innertube.py#L380C4-L394C1
Check the permissions of /var/www/ with the ll command; you may have to use chmod to change them.
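The writability question can also be checked from Python, under the same user that runs the service. This is a minimal sketch under stated assumptions: can_write_cache is a hypothetical helper, the __cache__ directory name is taken from the path quoted in this thread, and the function creates that directory as a side effect if it is missing.

```python
import tempfile
from pathlib import Path


def can_write_cache(package_dir: str) -> bool:
    """Return True if a file can be created under <package_dir>/__cache__."""
    cache_dir = Path(package_dir) / "__cache__"
    try:
        cache_dir.mkdir(parents=True, exist_ok=True)
        # Creating and deleting a real file also catches read-only mounts.
        with tempfile.NamedTemporaryFile(dir=cache_dir):
            pass
        return True
    except OSError:
        return False


# Demo against a throwaway directory; point it at the real
# site-packages/pytubefix path on the server instead.
demo_dir = tempfile.mkdtemp()
print(can_write_cache(demo_dir))
```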
I am using version pytubefix==6.3.4rc1. Which one should I use?
this same one, from what I saw everyone has writing permission
The console log says to press continue after logging in, but I am reading this log with sudo journalctl -u socialmedia.service -f -n 30
And I am accepting the permissions from an external site; is that the problem? I have no idea how to do all of this on my external server.
YouTube may be blocking your remote IP, try using use_oauth and let us know the results.
Note: Don't use your main account to authenticate, YouTube may ban it.
If the current IP is banned by Youtube, then it is possible to try pytubefix with a proxy (to get a different IP).
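A sketch of the proxy route, assuming the YouTube constructor accepts a proxies mapping of scheme to proxy URL as the original pytube did. The environment variable name YT_PROXY_URL and both helper names are hypothetical; the pytubefix import is kept inside make_youtube so the mapping helper can be tried on its own.

```python
import os
from typing import Dict, Optional


def proxies_from_env(var: str = "YT_PROXY_URL") -> Optional[Dict[str, str]]:
    """Build a proxies mapping from an environment variable, or None if unset."""
    url = os.getenv(var)
    if not url:
        return None
    # Route both plain and TLS traffic through the same proxy.
    return {"http": url, "https": url}


def make_youtube(link: str):
    """Create a YouTube object that uses the configured proxy, if any."""
    # Imported lazily so the helper above works without pytubefix installed.
    from pytubefix import YouTube

    return YouTube(link, proxies=proxies_from_env())
```

Keeping the proxy address in the environment avoids hard-coding credentials in the task code.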
@celarain, try running the authorization in ipython. If you want to run your code in a container, then you need to do it once inside the container. This is a one-time procedure; you just need a script or an interpreter that can wait for the Enter key. It will ask you to go to the URL (you don't have to do it from this IP), enter the one-time code from the console there, and select the account to link. After this, your normal code on this IP will work.
Thanks, will try it later!
Hello All,
I got the same error. The code works fine locally, but when I deployed it on Heroku it gave this error. I tried using use_oauth as well, but it does not seem to be working.
I am using pytubefix==6.8.1. I tried downgrading the version as well as upgrading to the latest one but the error is still not fixed.
Any help is truly appreciated.
is this issue fixed?
Fixed the login required issue for an AWS EC2 deployed application.
The problem was that it asked for authentication in EC2 after deployment when use_oauth=True was set, as mentioned by @felipeucelli and @potykion. Locally, it worked perfectly.
My local environment:
OS : ubuntu 22.04
Python : 3.10
pytubefix : 6.6.2
The solution as below:
venv/lib/python3.10/site-packages/pytubefix/__cache__/tokens.json
venv/lib/python3.10/site-packages/pytubefix (directory within your project)
Inside the "pytubefix" directory, create a file called "tokens.json" (touch tokens.json) inside the "__cache__" directory.
Use nano tokens.json to open the file, paste the code, save it, and exit.
Now the issue is fixed, and videos are being extracted from YouTube without the "Login required" or "Private video" issues.
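The manual copy can also be scripted. This is a minimal sketch, assuming the token file has already been produced by a successful local authorization and that the library reads it from a __cache__ directory inside the installed package, as the paths in this thread suggest. The helper name install_token_file is hypothetical, and the demo uses temporary directories with placeholder content rather than real tokens.

```python
import shutil
import tempfile
from pathlib import Path


def install_token_file(local_tokens: str, package_dir: str) -> Path:
    """Copy a token file into <package_dir>/__cache__/tokens.json."""
    cache_dir = Path(package_dir) / "__cache__"
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / "tokens.json"
    shutil.copyfile(local_tokens, target)
    return target


# Demo with throwaway paths and placeholder content.
source_dir = Path(tempfile.mkdtemp())
source = source_dir / "tokens.json"
source.write_text("{}")
fake_package_dir = tempfile.mkdtemp()
installed = install_token_file(str(source), fake_package_dir)
print(installed.name)  # tokens.json
```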
But the problem is that it needs to update the tokens.json file when the OAuth token expires. Is there an alternative way of doing that?
@SubodaDabarera
YouTube is working tirelessly to block third-party apps. Update to the latest version of pytubefix and try:
You can create a custom function that works best in your environment and pass it using oauth_verifier, see #190.
You can also try to pass the PoToken which is valid for several days, see #209.
You can also try using a proxy to change your IP.
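A possible shape for the custom function route, as a sketch only. It assumes that oauth_verifier is called with the verification URL and the user code, mirroring the default prompt shown earlier in this thread; check #190 for the actual contract. The factory name make_waiting_verifier and the fixed wait are illustrative choices, and the code in the demo is a placeholder.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("oauth")


def make_waiting_verifier(
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[str, str], None]:
    """Return a verifier that logs the code and waits instead of reading stdin."""

    def verifier(verification_url: str, user_code: str) -> None:
        # Log instead of print so the code shows up in service logs.
        logger.warning("Open %s and enter code %s", verification_url, user_code)
        # Give the operator time to complete the step in a browser.
        sleep(wait_seconds)

    return verifier


# Demo with a stub in place of time.sleep so nothing actually blocks.
calls = []
verifier = make_waiting_verifier(120, sleep=calls.append)
verifier("https://www.google.com/device", "ABC-DEF-GHI")
print(calls)  # [120]
```

Under that assumption it would be passed as YouTube(link, use_oauth=True, allow_oauth_cache=True, oauth_verifier=make_waiting_verifier(120)). A fixed wait is crude: if the code is not entered in time, the authorization will most likely fail and have to be retried.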
Describe the bug
I see pytubefix.exceptions.VideoPrivate: SOME_ID is a private video for all videos from Digital Ocean infra since last night. These are regular publicly available videos and not streams or live videos.
Desktop (please complete the following information):
Additional context
I have a small pet project; before this I downloaded a maximum of 3-5 videos per day.