Closed haidog-yaqub closed 10 months ago
Thank you for opening the issue.
Are you referring to the following line? https://github.com/MorenoLaQuatra/audiocaps-download/blob/dbe83b56ef97fc82143e78d99614ec30cb0135e7/audiocaps_download/Downloader.py#L295
How did you discover the bug?
Yes, I checked the length of the data downloaded by your repo and found that some clips are very short, so I checked them manually and found the problem. I also tried yt-dlp on the command line, but the issue persists. It appears to be a yt-dlp bug.
The way I solved it was to download the entire audio and then chunk it with other tools.
Thank you so much for raising the issue. Can you suggest any other tool to use? In your opinion, would downloading everything with yt-dlp and then cutting it manually (say, with torchaudio) solve the issue?
Yes, downloading the entire audio and cutting it with torchaudio works.
Thank you again. Just to verify if I'm correctly identifying the problem, do you think this code will solve the issue?
# Download the file using yt-dlp
# os.system(f'yt-dlp -x --audio-format {self.format} --audio-quality {self.quality} --output "{target_file_path}" --postprocessor-args "-ss {start_seconds} -to {end_seconds}" https://www.youtube.com/watch?v={ytid}')

# Download the ENTIRE audio file
os.system(f'yt-dlp -x --audio-format {self.format} --audio-quality {self.quality} --output "{target_file_path}" https://www.youtube.com/watch?v={ytid}')

# now manually cut the audio file
try:
    waveform, sample_rate = torchaudio.load(target_file_path)
    waveform = waveform[:, int(start_seconds * sample_rate):int(end_seconds * sample_rate)]
    torchaudio.save(target_file_path, waveform, sample_rate)
except Exception as e:
    print('Error loading audio file: ', target_file_path)
    print(e)
    # delete file if it exists
    if os.path.isfile(target_file_path):
        # delete file
        os.remove(target_file_path)
I did an automated check of a batch of 1000 files and the duration seems to be correct.
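For anyone who wants to repeat a duration check like this, here is a minimal sketch. It assumes the clips were saved as PCM WAV files so that the standard-library `wave` module can read them; the function names, the 10-second expected length, and the 0.5-second tolerance are illustrative choices, not part of the repo.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def find_short_clips(paths, expected_seconds=10.0, tolerance=0.5):
    """Return (path, duration) pairs for clips shorter than expected.

    expected_seconds and tolerance are assumed values for illustration.
    """
    short = []
    for path in paths:
        duration = wav_duration_seconds(path)
        if duration < expected_seconds - tolerance:
            short.append((path, duration))
    return short
```

Running `find_short_clips` over the download folder gives a list of files worth inspecting by hand. For other formats, the same comparison should work with whatever duration the audio loader reports.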
Yes, I think it should work. You can compare with previous data, especially those short ones.
I will close the issue. Please reopen it if anything else is missing.
I just found that this download method can cause a misalignment problem. For example, the start time in the metadata is 5 seconds, but the downloaded audio actually starts at 10 seconds. This also makes some downloaded clips far shorter than 10 seconds. I suggest downloading the entire audio with yt_dlp and cutting the desired clips locally.
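To illustrate the local cutting step, here is a rough sketch of the index arithmetic only. The helper name `clip_bounds` is hypothetical, and clamping to the length of the audio is my assumption about how out-of-range times should be handled. The returned indices could be used to slice a loaded waveform along its time axis.

```python
def clip_bounds(start_seconds, end_seconds, sample_rate, num_samples):
    """Convert a time window in seconds to sample indices, clamped to the audio.

    Hypothetical helper for illustration; raises ValueError if the window
    falls entirely outside the audio.
    """
    start = max(0, int(round(start_seconds * sample_rate)))
    end = min(num_samples, int(round(end_seconds * sample_rate)))
    if start >= end:
        raise ValueError("requested window lies outside the audio")
    return start, end


# Example: a 30-second recording at 16 kHz, clip from 5 s to 15 s.
start, end = clip_bounds(5, 15, 16000, 30 * 16000)
samples = list(range(30 * 16000))  # stand-in for real audio samples
clip = samples[start:end]
```

Because the cut is computed from the metadata start time against the full recording, the clip begins where the metadata says it should, and its length can be checked as `(end - start) / sample_rate`.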