josephrcox / easy-reddit-downloader

Simple headless Reddit post downloader
MIT License
70 stars, 9 forks

MP4's are downloaded without any sound. #87

Open Cubres opened 5 months ago

Cubres commented 5 months ago

This tool is absolutely amazing, but I have a small issue with it: none of the MP4 videos (or any other video format I have tried) have any sound. They are downloaded as video only. I checked whether the original Reddit posts have audio, and there were posts that DID have audio, were hosted on Reddit's servers, and still got downloaded without any audio whatsoever. It would be amazing if you could fix this issue, thanks :)

josephrcox commented 5 months ago

Hey! Thanks for this report. Can you link one or two that you aren't hearing audio on so we can investigate?

Cubres commented 5 months ago

Hello! I tried downloading the top posts from all time from r/sadposting (https://www.reddit.com/r/sadposting/top/?t=all), and, upon investigating, I saw that these videos do have sound on the original but don't when I download them. I am on a Mac.

josephrcox commented 5 months ago

@Cubres So I did a bit of digging and found that the core problem is that Reddit's public API only gives us the MP4 link without audio. I couldn't find a way to get the audio version, and this is likely something proprietary that we will struggle to solve.

This is a similar problem to the #31 problem, where Reddit uses very interesting tech on the frontend to stop users from scraping all of the needed data for archiving.

For YouTube video downloads, we use a tool called FFmpeg, which works great, but unfortunately it doesn't work for fetching Reddit video links.

This needs a more long-term fix by hopefully someone who has some interesting workaround that I haven't found yet.
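To illustrate where the video-only link comes from, here is a minimal sketch of pulling it out of a post's JSON. The field names (`secure_media`, `media`, `reddit_video`, `fallback_url`) are assumptions about the shape of Reddit's listing JSON and are not confirmed anywhere in this thread; `videoFallbackUrl` and `samplePost` are hypothetical names used only for this example.

```javascript
// Sketch only: the object shape below is an assumption about Reddit's
// listing JSON, not something confirmed in this thread.
function videoFallbackUrl(post) {
    // Prefer secure_media, fall back to media if it is missing
    const media = post?.data?.secure_media ?? post?.data?.media;
    const video = media?.reddit_video;
    // Return null when the post carries no Reddit-hosted video
    return video?.fallback_url ?? null;
}

// Hypothetical sample post, using the video URL logged later in this thread
const samplePost = {
    data: {
        secure_media: {
            reddit_video: {
                fallback_url: 'https://v.redd.it/p8mc0yjzu4s41/DASH_1080?source=fallback',
            },
        },
    },
};

console.log(videoFallbackUrl(samplePost));
```

If that assumption holds, the URL returned here points at a video-only stream, which would explain the silent downloads.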

Cubres commented 5 months ago

Thank you for the quick response :) I hope that somebody finds a way to fix this issue, but it really seems that the Reddit API only provides the video files. I too haven't found any solution to this yet other than manually downloading the files.

git9875 commented 1 month ago

I think I might be onto a solution here. I put this URL into the download_post_list.txt file and logged the downloadURL:
https://www.reddit.com/r/nextfuckinglevel/comments/fyy4l1/the_greatest_shot_in_television_ever/
downloadURL = https://v.redd.it/p8mc0yjzu4s41/DASH_1080?source=fallback

Then I altered the URL to:
https://v.redd.it/p8mc0yjzu4s41/audio
It produces audio-only output with the content-type video/mp4. Once both files are downloaded, they can be merged using ffmpeg(), as on line 976 for YouTube videos.
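The URL rewrite described above could be sketched as a small helper. This is only an illustration of the pattern observed for this one post: `audioUrlFor` is a hypothetical name, and whether every Reddit-hosted video exposes an `audio` file at this path is an assumption.

```javascript
// Sketch: swap the last path segment of a v.redd.it video URL for "audio"
// and drop the query string. Only the single URL pair reported in this
// thread is known to follow this pattern.
function audioUrlFor(videoUrl) {
    const url = new URL(videoUrl);
    url.search = ''; // remove "?source=fallback"
    url.hash = '';
    const directory = url.pathname.slice(0, url.pathname.lastIndexOf('/') + 1);
    url.pathname = directory + 'audio';
    return url.toString();
}

console.log(audioUrlFor('https://v.redd.it/p8mc0yjzu4s41/DASH_1080?source=fallback'));
// → https://v.redd.it/p8mc0yjzu4s41/audio
```

Parsing with `URL` rather than slicing the raw string keeps the result correct even if the query string ever contains a slash.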

git9875 commented 1 month ago

Here is the fix that worked for me. I'm only using the download_post_list.txt file, not subreddits. There seems to be a problem with downloadMediaFile() calling checkIfDone(postName) twice.

// export FFMPEG_PATH="/path/to/ffmpeg"

// in downloadPost(post)
if (fileType == 'mp4') {
    downloadAndMergeVideoFiles(downloadUrl, filePath, post.name);
} else {
    downloadMediaFile(downloadUrl, `${downloadDirectory}/${filePath}`, post.name);
}

// ...

async function downloadAndMergeVideoFiles(videoUrl, videoFileName, postName) {
    const videoFilePath = `${downloadDirectory}/${videoFileName}`;
    const videoDownload = downloadMediaFile(videoUrl, videoFilePath, postName);

    // Derive the audio-only URL by swapping the last path segment for "audio"
    const audioUrl = videoUrl.substring(0, videoUrl.lastIndexOf('/') + 1) + 'audio';
    const audioFileName = videoFileName.replace('.mp4', '-audio.mp4');
    const audioFilePath = `${downloadDirectory}/${audioFileName}`;
    const audioDownload = downloadMediaFile(audioUrl, audioFilePath, postName);

    await Promise.all([videoDownload, audioDownload]);

    const mergedFileName = videoFileName.replace('.mp4', '-merged.mp4');
    const mergedFilePath = `${downloadDirectory}/${mergedFileName}`;
    log(`merging audio and video into:  ${mergedFileName}`, false);

    // Merge audio and video using ffmpeg
    ffmpeg()
        .input(videoFilePath)
        .input(audioFilePath)
        .output(mergedFilePath)
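        .on('end', () => {
            console.log('Download complete');
            // Remove temporary audio and video files
            fs.unlinkSync(audioFilePath);
            fs.unlinkSync(videoFilePath);
        })
        // Assumes ffmpeg() is fluent-ffmpeg, as the chained API suggests
        .on('error', (err) => {
            // Keep the temporary files so a failed merge can be inspected
            console.error(`ffmpeg merge failed: ${err.message}`);
        })
        .run();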
}
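
As a possible variation on the merge step, the two files could be combined by calling the ffmpeg binary directly with stream copy (`-c copy`), which avoids re-encoding. This is a sketch under assumptions, not the project's implementation: `buildMergeArgs` and `mergeWithoutReencoding` are hypothetical names, and it assumes both downloads contain streams that can be copied into an MP4 container as-is.

```javascript
const { execFile } = require('node:child_process');

// Path to the ffmpeg binary; reads the FFMPEG_PATH variable exported above,
// falling back to whatever "ffmpeg" resolves to on PATH.
const ffmpegBinary = process.env.FFMPEG_PATH || 'ffmpeg';

// Build the ffmpeg command-line arguments for a copy-only merge.
function buildMergeArgs(videoPath, audioPath, outputPath) {
    return [
        '-y',              // overwrite the output file if it exists
        '-i', videoPath,   // input 0: video-only file
        '-i', audioPath,   // input 1: audio-only file
        '-map', '0:v:0',   // take the first video stream from input 0
        '-map', '1:a:0',   // take the first audio stream from input 1
        '-c', 'copy',      // copy streams without re-encoding
        outputPath,
    ];
}

// Run ffmpeg and resolve with the output path, or reject on failure.
function mergeWithoutReencoding(videoPath, audioPath, outputPath) {
    return new Promise((resolve, reject) => {
        const args = buildMergeArgs(videoPath, audioPath, outputPath);
        execFile(ffmpegBinary, args, (error) => {
            if (error) reject(error);
            else resolve(outputPath);
        });
    });
}
```

Inside downloadAndMergeVideoFiles() this could be used as `await mergeWithoutReencoding(videoFilePath, audioFilePath, mergedFilePath);`, with the temporary files deleted only after the promise resolves.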