NickvisionApps / Parabolic

Download web video and audio
https://flathub.org/apps/details/org.nickvision.tubeconverter
GNU General Public License v3.0
1.04k stars 56 forks source link

Resume Partial Downloads on Retry #828

Closed rugk closed 1 month ago

rugk commented 2 months ago

Use case

I am in a train wifi/mobile network downloading videos in Germany, so obviously very unstable internet. Just had the issue the download stopped at 99%, broke and I had to re-download everything again.

Solution

Let me resume my download.

AFAIK other downloaders like Firefox Video download helper (add-on/browser extension) can do this.

DaPigGuy commented 1 month ago

I don't believe any work is needed for this. yt-dlp has a --continue argument which resumes partial downloads and is enabled by default. However, the --force-overwrites argument which Parabolic enables by default passes the --no-continue argument. After disabling the "Overwrite Existing Files" option and restarting a nearly complete download, it appears to correctly resume and quickly finishes downloading.

image

nlogozzo commented 1 month ago

Yeah however we wouldn't want users to turn off "Overwrite Existing Files" just to be able to resume downloads.

Maybe Parabolic can detect a .part file with the same download name and if it exists don't pass the overwrite files that way it resumes instead...

nlogozzo commented 1 month ago

Maybe Parabolic can detect a .part file with the same download name and if it exists don't pass the overwrite files that way it resumes instead...

Implemented this.

DaPigGuy commented 1 month ago

Maybe Parabolic can detect a .part file with the same download name and if it exists don't pass the overwrite files that way it resumes instead...

Implemented this.

The file path being checked is not always correct. For example, if the format being downloaded is an mp4, the part file would be [save file name].f[format id from yt-dlp --list-formats].mp4.part and there are probably some other cases outside of YouTube

nlogozzo commented 1 month ago

So maybe a better algorithm would be:

for file in folder:
    if file.extension == ".part" and file.startsWith(saveFileName):
        shouldResume = true;
        break;
DaPigGuy commented 1 month ago

Not a big fan of searching entire directories.

Just checked Soundcloud / Bandcamp / Vimeo which seem to match YouTube, so checking for

should catch hopefully all cases

nlogozzo commented 1 month ago

Okay sounds good. For now this should work...until #857 where the extension won't always be known again 🙃

DaPigGuy commented 1 month ago

To clarify, the [extension] is the extension of the format being downloaded (webm or mp4 for YouTube) from the service, so the extension selected by the user is irrelevant and shouldn't be an issue :slightly_smiling_face:

nlogozzo commented 1 month ago

is the extension of the format being downloaded from the service

Ah well that doesn't get saved anywhere...

nlogozzo commented 1 month ago

And we'd need to do 2 checks for services like Vimeo where both an audio and video format can be downloaded separately and then merged like YouTube

nlogozzo commented 1 month ago

I think this will simply do the trick and be fast enough:

//Check if a download should be resumed
bool shouldResume{ false };
for(const std::filesystem::directory_entry& entry : std::filesystem::directory_iterator(m_saveFolder))
{
    if(entry.path().extension() == ".part" && entry.path().filename().string().find(m_saveFilename) == 0)
    {
        shouldResume = true;
        break;
    }
}
DaPigGuy commented 1 month ago

And we'd need to do 2 checks for services like Vimeo where both an audio and video format can be downloaded separately and then merged like YouTube

This seems like a better approach still than searching the entire directory. Extension should be easy to track in the Format model

nlogozzo commented 1 month ago

Extension should be easy to track in the Format model

True.

But what about when a format is not selected, i.e. Best Quality for video or audio. Then we have no format information what so ever because it is left to yt-dlp to decide the format to pick....

DaPigGuy commented 1 month ago

Extension should be easy to track in the Format model

True.

But what about when a format is not selected, i.e. Best Quality for video or audio. Then we have no format information what so ever because it is left to yt-dlp to decide the format to pick....

Ah, that's unfortunate. As an alternative to scanning all files in a directory which might be problematic for large folders+old hardware, you could instead check if the part files for any possible format exists, but this would require passing all formats into the DownloaderOption model, and only as a fallback if the format is not explicitly set.

nlogozzo commented 1 month ago

Hmmm querying std::filesystem::exists for 20 times for 20 different formats or searching through the directory at the end of the day...they are both slow :(

I wish yt-dlp had a way to name part files or just stayed consistent...without needed all the extra format information.

nlogozzo commented 1 month ago

Ah, that's unfortunate. As an alternative to scanning all files in a directory which might be problematic for large folders+old hardware, you could instead check if the part files for any possible format exists, but this would require passing all formats into the DownloaderOption model, and only as a fallback if the format is not explicitly set.

I will sleep on this tonight and see what I think in the morning

nlogozzo commented 1 month ago

Okay so I will now store the extension in the Format model.

And thus I've implemented the check as such:

//Check if a download should be resumed
bool shouldResume{ false };
std::vector<std::string> potentialFilenames;
potentialFilenames.reserve(3);
potentialFilenames.push_back(m_saveFilename + ".part");
if(m_videoFormat)
{
    potentialFilenames.push_back(m_saveFilename + ".f" + m_videoFormat->getId() + "." + m_videoFormat->getExtension() + ".part");
}
if(m_audioFormat)
{
    potentialFilenames.push_back(m_saveFilename + ".f" + m_audioFormat->getId() + "." + m_audioFormat->getExtension() + ".part");
}
for(const std::string& filename : potentialFilenames)
{
    if(std::filesystem::exists(m_saveFolder / filename))
    {
        shouldResume = true;
        break;
    }
}

Last thing to do is find a way to get all Format models for a download to DownloadOptions for it to search those as well in the case where a format is not explicitly selected by the user.

nlogozzo commented 1 month ago

@DaPigGuy Okay I will now save the available formats as well in the DownloadOptions.

This is the final result of the check implementation:

bool DownloadOptions::shouldDownloadResume() const
{
    if(std::filesystem::exists(m_saveFolder / (m_saveFilename + ".part")))
    {
        return true;
    }
    for(const Format& format : m_availableFormats)
    {
        if(std::filesystem::exists(m_saveFolder / (m_saveFilename + ".f" + format.getId() + "." + format.getExtension() + ".part")))
        {
            return true;
        }
    }
    return false;
}

No need to check m_videoFormat and m_audioFormat again because they will be included in m_availabelFormats.

nlogozzo commented 1 month ago

V2024.10.2 is here with a fix for this issue!