mattwright324 / youtube-comment-suite

Download YouTube comments from numerous videos, playlists, and channels for archiving, general search, and showing activity.
MIT License
275 stars 46 forks

Explain (or let user decide) what happens when comment/video/channel gets deleted #112

Open page200 opened 2 years ago

page200 commented 2 years ago

Thank you for the wonderful program!

If it downloads a comment, and then the comment or video or channel gets deleted, and then the user clicks "Refresh" in order to download newer comments, what happens in the database with the comments that don't exist on YouTube anymore? Since one of the main purposes of the program is archiving, it should keep the previously downloaded comments (does it do that?) (or let the user decide), and maybe add a "deleted from YouTube" tag in the database.

On a similar note, what does the option "Overwrite comments and channels if already downloaded" do? Is this about re-downloading a comment whose text has been edited by the commenter? If overwriting channels is disabled, are newer comments not fetched?

On a similar note, if a video that is saved in the local database gets unlisted on YouTube, will clicking "Refresh" download its newest comments? That would be good (or let the user decide).

mattwright324 commented 2 years ago

Hi @page200,

Those are all good questions!

If a comment, video, or channel is deleted or becomes inaccessible in any way, then yes, what you previously downloaded on a refresh will stay in the database. It will never disappear unless you clear the database, delete the data for the group (item) that the comments were associated with, or delete the entire database file.
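To illustrate the retention behaviour described above, here is a minimal sketch using an in-memory SQLite database. The table layout, column names, and the `refresh` helper are assumptions made up for this example and are not taken from the program's actual code; the only point is that a refresh which inserts and never deletes keeps comments that have since vanished from YouTube.

```python
import sqlite3


def refresh(conn, fetched):
    """Store the comments returned by a refresh; existing rows are never deleted."""
    conn.executemany(
        # INSERT OR IGNORE skips rows whose comment_id is already stored.
        "INSERT OR IGNORE INTO comments (comment_id, video_id, text) VALUES (?, ?, ?)",
        fetched,
    )
    conn.commit()


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (comment_id TEXT PRIMARY KEY, video_id TEXT, text TEXT)"
)

# First refresh: two comments exist on YouTube.
refresh(conn, [("c1", "v1", "first"), ("c2", "v1", "second")])

# Second refresh: c1 has been deleted on YouTube, c3 is new.
refresh(conn, [("c2", "v1", "second"), ("c3", "v1", "third")])

stored_ids = [
    row[0] for row in conn.execute("SELECT comment_id FROM comments ORDER BY comment_id")
]
print(stored_ids)  # c1 is still archived locally
```

Because nothing in the refresh path issues a `DELETE`, the only ways to lose `c1` in this sketch are the explicit ones listed above: clearing the table or removing the database.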

It is not as efficient to determine the status of whether a comment is deleted or not. In some circumstances, such as the video being made private, deleted, or marked for kids, I could easily determine that any previously downloaded comments for that particular video are no longer accessible. However, determining the deleted-or-not status of the unknown, potentially large number of comments a person may have downloaded could use up too much API quota to be worth it.
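A rough back-of-the-envelope sketch of the quota concern follows. The numbers are assumptions for illustration and should be checked against the current YouTube Data API documentation: one quota unit per list call, 50 comment ids per call, and a daily allowance of 10,000 units. The function name is hypothetical.

```python
import math

# Assumed values, not confirmed by this thread.
ASSUMED_IDS_PER_CALL = 50
ASSUMED_UNITS_PER_CALL = 1
ASSUMED_DAILY_QUOTA = 10_000


def estimate_quota_units(comment_count,
                         ids_per_call=ASSUMED_IDS_PER_CALL,
                         units_per_call=ASSUMED_UNITS_PER_CALL):
    """Estimate the quota needed to re-check every stored comment id once."""
    calls = math.ceil(comment_count / ids_per_call)
    return calls * units_per_call


units = estimate_quota_units(1_000_000)
days_of_quota = units / ASSUMED_DAILY_QUOTA
print(units, days_of_quota)
```

Under these assumptions, re-checking a million archived comments would take about two full days of quota before any new comments are fetched, which is the trade-off being described.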

When the "Overwrite comments and channels if already downloaded" option is enabled, if the comment text, the comment author's channel name, or the comment author's profile image URL has changed since the last time it was downloaded, it will be overwritten with the latest version that you just grabbed. There is no history of what has been changed, though.
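The effect of the option can be sketched as a choice between "ignore if present" and "replace if present". This is only an illustration under assumed names: the single-table schema, the column names, and `save_comment` are hypothetical, and the real program may store comments and channels differently.

```python
import sqlite3


def save_comment(conn, comment, overwrite):
    """Store a comment; with overwrite=True, replace an existing row with the latest data."""
    verb = "INSERT OR REPLACE" if overwrite else "INSERT OR IGNORE"
    conn.execute(
        f"{verb} INTO comments (comment_id, text, author_name, author_image_url) "
        "VALUES (:comment_id, :text, :author_name, :author_image_url)",
        comment,
    )
    conn.commit()


def stored_text(conn, comment_id):
    row = conn.execute(
        "SELECT text FROM comments WHERE comment_id = ?", (comment_id,)
    ).fetchone()
    return row[0]


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "comment_id TEXT PRIMARY KEY, text TEXT, author_name TEXT, author_image_url TEXT)"
)

original = {"comment_id": "c1", "text": "original text",
            "author_name": "Alice", "author_image_url": "https://example.com/a.png"}
edited = dict(original, text="edited text")

save_comment(conn, original, overwrite=False)
save_comment(conn, edited, overwrite=False)
text_without_overwrite = stored_text(conn, "c1")  # first download is kept

save_comment(conn, edited, overwrite=True)
text_with_overwrite = stored_text(conn, "c1")  # latest download replaces it
print(text_without_overwrite, "->", text_with_overwrite)
```

In both modes, comments that are not yet in the database are still inserted; the flag only decides what happens to rows that already exist. As noted above, the replaced values are not kept anywhere in this sketch either.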

As for unlisted, it depends. If you specified just a channel in the group, it will grab and use only the currently public videos for that channel, so it would stop tracking a now-unlisted video, as it would no longer come back in the API for that channel. This seems to be an oversight and something I could add to the process, checking every id grabbed in the past for a channel and not just what comes back today. However, if you specify the specific video that is unlisted, or a playlist that the unlisted video is in, then it will still get its comments just fine.
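The possible fix mentioned above, checking every id grabbed in the past and not just what comes back today, amounts to taking the union of two id sets. A minimal sketch, with a hypothetical function name and made-up video ids:

```python
def videos_to_refresh(previously_seen_ids, ids_from_api_today):
    """Combine ids stored from earlier refreshes with ids the channel listing returns now."""
    return sorted(set(previously_seen_ids) | set(ids_from_api_today))


# "vidB" was public during an earlier refresh and has since been unlisted,
# so it no longer appears in today's channel listing.
previously_seen = ["vidA", "vidB"]
from_api_today = ["vidA", "vidC"]

to_refresh = videos_to_refresh(previously_seen, from_api_today)
print(to_refresh)
```

With the union, `vidB` stays in the refresh list even though the channel listing no longer returns it. Whether its comments can still be fetched then depends on the video itself: unlisted videos remain reachable by id, while private or deleted ones would not.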

page200 commented 2 years ago

> what you previously downloaded on a refresh will stay in the database.

Perfect, thanks! Maybe the "Refresh" dialog should say something like "The Refresh will NOT propagate deletions of comments/videos/channels into the local database." or simply "The Refresh does NOT delete anything from the local database."

Maybe even a more appropriate word than "Refresh" can be found (inspired by what similar operations are called in DropSync or FreeFileSync or rsync). Maybe something like "Fetch more".

> It is not as efficient to determine the status of whether a comment is deleted or not.

Due to the extra cost, as far as I'm concerned, determining that isn't necessary.

> When the "Overwrite comments and channels if already downloaded" option is enabled, if the comment text, the comment author's channel name, or the comment author's profile image URL has changed since the last time it was downloaded, it will be overwritten with the latest version that you just grabbed. There is no history of what has been changed, though.

Maybe the text should say "commenters' profiles" or "commenters' channels" instead of "channels" to avoid confusion with the channel that posted the video.

> checking every id grabbed in the past for a channel and not just what comes back today.

Sounds good. Low priority for me.