Closed Grodou closed 4 years ago
@Grodou Hi, great work! Thank you for contributing. I have few suggestions and questions:
YoutubeVideo
+VideoDetails
and YoutubePlaylist
+PlaylistDetails
is fine, but PlaylistVideo
(which extends AbstractVideoDetails
) is counterintuitive as for me. I don't know better naming for now, need to think, mb PlaylistItem
or PlaylistVideoDetails
X-YouTube-Client-Version", "2.20200720.00.02"
and do we need to update them from time to time, if yes where I should look for them?@sealedtx Hi, thank you for the review. Here come the answers:
PlaylistVideo
(which extendsAbstractVideoDetails
) is counterintuitive as for me. I don't know better naming for now, need to think, mbPlaylistItem
orPlaylistVideoDetails
Completely agree. I think PlaylistVideoDetails
is better.
Why do you follow C89 variable declaration style? It's 2020 now, you don't need to declare variables at the beggining of methods :)
Old school style I guess :)
Please explain what is "continuation" of playlist, does it mean that playlist is too big to display in a single webpage and we need to load more?
Exactly, 100 videos max. Scrolling to the bottom loads 100 more, and so on. All videos are directly loaded for now, a paging system could be handy.
What are those hardcoded values
X-YouTube-Client-Version"
,"2.20200720.00.02
" and do we need to update them from time to time, if yes where I should look for them?
Just a part of playlist.py. The actual client version we should use is available in the ytInitialData.responseContext
part of the response.
(for the record, it's also useful for faster downloads, maybe another PR)
For the "X-YouTube-Client-Name"
, I don't know, but both are mandatory.
What does "unstable" comment mean in your getVeryLongPlaylist?
That playlist is often updated, we can't do reliable tests on its length or content.
Still in progress, next step is to make a PlaylistVideo as easy to use as a YoutubeVideo (format selection, downloads, ...).
@Grodou Thank you for the answers.
Old school style I guess :)
Please, make sure you follow current code base style (even if it is not perfect), for easier maintenance in future.
The actual client version we should use is available in the ytInitialData.responseContext part of the response.
So mb we should implement getOrDefault logic for these values (search them in response, if not found - use default hardcoded)?
next step is to make a PlaylistVideo as easy to use as a YoutubeVideo (format selection, downloads, ...).
Not sure if you should focus on this. I believe that there is no simple way but just YoutubeDownloader.getVideo(id)
for all of items in playlist which will be an overhead if user want only metadata about playlist items. I think current api might be sufficient to get playlist info (vidoe elements with videoId/title/thumbnail) and later it is possible to get YoutubeVideo if you need to download only some of videos. Correct me if I am wrong.
@sealedtx Hi, here are some answers and suggestions.
Please, make sure you follow current code base style (even if it is not perfect), for easier maintenance in future.
It wasn't clear, but of course I will.
So mb we should implement getOrDefault logic for these values (search them in response, if not found - use default hardcoded)?
Looks like the client version is always in the response, no need for hardcoded value.
I think current api might be sufficient to get playlist info (vidoe elements with videoId/title/thumbnail) and later it is possible to get YoutubeVideo if you need to download only some of videos. Correct me if I am wrong.
I agree, it is sufficient. However, functionalities like these could be handy:
formats()
and subtitles()
methods, no overhead if you're only interested in metadatasplaylist.downloadByItag(File outDir, int itag)
, downloadVideosWithQuality(File outDir, VideoQuality videoQuality)
)I made a version of it, and since code explains it better than I could, I will soon commit so you can tell if it's interesting or not. And if not, I will of course take care of all the cleanings.
@sealedtx Here is the version with download available, main changes are listed below.
Video hierarchy
AbstractVideo<T extends AbstractDetails>
YoutubeVideo<VideoDetails>
PlaylistVideo<PlaylistVideoDetails>
PlaylistVideo
lazy loading: formats()
and subtitles()
trigger a load action.
We can force a single video to do so with PlaylistVideo.fetch()
.
We can force all videos to do so with Playlist.fecthVideos()
.
If an error occurs, formats and subtitles are empty, the initial exception is accessible via getFetchError()
Other ways to do:
YoutubeVideo
method signatures remain unchangedYoutubeException
, and change YoutubeVideo methodsExceptions
YoutubeException.DownloadUnavailableException
superclass
LiveVideoException
for live videosRestrictedVideoException
for playlist private and deleted videosTests
I noticed a strange behaviour: a video can have a specific available format at a time, and a few seconds later, it's gone. Unit tests take care of it and should success any time. I didn't find any stable playlist with deleted and/or private videos to test.
Let me know what you think.
@Grodou Hi, sorry for a long response, I feel busy last days. First of all thank you for a great work! And again suggestions:
YoutubePlaylist
:
I believe that it should contain only these methods:
details()
videos()
findVideoById()
findVideoByIndex()
(BTW: is it possible that video index is not same as list index? Or may be we can sort them and remove this method?)findVideos()
(I like your Predicate<> filter and findFormat()
looks cool too, but may be it should be findFormats()
and return list of formats, same as findVideos()
)fetchVideos()
(Not sure about this, because if one fetch fails it will skip subsequent ones)I hardly imagine that some one is looking for a videos in playlist only with specific itag or quality. Even so it could be achieved with findVideos()
.
I think that we should remove download
methods from YoutubePlaylist
, because it creates additional hierarchy: playlist->videos->formats.
I see user's steps as:
Moving last four items into a single one makes code difficult to read and user won't understand what is under the hood (additional http requets), IMHO.
PlaylistVideo
:
You should not keep reference to Parser
inside it. Better to pass it inside fetch(parser)
only once it is needed.
Storing fetchError
also looks not pretty. I think subtitles()
and formats()
should return empty list or throw runtime exception if not fetched yet, choose the one you like more. BTW: you may use Collections.emptyList()
instead of Collections.EMPTY_LIST
and it won't require @SuppressWarnings("unchecked")
.
PlaylistVideoDetails
:
Could you please explain what is isPlayable
and when it can be false
?
Packages:
Need to think about package refactoring. If we have playlist
package mb we should add video package and move there YoutubeVideo
and VideoDetails
.
Waiting for your comments and thoughts. Thank you!
@sealedtx Hi, thank you for the kind comments, and sorry it took me some time to answer.
Since it's a central part of this version, I think we should agree about it first.
You should not keep reference to
Parser
inside it. Better to pass it insidefetch(parser)
only once it is needed.
If that so, lazy loading is broken. And then:
YoutubeDownloader
playlistVideo.fetch(downloader.getParser())
becomes useless since we could just use downloader.getVideo()
PlaylistVideo
class can be removed, format selection and download features as wellIn my opinion, this widely used design pattern is completely adapted to this case: it loads data only when it's needed (if ever), and hides any underlying process.
Please feel free to tell if you don't want it, I'll keep the version in a branch anyway, just in case. Hope this helps to make a final choice about it, so we can get closer to a final version!
2 questions I can answer for now:
is it possible that video index is not same as list index? Or may be we can sort them and remove this method?
Video index starts at 1, not 0. Videos seem to be always sorted.
findVideoByIndex(index)
could simply return videos().get(index - 1)
(after a basic boundary check)
Could you please explain what is
isPlayable
and when it can befalse
?
A false
value means the video is either private or deleted
@Grodou Thank you for the answers.
I understand what is lazy loading pattern and why it might be applied here. But I I doubt about hiding from user additional http request, because I know that this library is used by android devs. Due to lazy loading it is not clear that formats()
or subtitles()
requires http request, as I know network request may freeze UI Thread on android or even throw an Exception (because it is not allowed to work with network on main thread). I need to think about this, give me few days, please.
findVideoByIndex(index) could simply return videos().get(index - 1) (after a basic boundary check)
I think in such case this method can be omitted.
Still waiting for your thoughts about shrinking YoutubePlaylist
api. Thank you.
@sealedtx Hi, thank you for the helpful informations.
I was wondering why you seemed both interested and reluctant, and just wanted to move on. Sorry about that, please take your time.
I think in such case this method can be omitted (
findVideoByIndex(index)
)
Totally agree.
Still waiting for your thoughts about shrinking
YoutubePlaylist
api
I wanted to know about lazy loading first, because there's a lot to say, as you can see. So let's assume playlist videos still have formats.
fetchVideos()
(Not sure about this, because if one fetch fails it will skip subsequent ones)
Just a poor implementation of mine, it should better catch exceptions and return a list of successfully fetched videos or at least the count of it.
may be it should be
findFormats()
and return list of formats, same asfindVideos()
video.findFormat()
is only there for playlist.download(outDir, filter)
video.findFormats()
could be useful indeed, but would not replace findFormat()
playlist.findFormats()
returning a set of formats available for all videos would be great! (see next answer)I hardly imagine that some one is looking for a videos in playlist only with specific itag or quality.
If you're interested by a whole music playlist so you want all the audio tracks, we could then assume that:
So you could either:
playlist.findFormats()
methodand download them all!
Even so it could be achieved with findVideos().
Absolutely, but then, you still have to loop over videos, select the same format each time, and download. The download feature is an utility, not an obligation. Any user will always be free to follow the 5 steps process you suggest.
2 more ideas for downloads.
We could watch progress (for an async task) with a listener like:
onDownloading(video, progress)
onFinished(video, file)
boolean onError(video, throwable)
: return true
only if you want to keep on downloadingWe could move the whole download logic to the Format
class since:
subtitles
package
A reference to the video details (or at least the title) would be mandatory for the file name.Need to think about package refactoring. If we have
playlist
package mb we should add video package and move thereYoutubeVideo
andVideoDetails
.
Indeed, a dedicated package for "regular" video would make things clearer.
By the way, I think Itag
and Extension
should better belong to the format
package.
That was a lot of topics, sorry for the longest comment ever :) Hope it will be helpful.
@Grodou Hi! Thank you for your response.
I really appreciate your work done, but unfortunately, I decided not to implement (or at least postpone it) lazy loading for playlist video formats, due to reasons mentioned above.
Just a poor implementation of mine
I prefer to keep simple things done well at first, and leave opportiunity to extend functionally later. If we don't know how it should be implemented to work well - remove it for now, and discuss/add in the further updates.
all videos have the same set of formats
It might be wrong. It is common, that different videos may have different set of formats (it may depend on the period when they were uploaded, quality of the source video, etc). And also youtube make changes from time to time.
The download feature is an utility, not an obligation. Any user will always be free to follow the 5 steps process you suggest.
User must be aware about what is happening using this library. As I already mentioned, merging these steps into one can lead to unclear situations, errors may occur during fetching meta, or accessing file system, or connection, or downloading. It would be difficult to build such api that properly handles all these possible errors in a single method. I believe user of this library must be responsible for it.
We could move the whole download logic to the Format class since:
Agree. But I don't want to do changes which break backwards compatibility. There is a lot of work to be done about refactoing, including classes naming, packages structure, shrinking useless methods, extending api, fixing api and code style. But I want it to be released in the 3.x.x version. I will create "refactoring" branch later. I like most of your ideas and you are welcomed to contribute!
By the way, I think Itag and Extension should better belong to the format package.
Itag - yes, but Extension contains also subtitles formats, not only video. As I said, not in this PR, sorry for starting discussion about refactoring here.
So... in conclusion... sorry for being not very clear: In this PR I want to see the most simple functionality for playlist meta extraction, but enough to be useful. It should cover the most common use cases, and it should be clear for user where and which errors may occur. We may discuss further improvements (but no breaking changes in version 2.x.x) after the feature will have already been released and useful for people.
sorry for the longest comment ever :)
Don't worry, my comment is also not very compact :)
@sealedtx Hi, that's good news! I think it's a very good decision.
So, in order to keep the API as clear and simple as possible, I think:
YoutubePlaylist
should only contain the details, and a list of PlaylistItem
PlaylistVideoDetails
should change to PlaylistItem
, keeping the hierarchy of AbstractDetails
(as you suggested in your first comment)PlaylistVideo
classLike this, things are clear, an item is not a video, and access is more simple: downloader.getVideo(item.videoId())
.
Tell me if that's OK for you.
@Grodou Sounds good. But let it be PlaylistVideoDetails
, to understand that this class contains fields of AbstractVideoDetails
@sealedtx OK, should the YoutubePlaylist
field be items
anyway ? or videos
?
@Grodou let it be videos
;)
@sealedtx Hi, it's done.
Just a few comments:
YoutubePlaylist.findVideos(filter)
still there, YoutubeVideo.findFormats(filter)
addedAbstractVideoDetails.checkDownload()
remains tooHope that's fine.
@Grodou Perfect! Is everything ready to merge?
@sealedtx Hi, I forgot one little line... Ready to merge now!
Probably not the right place to ask, but I'd like to make a PR for faster downloads. Is it relevant? Should I create an issue first?
@Grodou Thank you.
Is it relevant? Should I create an issue first?
For sure, and yes, it would be better to explain your thoughts in the issue.
Get playlist details and all videos. No download for now.