Closed MarvAmBass closed 7 years ago
Sounds good. We'll think about that.
Why not use youtube-dl?
youtube-dl alone would not provide all information we'd need.
But there is something else. Since the YoutubeService is written in Java, and all later services will also be written in Java, we could only provide the service/extractor library for Java, not for other languages. So if we wrote our backend in a language like JS or Python, which can easily be used from other languages, we could even make our backend language-independent.
I personally think adding cross-language support is an unnecessary complexity.
It would require (to my knowledge) a cross-language data exchange format, such as Google's Protocol Buffers. JSON could work, but would be slower to parse. Plus maintaining code written in arbitrary languages will require more work than a series of site parsers written in the same language. Also, what is Android's Python support like?
Having said that, leveraging youtube-dl's codebase directly (and piggybacking off their extensive site support) sounds very enticing. Although if they don't provide all the video metadata we need, that wouldn't work.
I don't see any advantages in multi-language support - I don't even get why somebody would suggest something like this for a maintainable project :stuck_out_tongue_winking_eye:
Why is there a need for any description language? I just meant to extract the YouTube parts (and those of future media streaming services) out of this app/project into their own projects, like youtube-webcrawl-api and vimeo-webcrawl-api.
The benefit would be that you surely aren't the only one needing a crawler like this as a Java library, so other people using it would contribute to, for example, the youtube-webcrawl-api.
Web crawlers can be pretty hard to maintain when YouTube makes changes to their site. So this would make the whole project more stable without tying up too many human resources in the crawlers.
Well, XML isn't a programming language, it's more of a bloated markup language :smile: On the other hand, you are supposed to use these XML files for the GUI (coding the GUI in Java gets pretty ugly), and the SDK/libraries take care of all the business logic behind it.
Implementing that kind of business logic is a pretty hard and ugly job. And yeah, you're absolutely right - it would be crazy :tongue:
For coding crawlers it's good to have a very, very small code base. And it's good to use a versatile web client.
Actually, I like the 2 YouTube crawling classes here. They are clean, readable, and do only what's needed for the job.
Other good things for Crawling:
Java -> HTMLUnit Python Perl WWW Mechanize
I really like the idea of a YouTube (and others) site scraping API as a separate project.
<cynicism>
However, the nature of the Android ecosystem being what it is, I imagine a dedicated maintainer (or maintainers) would have their hard work regularly stolen and a cheap UI slapped on top, for a quick profit on Google Play.
</cynicism>
On the plus side, as long as it was kept very small, I can't see this being too difficult to do, depending on how regularly YouTube (and everyone else) makes breaking changes to their site(s).
okay I bet you're right about the stealing - but if it's open source it will help many other open source projects.
On the other hand, maybe the most useful application will be made by a thief :smile:
What is missing from youtube-dl? Maybe they might add it
Well, youtube-dl only extracts the stream URL of a given video link. It does not provide a search engine, and it can't handle playlists and channels. So we'd still need to implement these things for each service if we really wanted to use youtube-dl. And for the extractor part alone, which youtube-dl does provide, I think it's just too much work to port it over to Android.
Actually, youtube-dl does support downloading playlists. I used this myself the other day. But as said before, trying to execute Python on Android would be a major hassle.
We can probably look to their work for techniques on scraping from YouTube for some stuff, maybe. Although I think we've already got most of the hard stuff implemented, thank you @thescrabi :)
Their code might contain some pointers for other sites, when that comes up.
Actually, I wrote NewPipe by looking at the source code of youtube-dl xD.
Hey there,
I've just skimmed your code and noticed that all the YouTube parsing happens inside this project.
Why don't we move this into its own library? It would be much easier to maintain, and there might be other users who use it and help keep it up to date. The same goes for all the other media sites you plan to support.
It would be easy to extend each of these libraries to parse channels, playlists, search and filter stuff, and this app would only use these libraries as the API for each video service.
What do you think about it?
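To make the idea a bit more concrete, here is a rough sketch of what such a per-service library API could look like. Everything in it is hypothetical: the names `StreamingService`, `VideoInfo` and `FakeService`, the fields, and the example.com URLs are made up for illustration and are not taken from the existing code. The fake implementation only returns canned data; a real one would fetch and parse the site's pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ExtractorSketch {

    /** Minimal metadata the app might need (hypothetical fields). */
    static final class VideoInfo {
        final String pageUrl;
        final String title;
        final String streamUrl;

        VideoInfo(String pageUrl, String title, String streamUrl) {
            this.pageUrl = pageUrl;
            this.title = title;
            this.streamUrl = streamUrl;
        }
    }

    /** Hypothetical contract that each site-specific library would implement. */
    interface StreamingService {
        String name();

        /** Resolves a video page URL to its metadata and stream URL. */
        VideoInfo extract(String pageUrl);

        /** Returns the videos matching a search query. */
        List<VideoInfo> search(String query);
    }

    /** Stand-in with canned data, so the sketch runs without network access. */
    static final class FakeService implements StreamingService {
        private final List<VideoInfo> catalog = new ArrayList<>();

        FakeService() {
            // Made-up entries; a real service would scrape these from the site.
            catalog.add(new VideoInfo("https://example.com/v/1", "Cat video",
                    "https://example.com/stream/1"));
            catalog.add(new VideoInfo("https://example.com/v/2", "Dog video",
                    "https://example.com/stream/2"));
        }

        @Override
        public String name() {
            return "fake";
        }

        @Override
        public VideoInfo extract(String pageUrl) {
            for (VideoInfo v : catalog) {
                if (v.pageUrl.equals(pageUrl)) {
                    return v;
                }
            }
            throw new IllegalArgumentException("Unknown video: " + pageUrl);
        }

        @Override
        public List<VideoInfo> search(String query) {
            // Case-insensitive substring match on the title.
            String needle = query.toLowerCase(Locale.ROOT);
            List<VideoInfo> hits = new ArrayList<>();
            for (VideoInfo v : catalog) {
                if (v.title.toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(v);
                }
            }
            return hits;
        }
    }
}
```

The app would then only talk to the `StreamingService` interface, and each site (YouTube, Vimeo, ...) could live in its own project and be fixed independently when the site changes.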