yacy / yacy_search_server

Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance
http://yacy.net
Other
3.42k stars 428 forks

Feature request: Add support to index peertube #399

Open ghost opened 3 years ago

ghost commented 3 years ago

PeerTube has a pure AngularJS frontend, so its pages can't be indexed without first rendering them with JavaScript. It does, however, have a comprehensive and easily accessible API. Framasoft's own public indexer, SepiaSearch (sepiasearch.org), uses that API to provide a search service across all known instances. YaCy could use the same method to provide an integrated PeerTube search.

Maybe this could be introduced as a plugin so that instances can opt in to PeerTube indexing.
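To make the API-based idea above a bit more concrete, here is a minimal Python sketch of turning one page of a PeerTube video listing into records a crawler could index. It is not YaCy code and not how SepiaSearch is implemented. It assumes the instance exposes the `GET /api/v1/videos` listing with `start` and `count` parameters and returns a JSON object with a `data` array whose entries carry `uuid`, `name`, `description` and possibly `url`; the function names and the `/videos/watch/<uuid>` fallback path are illustrative assumptions.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def videos_api_url(instance, start=0, count=15):
    """Build the listing URL for one page of an instance's videos."""
    query = urlencode({"start": start, "count": count})
    return urljoin(instance.rstrip("/") + "/", "api/v1/videos") + "?" + query


def to_index_docs(instance, payload):
    """Convert one API response page into simple records a crawler could index."""
    docs = []
    for video in payload.get("data", []):
        # Prefer the URL supplied by the API; otherwise fall back to a watch
        # path built from the UUID (assumed path layout, may differ by version).
        url = video.get("url") or f"{instance.rstrip('/')}/videos/watch/{video['uuid']}"
        docs.append({
            "url": url,
            "title": video.get("name", ""),
            "text": video.get("description") or "",
        })
    return docs


def fetch_page(instance, start=0, count=15):
    """Fetch one page of videos over HTTP (needs network access)."""
    with urlopen(videos_api_url(instance, start, count), timeout=10) as resp:
        return json.load(resp)
```

Something like `to_index_docs("https://tilvids.com", fetch_page("https://tilvids.com"))` would then yield URL, title and text records without rendering any JavaScript. Paging through `start` until the reported total is reached is left out here.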

virtadpt commented 3 years ago

It is also possible to add an instance's Media RSS or Atom feeds to YaCy as a timed indexing job. For example, https://peertube.sunknudsen.com/feeds/videos.xml
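For anyone curious what such a feed gives an indexer to work with, here is a rough Python sketch that pulls the title and link out of each entry. It assumes the feed is plain RSS 2.0 with un-namespaced `<item>`, `<title>` and `<link>` elements; an Atom feed would need different element names. The function name is made up for illustration, and this is not how YaCy parses feeds internally.

```python
import xml.etree.ElementTree as ET


def feed_links(feed_xml):
    """Return (title, link) pairs for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    pairs = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:  # skip entries that carry no link to index
            pairs.append((title, link))
    return pairs
```

Each returned link is a video page that a timed job could hand to the crawler on its next run.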

ghost commented 3 years ago

Will YaCy discover that naturally, or is that a manually entered address? Say YaCy just follows links and is allowed to crawl other domains: how will it discover that feed?

I pointed YaCy to https://tilvids.com and it stopped there without finding the feed.

virtadpt commented 3 years ago

YaCy does not automatically discover RSS feeds, but it is possible to add indexing tasks that pull a particular feed and index any new links present. I have a blog post that describes how to do this here: https://drwho.virtadpt.net/archive/2017-11-07/technomancer-tools-yacy/
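Since discovery is not automatic, one way to find a feed address before adding it by hand is to look for feed autodiscovery tags in a page's HTML. The Python sketch below is a stand-alone helper, not a YaCy feature. It assumes the site advertises its feeds with `<link rel="alternate" type="application/rss+xml">` or the Atom equivalent; whether a given PeerTube instance does so in its static HTML is not confirmed here. The class and function names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs advertised through <link rel="alternate"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {key: (value or "") for key, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html, base_url):
    """Return absolute feed URLs found in the given HTML text."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

If this returns nothing for an instance, the feed path has to be known in advance, as with the `/feeds/videos.xml` example mentioned earlier in the thread.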