boramalper / magnetico

Autonomous (self-hosted) BitTorrent DHT search engine suite.
http://labs.boramalper.org/magnetico/
GNU Affero General Public License v3.0

Centralized server proposal #65

Open ngosang opened 7 years ago

ngosang commented 7 years ago

For me it does not make much sense that everyone tries to get all the torrents of the web on their own. Therefore I propose a centralized server where a group of people can collaborate.

New magnetico module

ad-m commented 7 years ago

It overrides the concept of a standalone (autonomous) search engine. The P2P network is strong thanks to decentralization: a free market of trackers, torrent sites, etc.

Maybe we can also track malicious or malfunctioning peers:

2017-05-01 08:09:30,673  DEBUG     Malicious or malfunctioning peer 14.45.170.217:6881 tried send above 10485760 max metadata size
2017-05-01 08:19:16,734  DEBUG     Malicious or malfunctioning peer 14.45.170.217:6881 tried send above 10485760 max metadata size
2017-05-01 08:29:39,000  DEBUG     Malicious or malfunctioning peer 78.15.81.106:6884 tried send above 10485760 max metadata size
2017-05-01 08:45:26,196  DEBUG     Malicious or malfunctioning peer 92.177.127.112:6881 tried send above 10485760 max metadata size
2017-05-01 08:48:18,732  DEBUG     Malicious or malfunctioning peer 88.183.124.100:6881 tried send above 10485760 max metadata size
2017-05-01 08:52:32,627  DEBUG     Malicious or malfunctioning peer 119.202.126.1:6881 tried send above 10485760 max metadata size
2017-05-01 08:59:03,319  DEBUG     Malicious or malfunctioning peer 86.167.90.242:6881 tried send above 10485760 max metadata size
2017-05-01 09:10:59,713  DEBUG     Malicious or malfunctioning peer 177.157.30.164:6881 tried send above 10485760 max metadata size
2017-05-01 09:16:41,543  DEBUG     Malicious or malfunctioning peer 176.120.174.164:6881 tried send above 10485760 max metadata size
2017-05-01 09:19:11,818  DEBUG     Malicious or malfunctioning peer 69.253.39.249:6881 tried send above 10485760 max metadata size
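
As a rough illustration of how such tracking could start, here is a minimal sketch that counts offending peers from DEBUG lines like the ones above. The regular expression is derived only from this log excerpt, and `count_offenders`, `peers_to_block`, and the threshold are hypothetical names and values, not part of magneticod.

```python
import re
from collections import Counter

# Matches the DEBUG lines quoted above; the format is inferred from this excerpt only.
LOG_PATTERN = re.compile(
    r"Malicious or malfunctioning peer (?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)"
)


def count_offenders(lines):
    """Return a Counter mapping peer IP addresses to the number of offending log lines."""
    offenders = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            offenders[match.group("ip")] += 1
    return offenders


def peers_to_block(offenders, threshold=2):
    """Hypothetical policy: flag peers that offended at least `threshold` times."""
    return sorted(ip for ip, hits in offenders.items() if hits >= threshold)


sample = [
    "2017-05-01 08:09:30,673  DEBUG     Malicious or malfunctioning peer 14.45.170.217:6881 tried send above 10485760 max metadata size",
    "2017-05-01 08:19:16,734  DEBUG     Malicious or malfunctioning peer 14.45.170.217:6881 tried send above 10485760 max metadata size",
    "2017-05-01 08:29:39,000  DEBUG     Malicious or malfunctioning peer 78.15.81.106:6884 tried send above 10485760 max metadata size",
]
offenders = count_offenders(sample)
print(peers_to_block(offenders))
```

Counting per IP rather than per IP and port is a design choice here: a misbehaving host may change ports between attempts.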

boramalper commented 7 years ago

First and foremost, all the changes you are suggesting, I think, need to be implemented only in the web module (magneticow), and magneticod can keep operating without any changes.

That said, I am personally against creating one centralized entity, BUT it would be nice to be able to set up hubs where people can gather around, and due to the decentralized nature of magnetico everyone can start one.

This, as you just said, requires some very significant changes in the web interface, and some small changes in magneticod (such as PostgreSQL support). I may implement it, but it will not be on my priority list until I get the magnetico suite to version 1. Besides, the advanced web interface that you just described will be separate from magneticow, as I would like to keep it for individual use cases.

Again, that said, if you (and others) are willing to undertake the task, I will be more than happy to help until my finals are over, and then I can even contribute myself as well!

Join the gitter.im channel to keep in touch! https://gitter.im/magnetico-dev/magnetico-dev

ngosang commented 7 years ago

It overrides the concept of a standalone (autonomous) search engine. The P2P network is strong thanks to decentralization: a free market of trackers, torrent sites, etc.

I'm proposing additional features; the current behavior of magnetico will not be changed.

Maybe we can also track malicious or malfunctioning peers

Sure we can.

First and foremost, all the changes you are suggesting, I think, need to be implemented only in the web module (magneticow), and magneticod can keep operating without any changes.

I was thinking of adding a third module (e.g. magneticoS) for setting up the global server. Standalone users don't have to use this module.

That said, I am personally against creating one centralized entity, BUT it would be nice to be able to set up hubs where people can gather around, and due to the decentralized nature of magnetico everyone can start one.

It would be nice to have a private server with 10 friends and 100 million indexed torrents, like https://btdb.in/. I doubt you can do it alone.

This, as you just said, requires some very significant changes in the web interface, and some small changes in magneticod (such as PostgreSQL support).

You don't need big changes in magneticod, just add a few lines to the persistence.py file to send metadata to the server API. PostgreSQL support would be implemented only in the magnetico server module, and it may not be necessary in the first versions.

boramalper commented 7 years ago

You don't need big changes in magneticod, just add a few lines to the persistence.py file to send metadata to the server API.

I don't get it. Do you want magneticod to communicate (i.e. send messages about the discovered metadata) with "magneticos" you proposed? What is exactly the purpose of magneticos? Is it a much more advanced web interface with RESTful API support etc. or something else?

You can answer the questions above individually, but I think a detailed proposal about how things fit together in the magnetico suite would be much better, and then we can start talking about more practical stuff that can make progress. I think the idea is good, and I'm sure there are people who would love to use it, but we need to have a clear goal first and a technical plan to implement it.

Feel free to join the gitter channel for questions and chat!

ngosang commented 7 years ago

[Image: diagram "dibujo2"]

I don't get it. Do you want magneticod to communicate (i.e. send messages about the discovered metadata) with "magneticos" you proposed?

Yes, but the communication is one-way only (magneticoD to magneticoS): one HTTP POST for each torrent's metadata. Each magneticoD keeps a local database just as it does now, but also sends discovered torrents to a central server. You only have to add about 10 lines to persistence.py and a global constant to make it optional and disabled by default.
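
A minimal sketch of what those few lines might look like, using only the standard library. Everything here is an assumption for illustration: the constant names, the JSON payload shape, and the functions `build_submission` and `share_metadata` do not exist in magneticod, and no endpoint is defined anywhere in this thread.

```python
import json
import urllib.request

# Hypothetical settings; neither name exists in magneticod.
CENTRAL_SERVER_URL = None  # None keeps sharing disabled by default
CLIENT_ID = "anonymous"


def build_submission(info_hash, name, files, url):
    """Build the HTTP POST request carrying one torrent's metadata.

    `files` is a list of (path, size) tuples; the payload layout is an assumption.
    """
    payload = {
        "client": CLIENT_ID,
        "info_hash": info_hash.hex(),
        "name": name,
        "files": [{"path": path, "size": size} for path, size in files],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def share_metadata(info_hash, name, files, url=None, timeout=10):
    """Send one torrent upstream; return False when disabled or when the request fails."""
    target = url or CENTRAL_SERVER_URL
    if target is None:
        return False
    request = build_submission(info_hash, name, files, target)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # urllib.error.URLError and socket timeouts derive from OSError
        return False
```

Failures are swallowed on purpose in this sketch, so that an unreachable server never stops the local crawl.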

What is exactly the purpose of magneticos? Is it a much more advanced web interface with RESTful API support etc. or something else?

It's a web interface to manage data shared by all clients (just a fork of magneticoW). I prefer a different module because most people don't want to set up a server, just run it standalone. To keep things simple, the search function, torrent list and database (sqlite) could be the same as in magneticoW. We need to add an additional POST method to receive the torrent metadata sent by magneticoD, and one more column in the torrent database to store the "user/client" which sent the metadata, for statistical info.
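
The storage side of that POST method could be sketched with sqlite3 as below. The table name, column names, and both functions are hypothetical and do not reflect magneticow's actual schema; this version also keys rows on (info_hash, submitted_by) rather than adding a single column, which is a variation on the idea above.

```python
import sqlite3


def open_database(path=":memory:"):
    """Open the database and create the (hypothetical) submissions table if needed."""
    connection = sqlite3.connect(path)
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS submissions (
            info_hash    BLOB NOT NULL,
            name         TEXT NOT NULL,
            submitted_by TEXT NOT NULL,  -- the "user/client" that sent the metadata
            PRIMARY KEY (info_hash, submitted_by)
        )
        """
    )
    return connection


def store_submission(connection, info_hash, name, submitted_by):
    """Record one POSTed torrent; return True if it was new for this client."""
    with connection:  # commits on success, rolls back on error
        cursor = connection.execute(
            "INSERT OR IGNORE INTO submissions (info_hash, name, submitted_by) "
            "VALUES (?, ?, ?)",
            (info_hash, name, submitted_by),
        )
    return cursor.rowcount == 1
```

Keeping one row per client and torrent means the same torrent reported by two clients is stored twice, which preserves who reported what for the statistics.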

ad-m commented 7 years ago

@ngosang, do we really need centralization and a push model? Maybe we can extend magneticoW to provide a REST API that magneticoS uses to provide unified search results? One magneticoW could be used by multiple magneticoS instances. This ensures the ability to disperse responsibility, which will reduce the legal risk. Attacks on publicly visible services will not endanger the data. Any magneticoS-operator can trust and query any magneticoW server.

[Image: untitled diagram]

Running a magneticoS would be a high-profit service thanks to high traffic, ads, and no content creation cost. Running a magneticoW allows you to study the DHT network.
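
To make the pull model concrete, here is a minimal sketch of how a magneticoS could query several magneticoW instances and unify the results. The REST API does not exist yet, so the network call is left as a pluggable `fetch` function; `federated_search` and the `info_hash`/`name` keys are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(query, instances, fetch):
    """Query every instance via `fetch(instance, query)` and merge results by info hash.

    Instances that raise are skipped, so one unreachable magneticoW
    does not break the whole search.
    """
    def safe_fetch(instance):
        try:
            return fetch(instance, query)
        except Exception:
            return []

    merged = {}
    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map yields results in the same order as `instances`
        for results in pool.map(safe_fetch, instances):
            for torrent in results:
                merged.setdefault(torrent["info_hash"], torrent)
    return list(merged.values())


def fake_fetch(instance, query):
    """Stand-in for a real HTTP call; instance "b" simulates an outage."""
    data = {
        "a": [{"info_hash": "01", "name": "x"}],
        "c": [{"info_hash": "01", "name": "x"}, {"info_hash": "02", "name": "y"}],
    }
    if instance == "b":
        raise ConnectionError("instance unreachable")
    return data[instance]


results = federated_search("x", ["a", "b", "c"], fake_fetch)
print([torrent["info_hash"] for torrent in results])
```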

ngosang commented 7 years ago

do we really need centralization and a push model?

No, but it has a lot of advantages. The most important for me:

1) Search speed.
2) Users don't need to be online all day; magneticoS already has a complete DB.
3) Users can download the full DB from magneticoS and use it locally.
4) MagneticoS doesn't need to know the magneticoW IP of each client; IPs change...
5) A DDoS attack on magneticoS will be propagated to all users.
6) You can have more statistics in magneticoS about users and torrents.
7) As you know all torrents for each user, you can detect/avoid fake torrents and rank results by number of users (verifications).

Yes, I'm thinking of the future.
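
Ranking by verifications (point 7) could be sketched as below, assuming the server keeps (info_hash, user) pairs for every submission. The function name and the `minimum` cut-off are hypothetical.

```python
from collections import defaultdict


def rank_by_verifications(submissions, minimum=1):
    """Rank torrents by how many distinct users reported them.

    `submissions` is an iterable of (info_hash, user) pairs. Torrents reported
    by fewer than `minimum` distinct users are dropped as unverified.
    """
    reporters = defaultdict(set)
    for info_hash, user in submissions:
        reporters[info_hash].add(user)  # a set ignores repeats from the same user
    ranked = [
        (info_hash, len(users))
        for info_hash, users in reporters.items()
        if len(users) >= minimum
    ]
    # Most verified first; ties are broken by info hash for a stable order.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


submissions = [("aa", "alice"), ("aa", "bob"), ("bb", "alice"), ("aa", "alice")]
print(rank_by_verifications(submissions))
```

Counting distinct users rather than raw submissions keeps a single client from inflating a torrent's rank by resubmitting it.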

This ensures the ability to disperse responsibility, which will reduce the legal risk.

We are not doing anything illegal. In both cases the main responsible party will always be the operator of magneticoS, who is providing a public service. Users are never reachable from the Internet.

Any magneticoS-operator can trust and query any magneticoW server.

You can't trust all magneticoW servers... see point 7)

Running a magneticoS would be a high-profit service thanks to high traffic, ads, and no content creation cost. Running a magneticoW allows you to study the DHT network.

This is not my objective and I doubt anyone can earn enough money with this... I want to have a search engine with the most complete information possible, for searching torrents and for studying the DHT network. Working with a team (or alone with several magneticoD instances) you can crawl the DHT network much faster. In my vision magneticoS won't be public; it will be a private site with login/password for registered users. If anyone wants to make changes and run a public site, it's their responsibility.

ad-m commented 7 years ago

@ngosang, I suppose you did not understand the proposed model. I suggest the following roles in the network:

In the proposed model the website user has no risk. The magneticoS operator has high risk because the service is publicly visible, e.g. DDoS or legal action (copyright trolls want to shut down publicly visible services). The author may, for example, display ads, which can compensate for the risk. MagneticoW has low risk and information.

In my opinion, new torrent / DHT services should inherit good DHT / torrent features, including resistance to targeted destabilization.

Hellowlol commented 7 years ago

I think an import function would be nice; there are some public DHT dumps published daily.

heresjonny1 commented 7 years ago

The ability to import or share existing DHT dumps might be cool, but I think a centralized server defeats the whole purpose of this project by creating a single point of failure. The way it is now is much better.

itdaniher commented 7 years ago

Alternatively, we could build out functionality that distributes magnet links as a torrent, offering a 'decentralised' way of aggregating the work of other instances of magnetico.

ghost commented 7 years ago

I say we build a second DHT. Results that have some keyword in them (either selected at random or picked by the user) end up sticky. When you search, you search both your own set and the servers that the DHT returns for the keywords you are looking for.

Discovery of a torrent which matches a keyword you searched for before is referred to the folks on the DHT, because you already have their addresses cached.

This assumes, of course, that discovered torrents have an expiration; hence the sticky operation has some effect.

I think it would be good to let people run it in either non-distributed or distributed mode, because the behaviors of the clients would be so different. We clearly don't want to interrupt its current function.

A way to do it randomly would be to give each user a DHT id; keywords are then hashed to the same bit size and compared to the id. A distance < some amount makes it stick. To avoid keyword abuse, the closeness needed may increase when many keywords are tested for. Then you would not maintain a DHT of keyword trackers but just IDs.
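
A minimal sketch of that last idea, under stated assumptions: keywords are hashed with SHA-1 to 160 bits (the size of BitTorrent DHT node IDs), distance is the Kademlia-style XOR metric, and the anti-abuse rule that tightens the threshold is invented here purely for illustration.

```python
import hashlib

ID_BITS = 160  # SHA-1 digest size, same width as BitTorrent DHT node IDs


def keyword_id(keyword):
    """Hash a keyword (case-insensitively) to a 160-bit integer."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def is_sticky(node_id, keyword, prefix_bits):
    """True when the XOR distance is below 2**(ID_BITS - prefix_bits).

    Equivalently, the top `prefix_bits` bits of the node ID and keyword hash match.
    """
    distance = node_id ^ keyword_id(keyword)
    return distance < (1 << (ID_BITS - prefix_bits))


def required_prefix_bits(base_bits, keywords_tested):
    """Hypothetical anti-abuse rule: one more matching bit per doubling of tested keywords."""
    extra = max(keywords_tested, 1).bit_length() - 1
    return min(ID_BITS, base_bits + extra)
```

With `prefix_bits = n`, a random keyword sticks to a given node with probability 2**-n, so `base_bits` controls how many keywords each node ends up responsible for.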