district0x / district-proposals

Proposals for new districts to be built by the district0x Team.
https://vote.district0x.io/

DP #133: Find0x - A Decentralized, Community Controlled Search Engine

Open batmanscode opened 7 years ago

batmanscode commented 7 years ago

Name

Find0x

Purpose

A decentralized search engine that everyone (i.e. the community) controls.

Major advantages over current systems:

  1. Open and transparent search and ranking algorithms
  2. Censorship resistant
  3. Rules can be voted on (like how much DNT a certain type of advertising should cost)
  4. Features will be created based on user proposals (new ideas can be proposed and voted on)
  5. Not controlled by a single entity

Description

My vision is for this to be better than Google: more transparent and controlled by its users.

A web crawler can be used for indexing. The index can be stored (on IPFS or Storj) and relevant results returned for each search.
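To make the crawl, index, and search flow above concrete, here is a minimal sketch of an inverted index, which is the usual data structure behind keyword search. It is only an illustration: the `pages` dictionary stands in for hypothetical crawler output, everything lives in memory, and nothing here touches IPFS or Storj.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each token to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in tokenize(text):
            index[token].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every token in the query, sorted by URL."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        matches &= index.get(token, set())
    return sorted(matches)


# Hypothetical crawl output (an assumption for illustration): URL -> page text.
pages = {
    "https://example.org/a": "Decentralized search engine on IPFS",
    "https://example.org/b": "Centralized search engine",
    "https://example.org/c": "IPFS storage guide",
}
index = build_index(pages)
print(search(index, "search engine"))
print(search(index, "ipfs search"))
```

In a decentralized setup the index itself would have to be split up and stored somewhere like IPFS or Storj, which this sketch does not attempt.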

Open source ranking and reputation algorithms will have to be created.
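The proposal does not say which ranking algorithm would be used. As one possible starting point, here is a rough sketch of PageRank, a well-known link-based ranking method, computed by power iteration. The link graph, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph (an assumption for illustration).
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Because the code is this small, anyone could audit it or propose changes to parameters such as `damping`, which fits the idea of open and votable ranking rules.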

Advertising features like AdWords can be built in, where DNT is charged and paid back as revenue to the district users, with a higher portion of DNT revenue paid to contributors.

A reporting system can be used to identify malicious sites.

Thank you for having a look at my idea :) I'm not too sure of the exact details on how search engines work, so this could be entirely infeasible, but in any case I'd love for there to be a good decentralized search engine!

Would really appreciate feedback.

Slack user: batman 0x682879D9d7DD1e3bBD3d77b9ac82e86F640e85a9

ghost commented 7 years ago

This is a very ambitious proposal and I like the idea of it. Doing this in a decentralized way will obviously be hard (like most projects), but why not start thinking about it! A couple of questions come to mind in terms of the technology needed to support this, and I may comment again as I think of more.

  1. What would be the mechanism that serves up the search query/index? Currently ElasticSearch and Solr are leaders in this space and open source, but you would need something like this to at the very least read the indexes saved in IPFS or Storj and serve them to the end user via the front end.

  2. To provide the engine of the actual search query, would there need to be a system similar to Golem that allows nodes to serve up excess resources and get rewarded for allowing the network to execute queries through them?

  3. The matching algorithm that something like Google has is always getting better, and they are able to best serve results based on a user's browsing history/account history. Perhaps this model would not take that into consideration but rather serve up what is best for most people.

  4. Do you envision this being more like Duck Duck Go, where no search histories are recorded and the focus is on anonymity, or are you more interested in search results that the user would prefer (i.e. collecting search histories)?

rongomaib commented 7 years ago

If everyone's browser had a plugin that would analyze the data and upload structured data to a datastore somewhere, that could work.

Users would get free use of the search engine. Everything would get anonymised.

There would need to be some kinda whitelist/blacklist so people's emails don't get added to the search data.
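As a rough sketch of the filtering step such a plugin might run before uploading anything, the snippet below skips pages on blocked hosts, strips the query string and fragment from the URL, and redacts email addresses from the text. The host names, the record layout, and the `prepare_record` function are hypothetical, and a real plugin would need much more thorough anonymisation than this.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical blocklist of hosts whose pages should never be indexed.
BLOCKED_HOSTS = {"mail.example.com", "bank.example.com"}

# Simple pattern for email addresses; it will not catch every valid form.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def prepare_record(url, title, text):
    """Return an anonymised record for upload, or None if the page is blocked."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in BLOCKED_HOSTS:
        return None
    # Rebuild the host part without any embedded username or password.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Drop the query string and fragment, which often carry user data.
    clean_url = urlunsplit((parts.scheme, netloc, parts.path, "", ""))
    return {
        "url": clean_url,
        "title": EMAIL_PATTERN.sub("[email removed]", title),
        "text": EMAIL_PATTERN.sub("[email removed]", text),
    }


print(prepare_record("https://mail.example.com/inbox", "Inbox", "hello"))
print(prepare_record(
    "https://example.org/post?id=7&user=bob#top",
    "Post",
    "Contact alice@example.org for details",
))
```

A whitelist would work the same way in reverse: only upload when the host is in an approved set.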

batmanscode commented 7 years ago

Thank you @basiccrypto, and yes, exactly! This will definitely be hard to accomplish but, as you said, why not start thinking about it.

To answer your questions:

  1. Solr and ElasticSearch are both great, but I'm not too sure how they can be integrated with a blockchain. Here is an interesting article on a search mechanism I came across: http://jonathanpatrick.me/blog/elastic-ethereum

  2. I hadn't really thought of that, thank you for bringing it up! That seems like a good idea, however wouldn't that mean a user has to pay a fee for every search?

  3. & 4. It could work both ways; I think that should be up to the individual user. Users who want more personal results could maybe get that by using an account with a Civic identity, or maybe even a plugin like @imdying mentioned. And users who prefer not to have their search history analysed could simply not use it and receive results in a way similar to DuckDuckGo.

ghost commented 7 years ago

@imdying that is a really good idea. It's something like IOTA's mechanism of doing work for each transaction, but in this case you are actually indexing the web as you browse it and helping build the search index at the same time.

@aaqilaziz really cool that someone is starting to think about search on the blockchain. That link is informative and makes the point that using oracles to do this type of indexing might be necessary, which is a pretty good idea!

For #2, yes, my initial thought would be that you may have to pay a minimal fee to search, but the idea presented above, which lets you use search for free while also helping index and rank the sites you visit, would be incredible!

Still a lot of work to flesh out on the technical side. From a district perspective it would also be really cool to partner with an ad district to help drive traffic and revenue to the search engine. Do you think there would be a district coin?

nybblexyz commented 7 years ago

At the advent of internet 1.0, there was a proliferation of human-edited directories whose entries became a fertile source of content for search engines. Web rings also come to mind. The keyword portals became a direct navigation tool for finding info.

I was going to submit a proposal until I saw this. I'm not sure if it fits here but I'll give it a try.

Small businesses still have difficulty understanding the decentralized world. Many are still living on the cloud. What if there were a directory service that bridged them from WWW to DWEB, where they could have a business profile created (think neocities.org, the reincarnation of GeoCities)? This is basically along the same track as how internet 1.0 was developed. But this time, there is an opportunity to develop it better with the use of internet of things domains (GTLD).

What if the district offered a curated directory service that, as members sign up, automatically captures info into their industry, where it can be seen in: .businesses.---, .products.---, .supplies.---, .services.---, .arts.---, .category.---

The subdomains could be leveraged into their associated districts (e.g. School.supplies.---, insurance.services.--- or metal.arts.---). DNT or any other token can be used. Would something like this help?

nybblexyz commented 7 years ago

I can expound more on the curation architecture.