HelloZeroNet / ZeroNet

ZeroNet - Decentralized websites using Bitcoin crypto and BitTorrent network
https://zeronet.io

Important questions (security, performance, sustainability, ...) #772

Closed dumblob closed 7 years ago

dumblob commented 7 years ago

I'm aware of the statement that ZeroNet is not a replacement for the current client<->server based model. But independently of that, I'd like to ask the following questions as an immediate reaction to an article about ZeroNet I read. I consider these questions very important for basically any to-be-successful P2P project (they might find their place in the FAQ or even directly in the documentation).

  1. How is data locality ensured (in parallel with non-local persistence, to avoid the data's disappearance)? It's nonsensical to have everything everywhere (like torrents) and forever. Besides that, based on measurements by big companies (Google, Facebook, Twitter, etc.), most of the web is served to mobile or embedded devices, and these usually have only tiny and often volatile storage.

  2. Is it possible to locally and dynamically filter data (e.g. someone travels and wants to be sure that, in the country he/she is currently in, no one will find any persisted data which does not comply with the local legislation)? At least it should be possible to easily and manually tag content which must not be persisted.

  3. Is the content cached securely and smartly (e.g. ZeroNet measures statistics about network congestion and considers constraints set up by the administrator - e.g. maximum storage size, maximum bandwidth used for up/down, etc.) to minimize overall pressure and to increase availability and stability? Or is it just a naive implementation of the torrent protocol (i.e. opening thousands of connections one by one, and thus effectively DDoSing small SOHO devices with NATs, which make up the vast majority of all leaf nodes of the internet network infrastructure)?

  4. Does it work stably and without any issues absolutely without any existing trackers? This is closely related to the first question, as uTP, PEX, DHT, Local Peer Discovery, etc. technologies for finding peers are easily blockable, which is also what happens in reality.

  5. Is it possible to download all content in logical blocks (e.g. script, image, HTML, ...) to allow prioritization, parallelism, partial downloads and partial viewing (including the wishes of the user - e.g. I want to block all pictures, so I decide not to load them until I enter the "album" section of the web)? Or is it rather a "git-like" monolithic design, where everything is divided into same-sized, mutually unrelated blocks which are only then downloaded in parallel from different peers (which, by the way, has the consequence of long waiting times for the whole content to load without the possibility of at least a preview/shortened_version/smaller_version, and at the same time is highly prone to starvation, as each block is requested from only one or very few seeds which are of course blocked by state censorship)?

  6. How is data reuse ensured? And how about multicast data streaming? It's a core feature for big networks, streaming (audio, video, real-time games, ...), and things like jQuery, D3.js, Node.js or Video.js rely on well-behaving and high-throughput CDNs.

  7. Is there any full-featured active scraper (possibly distributed across ZeroNet to avoid its disappearance or unavailability in censored countries) of all the content in ZeroNet? This is an absolute must-have and a strong requirement for massive spreading of the network. An index engine using e.g. the Namecoin database might be a good start.

  8. What exactly prevents the direct use (i.e. without any architectural changes) of existing rich web applications (full of JS, server-client communication, etc.) in ZeroNet? Could you list all the points?

  9. How about fine-grained, unlimited control over user access? Because all data is replicated, everybody can read it, but in case only part of it is public and the rest is specific to a certain group of people, how do I make absolutely sure no one else can read it? Do I really need to encrypt the whole page/application (with its whole history - ouch, I already feel the size pain) and build my own sign-in solution which will then decrypt only certain parts of the downloaded data based on the access rules built into the page/application? How about the current bottleneck of one and only one site owner managing the access rights (currently it seems only full rw access can be granted) to the site owner's page/application (I want to make sure my forum will stay fully active and functional even after the state starts to censor my forum and the forum users)?

  10. How does one set up a secure, fully functional bidirectional (to and from the "usual" internet) gateway for the case where I want to access ZeroNet from a device that has no ZeroNet client (which currently approaches 100% of all devices and will probably stay so for a long time)? Is there a very detailed howto anywhere?

  11. How does ZeroNet fight against peers and seeds appearing on various blacklists from the "usual" internet?

HelloZeroNet commented 7 years ago
  1. If you visit a site it can download up to 10MB of data required by the site (source code, database). If you want to have more data than that, then the site owner is able to define optional files that are only downloaded when a client requests them (see the content.json sketch after this list). Using merger sites you can also split the database into multiple sites (e.g. by category, date, language, etc.), so you will only download and receive updates for the data you are interested in. The example for this is the Hubs for ZeroMe.
  2. There is no content censorship filter and I don't see any easy solution for that. Who is going to decide what content is problematic?
  3. Sites have a size limit, connections have CPU time and re-connection throttles, and content updates are also limited, but there are probably lots of attack vectors left (it is especially hard on the Tor network, where every connection comes from 127.0.0.1).
  4. Peers are stored locally, so the next time you fire up your client it can connect to them (and query other IPs via PEX) without any trackers. Local peer discovery and DHT are planned later.
  5. Content download is prioritized by type (HTML, JS first), by freshness (newest posts first) and by browser GET requests. If the images are marked as optional files (via the "optional" pattern shown after this list), they will not be downloaded until you request them.
  6. There is no data re-use yet. The latest web frameworks produce a single build.js file that has everything packed in, which makes it impossible to share between sites. And having a central site of JS libraries is also against the decentralization idea.
  7. Not that I know of.
  8. There is no backend, so you can't render HTML server-side, but if you have a single-page application that has its storage layer well separated, then I think it's possible to re-use it. Basically you can execute database queries directly from JavaScript (see the sketch after this list), and you have to keep in mind that all data is public.
  9. Currently you have to encrypt the parts you want to keep secret. You can give write permissions based on site directories, but nothing more.
  10. Basically you have to start your client with --ui_ip "*", then it will be accessible to anyone on your public IP (enabling the Multiuser plugin is also recommended).
  11. You can use Tor to hide your IP.
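
For points 1 and 5, here is a minimal sketch of how a site owner could mark files as optional in the site's content.json; the address, hashes and paths are placeholders and most keys are trimmed, but the "optional" value is a regex matched against file paths:

```json
{
  "address": "1ExampleSiteAddressPlaceholder",
  "title": "Example site",
  "optional": "(data/img/.*|videos/.*)",
  "files": {
    "index.html": {"sha512": "…", "size": 1234},
    "js/app.js": {"sha512": "…", "size": 5678}
  }
}
```

Files matching the "optional" pattern are then fetched only when the browser or the site's JavaScript actually asks for them.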
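
And for point 8, a rough sketch of what "database queries directly from JavaScript" looks like through the ZeroFrame API; the table and column names here are hypothetical and would be defined by the site's own dbschema.json:

```javascript
// Site-side JavaScript; ZeroFrame.js is shipped with the site and loaded by the wrapper.
var zeroFrame = new ZeroFrame()

// Query the local SQLite database that ZeroNet builds from the users' signed JSON files.
// "comment" is a hypothetical table from this site's dbschema.json.
zeroFrame.cmd(
    "dbQuery",
    ["SELECT body, date_added FROM comment ORDER BY date_added DESC LIMIT 10"],
    function (rows) {
        console.log("Latest comments:", rows)  // every row here is public data
    }
)
```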

I hope this answers your questions :)

dumblob commented 7 years ago

Thanks for quick answers @HelloZeroNet .

  1. The question was rather about pruning and trying to avoid full duplication of the "whole internet" (all ever-visited sites - i.e. thousands at the very least), and it also asked about data locality (I don't want to download the content from someone in Europe if I'm sitting in Australia and a few hundred Australians have already visited the original website from Europe).

  2. In many legislations there are e.g. blacklists of websites with censored content, or other identifiers technically describing the censored content (so to answer your question - the "who" is the state). Should there be any trial, all the data will either be easily accessed (because of saved passwords), or decrypted (because of weak ciphering, some mistake, or simply access to enormous or breakthrough computing resources), or obtained through torture or whatever - so it's way better not to have this data persisted at all. These countries include Germany (huge audio and video prohibition - just try any German VPN on YouTube), the Czech Republic (online gambling), the USA (various...), Australia, China, Turkey, Sweden, ...

  3. Any plans or ways to improve it? Just react briefly, as I know this fine-tuning is tedious and a lot of work.

  4. Great to hear there are some plans!

  5. Are all these logical pieces downloaded from the same source or from different ones? If from different ones, how is starvation prevented?

  6. It's not at all about having centralized data. It's about very smart redistribution of common data, so that it is quickly accessible locally (e.g. from the same country or state). This can be achieved through data-flow monitoring in the whole network and automated redistribution, through some ranking, through smart caching, etc. Is anything for data reuse on the roadmap?

  7. That's sad - no one will know about any content and no one will get interested, except for those few with a strong external motivation (usually money - i.e. the black market).

  8. Any plans on providing such a "backend" (I can imagine a faster variant of the universal Ethereum computing platform)?

  9. Any plans to extend it with more fine-grained control? Are roles (i.e. groups of users having the same permissions) actually supported already?

  10. Could this public gateway be easily protected with a password? I don't want everyone using my gateway :wink:.

  11. OK, in case Tor is blocked (e.g. Syria, Turkey, etc.), is there any other way to "hide" yourself?

HelloZeroNet commented 7 years ago
  1. ZeroNet sites are designed to run locally: if you don't have the data, you can't query/search in it. Currently the peers are picked randomly.
  2. It could be possible to add a plugin that displays a warning if you are going to visit, or are already seeding, a site that many users find problematic.
  3. Sometimes I add new limitations, but it's impossible to add efficient protection against a botnet with thousands of machines (or, like I said, on the Tor network where there are no IP addresses).
  4. .
  5. Parts are downloaded from different peers. When you visit a site it starts up 10 workers and each begins downloading different files. If they run out of tasks (<10 tasks remain), then after some timeout (based on file size) they start to download the same files from multiple peers and the first one wins.
  6. I don't see it as an important problem, so it's not planned yet.
  7. .
  8. No plans for a backend; for security reasons all logic should run in the browser.
  9. No plans for more detailed permissions yet.
  10. Yes, you can enable the UiPassword plugin for that (see the command sketch below).
  11. You can use a VPN and there are other ways to get around the block (see: Tor pluggable transports).
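
For point 10 here (and point 10 in the previous round), a rough sketch of how such a password-protected public gateway could be started; the plugin directory names and the plugins' extra options may differ between releases, so treat this as an assumption and check the plugin READMEs:

```sh
# Enable the bundled plugins by dropping the "disabled-" prefix
# (naming convention used by the ZeroNet source tree; verify against your release):
mv plugins/disabled-Multiuser  plugins/Multiuser
mv plugins/disabled-UiPassword plugins/UiPassword

# Make the web UI listen on every interface instead of 127.0.0.1 only:
python zeronet.py --ui_ip "*" --ui_port 43110
```
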
dumblob commented 7 years ago

OK, now it's clear to me what the direction of ZeroNet is. I must admit I'm a bit disappointed, but on the other hand I know very well how difficult it is to build a well-behaving P2P network serving universal purposes.

By the way, could you please edit your last comment and add some foo strings to the points on which you didn't comment, as otherwise the numbering does not match :wink: (Markdown is number-agnostic)?

HelloZeroNet commented 7 years ago

Yes, it's still very new and limited, but I think we should search for use-cases where it could work instead of focusing on what it is not good for.

Thanks for the questions (added the dots)

dumblob commented 7 years ago

> but I think we should search for use-cases where it could work instead of focusing on what it is not good for.

Of course. I'll keep an eye on ZeroNet and its use. Keep going!