1 TByte of seeding - Githubissues

synctext commented 11 years ago

Ability to seed 1 TByte of content using Tribler.

This requires announcing, say, 1000 swarms of content in the DHT. This is a problem, as shown here: http://blog.libtorrent.org/2012/01/seeding-a-million-torrents/ The announce interval needs to be prolonged in order to reduce DHT announce traffic. Perhaps the DHT cannot handle this and a new peer discovery method is required: #13.

As a quick partial fix, we can cap the number of swarms that are announced to the DHT at any one time, then use a simple round-robin method to cycle slowly through all available swarms.
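A minimal sketch of the capped round-robin idea, not Tribler code. The class name `RoundRobinAnnouncer`, the `cap` parameter and the use of plain strings as infohashes are all assumptions for illustration; the actual DHT announce call is left out.

```python
from collections import deque
from typing import Deque, Iterable, List


class RoundRobinAnnouncer:
    """Cycle through all swarms, handing out at most `cap` per round.

    Hypothetical helper: the caller is expected to DHT-announce each
    infohash in the returned batch, then wait before asking again.
    """

    def __init__(self, infohashes: Iterable[str], cap: int) -> None:
        if cap <= 0:
            raise ValueError("cap must be positive")
        self._queue: Deque[str] = deque(infohashes)
        self._cap = cap

    def next_batch(self) -> List[str]:
        """Return the next swarms to announce and move them to the back."""
        count = min(self._cap, len(self._queue))
        batch = [self._queue[i] for i in range(count)]
        # Rotating left puts the announced swarms at the end of the queue.
        self._queue.rotate(-count)
        return batch
```

With 1000 swarms and a cap of, say, 50, every swarm gets announced once per 20 rounds, so the round interval sets the effective announce interval per swarm.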

synctext commented 9 years ago

"Performance analysis of a Tor-like onion routing implementation", Quinten Stokkink, Harmjan Treep, http://arxiv.org/abs/1507.00245

@qstokkink student Wouter is aiming to repeat your work and do a CPU performance analysis automagically after each pull request.

Roadmap :-)

synctext commented 9 years ago

@Pathemeous Found interesting code from Quinten: https://github.com/Tribler/tribler/compare/devel...qstokkink:devel

qstokkink commented 9 years ago

If you want to re-use our code you should extract the start_profiling and stop_profiling functions from our code and call them respectively before and after whatever you want to profile (and make sure it's in the filter). The tunnel_piecharts.R script can then parse profiling files in this format. We integrated this into the Gumby pipeline with a script added to the .conf file.

Pathemeous commented 9 years ago

Thanks @qstokkink, I will have a look.

whirm commented 9 years ago

We already have memory profiling and manhole (telnet into the process to inspect it in real time). It would be cool to have the profiling integrated into gumby's instrumentation.py so all the experiments can use it straight away.

synctext commented 8 years ago

A simple 1 week starting experiment without Tribler, Tor-stack, and no Gumby.

Goal is to first detect if there are Libtorrent bottlenecks. Experimental setup consists of just Libtorrent on Ubuntu: Libtorrent seeding 50 GByte to 250 GByte while testing the local download performance. With 1000 swarms or so, it is expected that Libtorrent grinds to a halt in normal settings.

Outcome: Spend 2 weeks to create graph to show performance development, as you're seeding more GBytes and swarms.

Experiment:

seed 10 GByte, 50 swarms
download for 1 minute
write down the download speed (MBytes of progress)
Repeat: double amount of GBytes & swarms
create scalability graph: seeding size versus download speed
tweak until above works :-)
document after 2 weeks effort
Move to Gumby+Tribler_tor stack..
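The doubling steps above can be sketched as a small schedule generator. This only produces the (GByte, swarms) pairs to test; the seeding and the 1-minute download measurement are not shown. The function name and the 1000 GByte stopping point (roughly the 1 TByte target) are assumptions.

```python
from typing import Iterator, Tuple


def seeding_schedule(
    start_gbytes: int = 10,
    start_swarms: int = 50,
    max_gbytes: int = 1000,  # assumed stop: roughly the 1 TByte goal
) -> Iterator[Tuple[int, int]]:
    """Yield (gbytes, swarms) pairs, doubling both at every step."""
    gbytes, swarms = start_gbytes, start_swarms
    while gbytes <= max_gbytes:
        yield gbytes, swarms
        gbytes *= 2
        swarms *= 2


if __name__ == "__main__":
    for gbytes, swarms in seeding_schedule():
        print(f"seed {gbytes} GByte in {swarms} swarms, download for 1 minute")
```

Each pair would give one point on the scalability graph (seeding size versus download speed).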

Pathemeous commented 8 years ago

https://github.com/Pathemeous/tribler

synctext commented 8 years ago

Progress: first script to seed and control Libtorrent from Python. Step closer to a PullRequest tester in Jenkins with 1 TByte seeding test. Ardhi has 3000+ Linux .iso torrents, seems sufficient for a future test.

Measure, in an easy-to-build setting, the cost of 1 TByte seeding (or MaxHardDiskCapacity). Start Libtorrent for 1 hour with various seeding size settings and measure the total consumed bandwidth. Goal is to identify the overhead. For instance, set the seeding upload bandwidth to just 10 KByte or so. This needs to be subtracted from the total to obtain the DHT, PEX, and other control protocol overhead traffic.
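The subtraction could look like the sketch below. It assumes the "10 KByte" cap means 10 KByte per second and that total and payload byte counters are available from the measurement; both function names are made up for illustration.

```python
def max_payload_bytes(rate_limit_bytes_per_s: int, duration_s: int) -> int:
    """Upper bound on payload uploaded under a fixed upload rate cap."""
    return rate_limit_bytes_per_s * duration_s


def control_overhead_bytes(total_bytes: int, payload_bytes: int) -> int:
    """Traffic left after removing payload: DHT, PEX and other control traffic."""
    if payload_bytes > total_bytes:
        raise ValueError("payload cannot exceed total traffic")
    return total_bytes - payload_bytes


if __name__ == "__main__":
    # Assumed example: 10 KByte/s cap for a 1 hour run.
    print(max_payload_bytes(10 * 1024, 3600))
```

With the cap fully used for the whole hour, the payload bound is 36,864,000 bytes; anything measured above the actual payload is overhead.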

Docs: "The limits of the number of downloading and seeding torrents are controlled via active_downloads, active_seeds and active_limit in session_settings." Change the default of 5 active seeds to 10000 :-)
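A hedged sketch of raising those limits from Python. The setting names come from the docs quote above; passing a dict to `apply_settings` is how the libtorrent 1.1+ Python bindings are assumed to work here, and raising `active_limit` together with `active_seeds` is my own choice so the overall limit does not undercut the seed limit. The helper names are hypothetical.

```python
def seeding_settings(max_seeds: int = 10000) -> dict:
    """Build settings that allow `max_seeds` torrents to seed at once."""
    return {
        "active_seeds": max_seeds,
        # Keep the overall limit at least as high as the seed limit.
        "active_limit": max_seeds,
    }


def apply_to_session(session, settings: dict) -> None:
    """Apply the settings to an existing libtorrent session object.

    Assumes `session` exposes apply_settings(dict), as the libtorrent
    1.1+ Python bindings are believed to do.
    """
    session.apply_settings(settings)
```

Usage would be something like `apply_to_session(lt.session(), seeding_settings())` after `import libtorrent as lt`. Note that these limits only affect auto-managed torrents.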

Pathemeous commented 8 years ago

Graph plot of startup time (time since launch until first seeding occurs)
Graph (bar) plot of total amount of bandwidth used while seeding for 1 hour. Interesting result is the following:
Graph plot of the effective amount of bandwidth used (total minus overhead of torrent management)

ardhipoetra commented 8 years ago

Some of the crawled .torrent files can be found in my Dropbox. Currently it has 3658 .torrents.

I'm crawling mininova now, but I got blocked so maybe it will take some time to get more torrents on this site. AFAIK mininova now hosts legal torrents only.

devos50 commented 8 years ago

@ardhipoetra great, thanks for sharing!

You might want to rate-limit your requests and maybe use HTTP proxies for the crawling process?

ardhipoetra commented 8 years ago

In the end, I rate-limited my requests and it works.
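For reference, rate-limiting a crawler can be as simple as the sketch below. This is not the code from the crawler repository; the class name and the injectable `clock` and `sleep` arguments are assumptions that make it easy to test without waiting.

```python
import time
from typing import Callable, Optional


class RateLimiter:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least `min_interval_s` has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `limiter.wait()` before every HTTP request keeps the crawl below one request per `min_interval_s` seconds.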

As @devos50 requested, here is the link to the collection, zipped. You can put that in the bbq.

As for the crawler, I put the repository at https://github.com/ardhipoetra/legal-torrent-crawler

egbertbouman commented 5 years ago

Currently, Tribler will create an introduction point for every torrent it is seeding. This could potentially overload the exit nodes, especially now that exit nodes are also running the PexCommunity. To minimize this problem I was thinking about somehow limiting the number of introduction points that we create. We could do this from the TriblerTunnelCommunity or maybe using libtorrent's auto-management feature (which allows us to limit the number of active seeds). Using auto-management seems to make the most sense. @qstokkink @devos50 What do you think?

qstokkink commented 5 years ago

Sure. Why not?

synctext commented 3 years ago

@drew2a Just a small reminder. Please set up a seedbox for a Tribler channel with lots of Creative Commons music. (e.g. different than superapp; overlap). Simple static dump. For demo purposes only. Show that we can seed lots of stuff: https://github.com/mdeff/fma

EDIT: then please use that to set up a demo channel with markdown and real content #3615

drew2a commented 3 years ago

For further development. An idea.

As I see, there are two types of Tribler's users:

Channel creators (who want to seed content by creating a channel)
Normal users (who want to download)

So, what if we developed a tool that makes it easier to create and seed a channel?

Like:

$./create_and_seed.sh <folder>

Where <folder> is a folder with the following structure:

my channel
├ sub_directory
| ├ file1
| ├ file2
| └ README.md
├ sub_directory2
| ├ file3
| └ file4
└ README.md

The behavior:

Create a channel, using my channel as a channel name
Create a markdown preview for each folder that contains a *.md file.
Start seeding the content
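A rough sketch of the folder scan behind that behavior, just to make the idea concrete. It only builds a plan (channel name, per-folder preview file, files to seed); creating the channel and the torrents is not shown. `plan_channel` and the plan layout are invented for illustration.

```python
from pathlib import Path


def plan_channel(root) -> dict:
    """Describe what a create-and-seed tool would do for `root`."""
    root = Path(root)
    folders = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    plan = {"channel_name": root.name, "folders": []}
    for folder in folders:
        files = sorted(p for p in folder.iterdir() if p.is_file())
        markdown = [p.name for p in files if p.suffix.lower() == ".md"]
        content = [p.name for p in files if p.suffix.lower() != ".md"]
        plan["folders"].append(
            {
                "path": folder.relative_to(root).as_posix(),
                # The first *.md file (if any) becomes the folder preview.
                "preview": markdown[0] if markdown else None,
                "files": content,
            }
        )
    return plan
```

For the example tree above this would yield the channel name "my channel", a README.md preview for the root and for sub_directory, and no preview for sub_directory2.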

@ichorid what do you think?

ichorid commented 3 years ago

> @ichorid what do you think?

What are the fileX things? Torrents? Or actual files that should become individual torrents?

drew2a commented 3 years ago

> What are the fileX things? Torrents? Or actual files that should become individual torrents?

Actual files (discussed offline).

drew2a commented 3 years ago

I did an experiment:

Generated 1 GB of data divided into 1024 torrents (generate_test_data.py)
Seeded them (seeder.py)
Downloaded 3 different torrents (picked randomly) from another PC. All downloads completed within 5 to 30 seconds.

No trackers were used. Libtorrent version: 1.2.10
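For anyone who wants to repeat this, the data generation step could look roughly like the sketch below. It is not the actual generate_test_data.py; the function signature and file naming are assumptions, and turning each file into a torrent is left out.

```python
import os
from pathlib import Path
from typing import List


def generate_test_data(
    target_dir,
    file_count: int = 1024,
    file_size: int = 1024 * 1024,  # 1 MiB each, so 1024 files give 1 GiB
) -> List[Path]:
    """Write `file_count` files of random bytes, one per future torrent."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(file_count):
        path = target / f"file_{index:04d}.bin"
        # Random content keeps every file (and thus every torrent) distinct.
        path.write_bytes(os.urandom(file_size))
        paths.append(path)
    return paths
```

Random bytes are used so that no two files share content, which keeps the resulting swarms independent of each other.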

drew2a commented 3 years ago

FMA test data were seeded for one month (1 channel, 156 torrents, 23 GB total). The music data are still available inside Tribler.

Tribler / tribler

1 TByte of seeding #21