Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0

Fails building circuits with large number of anon seeds [still issue with 6.6.0-exp1] #1683

Closed colin1497 closed 6 years ago

colin1497 commented 9 years ago

Seeding >60 torrents anon, tribler sometimes fails to build circuits and seed. Lots of info in #1605, see this comment specifically:

https://github.com/Tribler/tribler/issues/1605#issuecomment-142086993

Splitting this issue out for tracking purposes, may be related to #1682

whirm commented 8 years ago

@synctext I guess this should be assigned to whichever student is in charge of improving the tunnel community?

synctext commented 8 years ago

OK, yes. Another nice student job.

So the tunnel community has scaling problems. Good to hear from our users about this. With a nice performance graph, it's even performance analysis & re-factoring.

synctext commented 8 years ago

@Pathemeous #21 Seems the 1TByte seeding goal is difficult. Seeding anonymously already gives errors at >60 torrents.

Pathemeous commented 8 years ago

Yes, this should be the first target to overcome (as expected). It seems that this is related to the big GUI refactor? Without a clean API such high-performance goals are bound to result in errors like these.

whirm commented 8 years ago

It's not related to it, but in principle we were planning to have it fixed for the wx3 release.

Depending on how long the tunnel community refactor will take, it could be that the improved tunnel community is not ready until after the new gui is ready, so maybe it's not worth worrying about breaking the gui parts of it for now.

whirm commented 8 years ago

If the WX3 part of this milestone is ready before the rest, we can still release that so we get in Debian/Ubuntu ASAP and split the rest for a future milestone/release.

lfdversluis commented 8 years ago

Starting to target this one as it's the last issue assigned to me for 6.6. What do I have to work with? @colin1497 I see now that it's been open a while, what do you remember; do you have any stacktrace or log related to this issue?

My first guess is that downloading or seeding 60 anon torrents means creating >= 60 hidden tunnels, which is quite heavy on the CPU side. Having (blocking) Python calls for 60 tunnels probably results in e.g. Diffie-Hellman handshakes or intro-points timing out, which may be what is going on here. @whirm @synctext do you think this is a probable cause? Looking at the code there are many possible timeouts.

colin1497 commented 8 years ago

I'm afraid it's been a while. Looking back at #1605 I originally had 91 torrents, so it was well over 60. I haven't seen this in a long time. Besides build changes, I've also tripled my data rates with my ISP since I originally created the report. If your speculation is right, then you would hit it at some point. Maybe the issue is just that it shouldn't start all the tunnels in parallel? Maybe it should queue them up and start a max of 10 in parallel or something? Just thinking out loud.
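A minimal sketch of the "queue them up and start a max of 10 in parallel" idea, written with the standard-library asyncio rather than the Twisted stack Tribler used at the time. The function build_circuit is a hypothetical stand-in for a real circuit build, and the cap of 10 is only the number floated above, not a tuned value.

```python
import asyncio


async def build_circuit(circuit_id, delay=0.01):
    # Hypothetical placeholder: a real build would do network handshakes here.
    await asyncio.sleep(delay)
    return circuit_id


async def build_all(n_circuits, max_parallel=10):
    # The semaphore lets at most max_parallel builds run at the same time;
    # the rest wait in line instead of all starting together.
    semaphore = asyncio.Semaphore(max_parallel)
    in_flight = 0
    peak = 0

    async def guarded(circuit_id):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await build_circuit(circuit_id)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(i) for i in range(n_circuits)))
    return results, peak
```

Calling asyncio.run(build_all(91)) builds all 91 placeholder circuits while never having more than 10 in flight; the returned peak value makes that easy to check.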

whirm commented 8 years ago

Even if some requests time out it should still keep building circuits until it hits the circuit target.

lfdversluis commented 8 years ago

@whirm sure, but if due to a large amount of circuits being built not a single one can actually be constructed, rescheduling them all concurrently will mean that all the newly scheduled circuits will timeout as well. Assuming that this is the issue of course.
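The argument above can be illustrated with a toy model. Everything in it is an assumption made for illustration: a single thread serves all handshakes round-robin, each handshake needs a fixed number of steps, and every number is invented. It is not a measurement of Tribler.

```python
def finished_before_timeout(n_circuits, steps, step_cost_s, timeout_s):
    """Count handshakes that complete before the timeout in a toy model.

    Assumed model: one thread serves the handshakes round-robin, one step
    at a time, so handshake i only finishes during the last round.
    """
    finished = 0
    for i in range(n_circuits):
        # All earlier rounds must be served for every circuit first.
        finish_time = ((steps - 1) * n_circuits + i + 1) * step_cost_s
        if finish_time <= timeout_s:
            finished += 1
    return finished


# Invented numbers: 3 steps per handshake, 50 ms per step, 5 s timeout.
few = finished_before_timeout(10, steps=3, step_cost_s=0.05, timeout_s=5.0)
many = finished_before_timeout(100, steps=3, step_cost_s=0.05, timeout_s=5.0)
```

Under these assumptions all 10 handshakes finish in the small case and none finish in the 100-circuit case, and rescheduling all 100 at once would simply repeat the same outcome.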

whirm commented 8 years ago

@lfdversluis let's stop guessing and try to reproduce it instead. Once you've got a scenario where this happens.

If you don't have a shitty Internet connection, use wondershaper to fake it :)

If you want to limit the amount of cores Tribler can use (this shouldn't make a huge difference) you can use taskset.
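For reference, a rough in-process equivalent of what taskset does, as a sketch that assumes Python 3 on Linux (os.sched_setaffinity does not exist on Windows or macOS). The helper name limit_cores is hypothetical.

```python
import os


def limit_cores(cores):
    """Restrict the current process to the given set of CPU cores.

    Returns the resulting affinity set, or None on platforms that do not
    expose sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, cores)  # pid 0 means "this process"
    return os.sched_getaffinity(0)
```

Calling limit_cores({0}) early in a test script pins it to core 0, provided that core is in the set the process is allowed to use.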

colin1497 commented 8 years ago

FYI, just updated to 6.6, 7cd6ed7402d22772e4a09c4520bec1b8553e1fc8 and it fails to build circuits. 103 seeded torrents.

Just looking at the Windows resource monitor it doesn't appear to be CPU bound. Resource monitor shows lots of disc activity on the mechanical drive where torrents are stored (60-100MB/s). Network activity rate is relatively low, well under 1Mbps.

On exit, I get a log file each time. I have diffed a couple of the log files and they are basically the same:

Tribler.exe.log.txt

Edit: After deleting my old tribler.conf file, it successfully built circuits and is now checking every one of the 103 torrents.

Edit2: Comparing the tribler.conf files, aside from the old one having some old options like t4t*, the big difference appears to be the "user_download_choice =" option with all the torrent hashes with "restartseed". I'm going to let it finish all these checks, shut down, and see what happens.

synctext commented 8 years ago

@colin1497 We still did not have a look at this, sorry. The credit rewards for seeding + credit mining have been our prime focus since Feb. Once that is done, the anon tunnels will get full attention.

colin1497 commented 8 years ago

No problem, just trying to give as much info as possible. After checks were completed I restarted and again no joy with almost an identical log file.

lfdversluis commented 8 years ago

@colin1497 Thank you, that is very valuable info. It seems that the IO is too heavy and is probably completely blocking the twisted thread, most likely resulting in circuits timing out due to handshake failures and whatnot. I am currently in the process of making the IO non-blocking by pushing it out of the twisted thread in https://github.com/Tribler/dispersy/pull/481 but this migration is still underway. After dispersy, Tribler is next, including the tunnels.
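A small sketch of why blocking IO on the event-loop thread hurts, and what moving it to a worker thread changes. It uses the standard-library asyncio as a stand-in; in Twisted the analogous call would be deferToThread. The functions blocking_disk_check, heartbeat and measure_lateness are hypothetical names, and the sleep stands in for real disk work.

```python
import asyncio
import time


def blocking_disk_check(seconds):
    # Hypothetical stand-in for checking a torrent on a slow mechanical disk.
    time.sleep(seconds)
    return "checked"


async def heartbeat(ticks, interval):
    # Measures how late each timer fires; a large value means the loop
    # thread was blocked and timeouts or handshakes would be delayed too.
    loop = asyncio.get_running_loop()
    worst = 0.0
    for _ in range(ticks):
        start = loop.time()
        await asyncio.sleep(interval)
        worst = max(worst, loop.time() - start - interval)
    return worst


async def measure_lateness(offload):
    loop = asyncio.get_running_loop()
    hb = asyncio.create_task(heartbeat(ticks=5, interval=0.02))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    if offload:
        # Run the blocking call in the default thread pool.
        await loop.run_in_executor(None, blocking_disk_check, 0.2)
    else:
        # Run the blocking call directly on the loop thread.
        blocking_disk_check(0.2)
    return await hb
```

Comparing asyncio.run(measure_lateness(False)) with asyncio.run(measure_lateness(True)) shows the worst timer lateness dropping once the blocking call leaves the loop thread.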

lfdversluis commented 8 years ago

Hmm, looking at the log file I see ImportError: No module named csv, a module which should be shipped with Tribler.

  File "twisted\internet\base.pyo", line 825, in runUntilCurrent
  File "Tribler\Core\APIImplementation\LaunchManyCore.pyo", line 486, in session_getstate_usercallback_target
  File "Tribler\Main\tribler_main.pyo", line 498, in sesscb_states_callback
  File "Tribler\Main\Dialogs\systray.pyo", line 40, in updateTooltip
exceptions.AttributeError: 'ABCTaskBarIcon' object has no attribute 'icon'

is wx related; we are moving to Qt soon, so that should be fixed then.

  File "twisted\internet\defer.pyo", line 150, in maybeDeferred
  File "Tribler\Core\Modules\versioncheck_manager.pyo", line 54, in check_new_version
  File "twisted\web\client.pyo", line 1594, in request
  File "twisted\web\client.pyo", line 1578, in _getEndpoint
  File "twisted\web\client.pyo", line 1454, in endpointForURI
  File "twisted\web\client.pyo", line 818, in raiseNotImplemented
exceptions.NotImplementedError: SSL support unavailable

means our version manager is broken? @devos50 what do you make of this?
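A quick way to check whether the modules named in these errors can be found by the interpreter, as a sketch. It assumes a regular Python 3 environment rather than the frozen Windows build the log came from, and it assumes the SSL error stems from a missing pyOpenSSL (import name OpenSSL), which is the usual cause but is not confirmed by the log itself. The helper name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# csv is part of the standard library; OpenSSL is provided by pyOpenSSL.
print(missing_modules(["csv", "OpenSSL"]))
```

An empty list means both modules are importable in that environment; any name printed is a missing dependency.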

colin1497 commented 8 years ago

I am relatively certain that I didn't get the log entries in the session where I deleted tribler.conf and it rechecked every file. I think that it's only happening when it never is able to build the circuits.

Edit: No - seems a clean install just starting tribler fdfd8db9ccccc1229bdf1be2b0908664f57613ad gives this log:

Tribler.exe.log.txt

whirm commented 8 years ago

@colin1497 you need to install python-openssl

whirm commented 8 years ago

@colin1497 if you are running from git, you should install all the dependencies listed on debian/control

colin1497 commented 8 years ago

Downloading Windows installer builds from Jenkins. I shouldn't have to separately install dependencies in that scenario, should I?

synctext commented 8 years ago

ah, the unchecked latest Windows builds. Fresh from Jenkins.Tribler.org then?

These are not often checked to see if they function OK. It would be good to check if this bleeding-edge code, freshly installed, can seed just one swarm correctly in Anon mode.

lfdversluis commented 8 years ago

@colin1497 The devel branch is almost exclusively used by developers that are adding additional dependencies (e.g. I am adding several at the moment). So often we add dependencies on our machines before we add them to the builders to check everything is working. The builders then ship these with the installers :)

As @synctext said, there are no regular checks on devel. Our next branch is far more stable, but we do not have any guarantees on this either. The only guarantee we do strive to deliver is that all dependencies are shipped with our installers (naturally). But if something is not working, do let us know so we can add it to our todo list.

colin1497 commented 8 years ago

Apologies guys, I had been pulling next branch builds previously, and had an issue in 6.5.2 and went to jenkins to grab latest build to see if same issue still existed. Geez, I can see that I clearly ended up grabbing devel branch versions. /facepalm

whirm commented 8 years ago

@colin1497 heh, no worries, at least we know it needs to be fixed now :)

@lfdversluis maybe this is due to the MSVC rebuild you did? Maybe you forgot something on the python-openssl dll chain.

colin1497 commented 8 years ago

Quick update since there was concern about CPU performance:

I tried a few things. I watched CPU usage and it didn't seem that high, not even enough to force the CPU to peg to its max frequency. I set up an idle priority, 100% usage application and pegged it to one core to force the CPU frequency high. I set Tribler to "realtime" priority level. No change in behavior.

I can get 20 connected peers, but can't build circuits for my seeds.

Looking at network usage, it's really not that high -- never goes over 1Mbps. I have Gb infrastructure and 50Mbps connection to the internet.

Obviously that's all macro level.

synctext commented 8 years ago

@colin1497 You discovered a problem in the tunnel community. The team made a good performance measurement test. Even with light load the tunnel may take 3 minutes to build.

btw Gb infrastructure, nice!

colin1497 commented 8 years ago

Good to hear I found a legit issue.

WRT infrastructure, we completely renovated a house last year and it's relatively ridiculous what all I did....

colin1497 commented 6 years ago

Just wanted to say that this remains a problem in the 7.0.2 release. I hadn't had an issue with it because:

1) I hadn't been in Tribler that much, and 2) at some point I lost my database and started clean with a lot fewer torrents.

But now I'm up to the point where at startup basically everything just spins its wheels saying it's building circuits but none ever get going.

qstokkink commented 6 years ago

I'm assuming this to be fixed, but I'll add it to the 7.2 milestone for verification.

devos50 commented 6 years ago

I'm pretty sure this issue has been fixed. Closing the issue. Please let me know if there are any other problems related to circuit building.