devos50 opened 3 years ago
An insightful metric would be the time between requesting channel content and receiving its REST response. I notice high variance in response times: sometimes it takes more than 10 seconds to load additional results in a channel, whereas other times it is nearly instant.
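For example, a quick probe of that metric could look like the sketch below; the port and the channel-content route are placeholders, since the real path in the Tribler REST API may differ:

```python
import statistics
import time

import requests

# Hypothetical port and endpoint; the actual channel-content route in the
# Tribler REST API may differ.
URL = "http://localhost:52194/channels/mychannel/torrents"

def measure_response_times(n: int = 20) -> None:
    """Request channel content n times and report latency statistics."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        requests.get(URL, timeout=30)
        latencies.append(time.perf_counter() - start)
    print(f"min={min(latencies):.2f}s max={max(latencies):.2f}s "
          f"mean={statistics.mean(latencies):.2f}s "
          f"stdev={statistics.stdev(latencies):.2f}s")

if __name__ == "__main__":
    measure_response_times()
```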
Can we replicate a user with 250+ downloads inside the tester? One issue reports this problem:
The issue is getting that many swarms! They should be in different states: completed, seeding, stopped, looking for metadata. Here is a scrape of 300+ music albums under a Creative Commons license, free to re-use. We also need simple Debian and Ubuntu swarms. Quick and dirty; this should take less than an hour (a sketch is given below).
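A rough sketch of how the tester could populate such a session, assuming the python libtorrent bindings and a `magnets.txt` file with one magnet link per line (the 300+ Creative Commons albums plus the Debian/Ubuntu swarms). This drives libtorrent directly rather than going through Tribler:

```python
import libtorrent as lt

session = lt.session()
handles = []

with open("magnets.txt") as f:
    magnets = [line.strip() for line in f if line.strip()]

for i, magnet in enumerate(magnets):
    params = lt.parse_magnet_uri(magnet)
    params.save_path = "/tmp/tester_downloads"  # placeholder path
    handle = session.add_torrent(params)
    # A magnet-only torrent starts out in the "looking for metadata" state;
    # pausing every fourth handle simulates "stopped" downloads. Completed
    # and seeding states follow naturally once downloads finish.
    if i % 4 == 0:
        handle.pause()
    handles.append(handle)
```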
How do we measure the performance of our bootstrap process? It can't be captured in a single number; it's about what the user sees. Ideally, we take screenshots of the current GUI at fixed intervals (500 ms, 1 s, 1.5 s, 2 s, etc.). That shows the progress and performance in easy-to-understand pictures, lets us understand the progressive loading behaviour, and gives us debug infrastructure for catching performance regressions in the future. A sketch is given below.
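A minimal sketch of the screenshot idea, assuming it runs inside the PyQt5-based Tribler GUI process; `start_screenshot_timer` and its arguments are hypothetical names:

```python
from PyQt5.QtCore import QTimer

def start_screenshot_timer(main_window, interval_ms: int = 500,
                           count: int = 20) -> QTimer:
    """Save a screenshot of `main_window` every `interval_ms` milliseconds."""
    state = {"shots": 0}
    timer = QTimer(main_window)

    def grab() -> None:
        pixmap = main_window.grab()  # render the whole window to a QPixmap
        pixmap.save(f"startup_{state['shots'] * interval_ms}ms.png")
        state["shots"] += 1
        if state["shots"] >= count:
            timer.stop()

    timer.timeout.connect(grab)
    timer.start(interval_ms)
    return timer
```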
EDIT: also aim to reproduce the running-out-of-sockets issue on Mac in a future sprint (hard task).
Today we have a lot of outstanding bugs in Sentry. How do we divide our limited time between Sentry and tooling such as the application tester? The Sentry reports are actual bugs experienced by actual Tribler users; the application tester is more for edge cases. We try to write a specific unit test when we see a bug. It's too hard and too costly in developer time to also put those into the application tester.
Could we have an application test in which two Tribler cores are running, where one does a remote search and displays the results? The speed of the search, without any Internet connectivity, is measured. Could we incrementally improve this for performance-regression detection and fancy pictures in our GUI test? First step: profile our search queries (see the sketch below).
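A sketch of that first step, using the standard-library profiler; `run_search` is a placeholder for whatever core function actually executes the (database) query:

```python
import cProfile
import pstats

def profile_search(run_search, query: str = "ubuntu") -> None:
    """Profile a single search query and print the hottest call sites."""
    profiler = cProfile.Profile()
    profiler.enable()
    run_search(query)
    profiler.disable()
    # Print the 20 most expensive calls by cumulative time
    pstats.Stats(profiler).sort_stats(pstats.SortKey.CUMULATIVE).print_stats(20)
```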
Yet another idea to distract us from fixing real reported bugs in Tribler :-) We have consistent reports of users telling us that "tribler is sloooow" and uses lots of resources: #6919 (https://tweakers.net/downloads/61752/tribler-7121.html). We could make a "Tribler performance penalty monitor": download a given magnet link with plain Libtorrent, then download it again with Tribler afterwards (on a different port). This second download should be just as fast or even faster; the ratio between the two is our performance penalty (a baseline-timing sketch is given below). Tribler could be slow due to Libtorrent misconfiguration, Python overhead, the event loop blocking on UDP sockets, or some reason we have missed; it is probably not about UDP packets spamming our ports to deliberately slow us down. Or a smarter breakdown:
We have done lots of end-to-end testing work: https://jenkins-ci.tribler.org/job/tunnel_experiments/job/speed_test_e2e/ and https://jenkins-ci.tribler.org/job/validation_experiments/job/validation_experiment_hidden_seeding/, listed in various 2016 issues: understanding the impact of latency on Libtorrent throughput, plus an end-to-end anonymous seeding and downloading performance test. We don't have a central testing facility; it's scattered and unmaintained on Jenkins. Lots of hard work has already been done here: https://jenkins-ci.tribler.org/job/validation_experiments/job/validation_experiment_libtorrent_compatibility/ (mostly SOCKS5 logic)
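Returning to the performance-penalty monitor above: a minimal sketch of the plain-Libtorrent baseline timing, assuming the python libtorrent bindings. The save path is a placeholder, and the Tribler-side timing would be done separately through its REST API:

```python
import time

import libtorrent as lt

def timed_libtorrent_download(magnet_link: str,
                              save_path: str = "/tmp/baseline") -> float:
    """Download a magnet link with plain libtorrent; return elapsed seconds."""
    session = lt.session()
    params = lt.parse_magnet_uri(magnet_link)
    params.save_path = save_path
    handle = session.add_torrent(params)
    start = time.perf_counter()
    while not handle.status().is_seeding:  # poll until the download completes
        time.sleep(1)
    return time.perf_counter() - start

# The penalty is then the ratio of the two wall-clock times:
#   penalty = tribler_seconds / timed_libtorrent_download(magnet)
```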
Yet another idea for the application tester: testing query performance with 12 hard-coded test queries, as sketched below.
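A sketch of what that could look like; the queries, port, endpoint, and `txt_filter` parameter are all placeholders for the actual search route of the Tribler REST API:

```python
import time

import requests

# Twelve hypothetical test queries; the real list would mix common and
# pathological cases.
TEST_QUERIES = [
    "ubuntu", "debian", "creative commons", "music", "linux iso",
    "documentary", "podcast", "open source", "audiobook", "lecture",
    "a", "zzzzzzzz",  # single-character and likely-no-result edge cases
]

def time_queries(base_url: str = "http://localhost:52194/search") -> None:
    for query in TEST_QUERIES:
        start = time.perf_counter()
        requests.get(base_url, params={"txt_filter": query}, timeout=30)
        print(f"{query!r}: {time.perf_counter() - start:.2f}s")
```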
Yet another improvement to the performance of Tribler: the "one-second logger". It detects blocking or infinitely running tasks in the core: https://github.com/kozlovsky/Tribler/tree/slow_coro_detection This is all part of the tooling required to identify and understand performance issues. Also related: getting a good traceback.
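A minimal sketch of the underlying idea (not the code from that branch): a daemon thread periodically pings the asyncio event loop, and if the loop fails to respond within one second, it dumps the stack of every thread, which also gives us the good traceback:

```python
import asyncio
import sys
import threading
import time
import traceback

def start_blocking_detector(loop: asyncio.AbstractEventLoop,
                            threshold: float = 1.0) -> None:
    """Start a watchdog thread; call this after the loop is running."""
    def probe() -> None:
        while loop.is_running():
            responded = threading.Event()
            loop.call_soon_threadsafe(responded.set)
            if not responded.wait(threshold):
                # The loop did not respond in time: it is blocked.
                print(f"Event loop blocked for >{threshold}s; thread stacks:")
                for thread_id, frame in sys._current_frames().items():
                    print(f"--- thread {thread_id} ---")
                    traceback.print_stack(frame)
                responded.wait()  # block until the loop recovers
            time.sleep(threshold)

    threading.Thread(target=probe, daemon=True, name="loop-watchdog").start()
```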
General problem: Tribler won't start. All sorts of different faults during startup will cause this error, so we need more detailed logging. Somewhat unrelated: after restarting the GUI, the old core is still stuck in the background. However, I'm quite hesitant to redo all this work and make the GUI multithreaded Qt again. After we have our first million users, please. Then we focus on perfection.
Unrelated discussion: why do we have two databases and the potential deadlocks that come with them? Investigate using a single monolithic database and merging in the new KnowledgeDB.
I renamed this issue to better capture the remaining work.
Our application tester can test a deployed version of the Tribler software by running random commands and monitoring stability. After merging #6206, this application tester has been left in a broken state.
The first step is to fix the application tester such that it runs again on our infrastructure. Then, it would be great to have a GitHub command (e.g., "app test 1 hour") to run it from a GitHub PR. This command will download the PR code, create a Tribler installation file, install Tribler on the machines, and run it for a specified duration.
- Jenkins jobs to install Tribler
- Jenkins jobs to run the application tester