^ fyi @status-im/status-core @rachelhamlin @cammellos @jakubgs
@oskarth which version of the app are you running? could you please add it to the description
@cammellos done
Side note: I also searched for 'bandwidth' in open issues and couldn't find a relevant one, which is a bit surprising given that it's a very common user complaint, anecdotally.
It's actually on my mind, but not captured, you're right @oskarth. To justify that slightly, we haven't had mental bandwidth to focus on much outside of SNT utility, multi-account, keycard and bugs this year—until now. So bandwidth will be a topic during our Oct 15 planning session (discuss post TK today).
> User feedback not making its way into concrete problem descriptions?
The issue of capturing user feedback is something that I very much hope to prioritize now that @andremedeiros is coming onboard to help with the dev process.
> b) bandaid to help with adoption and traction of Status the app, w/o as strong metadata/decentralization guarantees, a la Infura-for-chat (basically what we have already with mailserver)
What kind of sacrifice are we willing to make here? Let's discuss in janitors.
> It's actually on my mind, but not captured, you're right @oskarth. To justify that slightly, we haven't had mental bandwidth to focus on much outside of SNT utility, multi-account, keycard and bugs this year—until now. So bandwidth will be a topic during our Oct 15 planning session (discuss post TK today).
Yeah, that's fair. I think it's a larger systemic issue though, as users' feedback doesn't make its way into GHIs. Perhaps because it's too intimidating? Or they give feedback in other forums and then there's a lack of follow-up? Something re the community link is missing here, not quite sure what. cc @jonathanbarker @j-zerah FYI.
> What kind of sacrifice are we willing to make here? Let's discuss in janitors.
I'd like this to be an open discussion, but we can bring it up there as well.
Also worth noting: the version used for the bandwidth tests is still listening to the old shared topic, which will be disabled for v1. From the bandwidth tests https://docs.google.com/spreadsheets/d/13kffxZaPnvULoy5Qh5sZSI2551KusCLJWUcdhKyobkE/edit#gid=0, that version is ~6 times more bandwidth hungry than v1 (94 MB vs 15 MB), although the benchmarks are to be taken with a pinch of salt. Currently working on having them automated, so we can better tune and record them.
> Currently working on having them automated, so we can better tune and record them.
Commendable effort, @cammellos! What does this involve and how hard would it be to get this to run as part of the test suite?
A thought on the UI side: we already implemented 'fetch messages' to allow for more user-controlled bandwidth use. We could expand this to channels (cc @errorists), after exploring other options to save bandwidth while retaining all functionality.
Regarding feedback, I'll check in #statusphere / the ambassadors channel to see if they recognize the issue. As reliability and notifications, which are crucial for on-the-go use, have developed a bad rep, it could also be that the majority of our user base relies on wifi / at-home experimentation with Status. Just a theory.
@andremedeiros We are only going to test status-protocol-go, as automating the testing of status-react is much harder (so we won't be testing mailserver interactions until that code is ported).
The strategy I am following is to have two clients (or more, for now just two) interact with each other for a specified amount of time/messages. Both clients will be dockerized and run through docker-compose; at the end of the tests, metrics for each container can be collected with `docker stats`.
We can probably easily get it into the test suite (status-protocol-go); the only dependencies would be docker/compose and golang. It would take some more time to make it a red/green test (it's more of a benchmark). We also don't have isolated network conditions for now, so results depend on the overall traffic of the network, but we can take that into account when measuring.
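For illustration, the two-client setup described above could look roughly like this as a compose file (the image name, flags and duration are hypothetical placeholders, not the real harness):

```yaml
version: "3"
services:
  client-a:
    image: status-protocol-go-bench   # hypothetical benchmark image
    container_name: client-a
    command: ["--peer", "client-b", "--duration", "10m"]
  client-b:
    image: status-protocol-go-bench
    container_name: client-b
    command: ["--peer", "client-a", "--duration", "10m"]
```

At the end of a run, `docker stats --no-stream client-a client-b` prints per-container CPU, memory and NET I/O.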
That makes perfect sense, @cammellos. Thank you.
> Or they give feedback in other forums and then there's a lack of follow-up? Something re the community link is missing here, not quite sure what. cc @jonathanbarker @j-zerah FYI.
re: capturing user/community feedback - have we considered an "engagement survey" type mechanism for our main community users? Similar to the one for core contributors, but focused on feedback they have for Status products, features, etc.
We have had those in the past and it surely is time to bring them back! They were always a bit of a one-off, never a solid mechanism that better balances effort and output.
Regarding bandwidth, a quick poll in #statusphere brought no alarming response from 3 active community members/contributors, all estimating a monthly 1 GB going to Status. Not to say that it's not a problem :)
One plan: have a separate type of public chat that isn't as private (with respect to traffic analysis). You could just have it as an option and have some UI element that notes what kind of public channel it is.
This could also be the default, and then we can set up relays that allow people to communicate in these at lower bandwidth cost (similar to @jakubgs's bridge).
Here's a rough tool to check for bandwidth: https://github.com/status-im/status-protocol-go-bandwidth-test
> One plan: have a separate type of public chat that isn't as private (with respect to traffic analysis). You could just have it as an option and have some UI element that notes what kind of public channel it is.
I don't believe the issue is due to public chats (we had high usage even without joining any public chat in the previous bandwidth tests); it's mainly due to discovery topics, as you receive messages not sent to you. In a public chat the chance is lower (there's a chance that your bloom filter matches some other topic, but it's probably not huge). So unless we completely bypass Whisper I'm not sure we can optimize those much, but it's worth having a look.
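To put a rough number on that parenthetical, here is a back-of-the-envelope check assuming the standard whisperv6 parameters (512-bit bloom filter, 3 bits set per topic); the topic counts are arbitrary examples:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const bits, perTopic = 512.0, 3.0
	for _, topics := range []float64{1, 5, 20} {
		// expected fraction of bits set after inserting `topics` topics
		p := 1 - math.Pow(1-perTopic/bits, topics)
		// a random foreign topic matches iff all 3 of its bits are set
		fmt.Printf("%2.0f topics: false-positive chance ≈ %.5f%%\n",
			topics, 100*math.Pow(p, perTopic))
	}
}
```

Even at 20 subscribed topics this comes out around 0.1%, which supports the "probably not huge" intuition.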
Have we ever tried playing with dynamic tuning of the bloom filters based on user preference? It's basically a sliding scale of how much you poll for, based on how much information you want to give the server you're asking. If a user doesn't care about that, they can at least minimize the amount of "extra stuff" they're getting.
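A minimal sketch of what that sliding scale could look like, assuming `TopicToBloom`/`BloomFilterSize` from whisperv6; the decoy policy and the `maxDecoys` bound are made up for illustration, not an existing API:

```go
package main

import (
	"crypto/rand"
	"fmt"

	whisper "github.com/status-im/whisper/whisperv6"
)

// advertisedBloom merges the bloom bits of the real topics plus
// privacy*maxDecoys random decoy topics, so a peer cannot tell which
// subset of the filter we actually poll for.
func advertisedBloom(real []whisper.TopicType, privacy float64) []byte {
	const maxDecoys = 32 // hypothetical upper bound at privacy = 1.0
	topics := append([]whisper.TopicType{}, real...)
	for i := 0; i < int(privacy*maxDecoys); i++ {
		var t whisper.TopicType
		_, _ = rand.Read(t[:]) // random decoy topic
		topics = append(topics, t)
	}
	bloom := make([]byte, whisper.BloomFilterSize)
	for _, t := range topics {
		for i, b := range whisper.TopicToBloom(t) {
			bloom[i] |= b
		}
	}
	return bloom
}

func main() {
	chat := whisper.BytesToTopic([]byte("my-chat"))
	// privacy 0.0 = exact topics, 1.0 = maximally padded filter
	fmt.Printf("%x\n", advertisedBloom([]whisper.TopicType{chat}, 0.5))
}
```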
That's a fairly big change: it means we would be fundamentally changing how Whisper works. Say you don't even use a bloom filter but pass just a list of topics (provides no darkness, but the best bandwidth): you still have an issue with the shared topic (currently each user is assigned to a random bucket based on their pk, n = 5000).
We also have a personal topic that can be used instead of the partitioned one, which is the user's pk, but at that point any darkness is gone, so it makes little sense to use Whisper.
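A toy illustration of the bucketing just described; the `contact-discovery-%d` naming follows the Status discovery-topic scheme, but the exact derivation (and the final hash down to a 4-byte Whisper topic) is simplified here:

```go
package main

import (
	"fmt"
	"math/big"
)

const nPartitions = 5000

// partitionTopic picks the recipient's discovery bucket from their
// public key modulo the number of partitions.
func partitionTopic(pubKey []byte) string {
	bucket := new(big.Int).Mod(new(big.Int).SetBytes(pubKey), big.NewInt(nPartitions))
	return fmt.Sprintf("contact-discovery-%d", bucket)
}

func main() {
	pk := []byte{0x02, 0xaa, 0xbb, 0xcc, 0xdd} // hypothetical key bytes
	fmt.Println(partitionTopic(pk))
}
```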
I think we need to understand a bit better where the consumption is coming from. Is it coming from extra messages that you don't care about? Is it coming from the fact that you receive multiple copies of each message? Or is it just Whisper overhead?
Once we understand the dynamics better we can see what we can do and where it's best to optimize, imo.
Thoroughly agreed.
> I think we need to understand a bit better where the consumption is coming from. Is it coming from extra messages that you don't care about? Is it coming from the fact that you receive multiple copies of each message? Or is it just Whisper overhead?
We have quite granular Whisper metrics that can answer most of these questions: https://github.com/status-im/whisper/blob/master/whisperv6/metrics.go. For example, we have `envelopeAddedCounter` and `envelopeNewAddedCounter`, whose difference tells us how many duplicates we receive, or `envelopeErrNoBloomMatchCounter`, which tells us the number of messages not matching the bloom filter.
What we would need to do is expose them in the app, because as far as I know they are used exclusively by statusd running on our servers.
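As a sketch of how those questions could be answered from the existing counters, assuming the `whisper/...` metric names registered in whisperv6/metrics.go and that go-ethereum metrics collection is enabled:

```go
package main

import (
	"fmt"

	"github.com/ethereum/go-ethereum/metrics"
)

// count reads a counter from the default registry; unknown or
// disabled metrics simply report zero.
func count(name string) int64 {
	if c, ok := metrics.DefaultRegistry.Get(name).(metrics.Counter); ok {
		return c.Count()
	}
	return 0
}

func main() {
	added := count("whisper/envelopeAdded")       // every envelope accepted
	newAdded := count("whisper/envelopeNewAdded") // first-time envelopes only
	noBloom := count("whisper/envelopeErrNoBloomMatch")

	if added > 0 {
		fmt.Printf("duplicates: %d (%.1f%% of received envelopes)\n",
			added-newAdded, 100*float64(added-newAdded)/float64(added))
	}
	fmt.Printf("rejected for bloom mismatch: %d\n", noBloom)
}
```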
Many open source projects, like Firefox, collect stats and send them to centralized servers only if a user agrees to do so. Maybe we can have a similar strategy? It should be opt-in, of course.
We had a short meeting today about the bandwidth testing and I've noted down some things:

* Add more granular metrics for `status-protocol-go` and `whisper` to measure
* Find a reporting format, preferably fed to Prometheus
* Run tests for various volumes to check complexity
* Run tests periodically to measure improvements/regressions

I will start working on those probably next week, as I have to finish some other stuff.
> Find a reporting format, preferably fed to Prometheus
I'm not sure I would recommend pull-based tools for load testing, unless these load tests will be fairly long. Also, because Prometheus gets data periodically, it can miss some fluctuations which might be interesting for us. Maybe writing to InfluxDB? Having all data points can also be an advantage.
I did consider InfluxDB too; we can see what works better. I'd agree that a push rather than pull scheme would work better for benchmarks.
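A minimal sketch of the push approach with the InfluxDB 1.x Go client; the database, measurement, tag and field names here are made up for the example:

```go
package main

import (
	"log"
	"time"

	client "github.com/influxdata/influxdb1-client/v2"
)

func main() {
	c, err := client.NewHTTPClient(client.HTTPConfig{Addr: "http://localhost:8086"})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	bp, _ := client.NewBatchPoints(client.BatchPointsConfig{
		Database:  "bandwidth",
		Precision: "ns",
	})

	// one point per observed envelope, pushed as it happens rather
	// than scraped on an interval
	pt, _ := client.NewPoint(
		"envelopes",
		map[string]string{"node": "client-a"},     // tags
		map[string]interface{}{"size_bytes": 512}, // fields
		time.Now(),
	)
	bp.AddPoint(pt)

	if err := c.Write(bp); err != nil {
		log.Fatal(err)
	}
}
```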
Discuss post: https://discuss.status.im/t/fixing-whisper-for-great-profit/1419
Theoretical model numbers: https://htmlpreview.github.io/?https://github.com/vacp2p/research/blob/master/whisper_scalability/report.html
Waku mode draft: https://github.com/status-im/specs/pull/54
@jakubgs any luck with the above?
Also, it'd be great if we could figure out where other traffic might be coming from, i.e. things that aren't captured by the above model. For example, I remember some benchmark saying we spend 20% of traffic on Infura, which seems insane but makes sense given the lack of transaction indexing (?). This means it might become the bottleneck with Waku mode in place, which would hint at attacking the indexing problem, e.g. with something algorithmic like @yenda was working on, or indexing a la thegraph, as @bgits suggested.
@oskarth on a new account there would only be a handful of calls to Infura afaik. The heavy stuff is only when there are transactions to recover.
Here's an update on the current state of my work on this:
> Add more granular metrics for `status-protocol-go` and `whisper` to measure
>
> * `status-go` version from the App

Regarding the `status-go` version: currently I'm not sure how to fix the version issue. It should be fixed in `status-go`, but to figure out how to do that correctly I'll have to talk to Adam.
> Find a reporting format, preferably fed to Prometheus

I also looked at `pushgateway` for Prometheus as an alternative, but that is still dependent on the Prometheus pull rate/interval and would not represent the real-time creation of the metrics generated by the benchmark.
> * Run tests for various volumes to check complexity
> * Run tests periodically to measure improvements/regressions
After investigating the status-protocol-go-bandwidth-test package by Andrea, I don't think there's anything wrong with his simple approach of just spawning the processes with his `run.sh` script. Though it might be a bit nicer if we used something like Supervisord with the `numprocs` setting, or systemd with instantiated services, to orchestrate multiple processes in a more manageable way.
According to Adam, the best way to collect these metrics would be to subscribe to the `envelopeFeed`:

https://github.com/status-im/whisper/blob/39d4d0a14f/whisperv6/whisper.go#L178-L182

and listen for the `EventEnvelopeReceived` event:

https://github.com/status-im/whisper/blob/39d4d0a14f/whisperv6/events.go#L19

This would allow me to collect envelope metrics (size, numbers) in InfluxDB without having to modify the `whisper` repo itself.
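A sketch of that approach, assuming the `SubscribeEnvelopeEvents` API and `EventEnvelopeReceived` event type linked above; the actual InfluxDB write is left as a comment:

```go
package collector

import (
	"log"

	whisper "github.com/status-im/whisper/whisperv6"
)

// recordEnvelopes logs every received envelope until the subscription
// is closed; a real collector would push a data point instead.
func recordEnvelopes(w *whisper.Whisper) {
	events := make(chan whisper.EnvelopeEvent, 100)
	sub := w.SubscribeEnvelopeEvents(events)
	defer sub.Unsubscribe()

	for {
		select {
		case e := <-events:
			if e.Event == whisper.EventEnvelopeReceived {
				// write envelope size/topic to InfluxDB here
				log.Printf("envelope %s from peer %s", e.Hash.Hex(), e.Peer)
			}
		case <-sub.Err():
			return
		}
	}
}
```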
I've added a `Topic` attribute to `Envelope` in https://github.com/status-im/whisper/pull/38.
According to the Network Metrics section of the Docker docs:

> Network metrics are not exposed directly by control groups. There is a good explanation for that: network interfaces exist within the context of network namespaces. The kernel could probably accumulate metrics about packets and bytes sent and received by a group of processes, but those metrics wouldn't be very useful. You want per-interface metrics (because traffic happening on the local `lo` interface doesn't really count). But since processes in a single `cgroup` can belong to multiple network namespaces, those metrics would be harder to interpret: multiple network namespaces means multiple `lo` interfaces, potentially multiple `eth0` interfaces, etc.; so this is why there is no easy way to gather network metrics with control groups.
So as an alternative they propose creating iptables rules:

> IPtables (or rather, the netfilter framework for which iptables is just an interface) can do some serious accounting. For instance, you can setup a rule to account for the outbound HTTP traffic on a web server:
>
> ```
> $ iptables -I OUTPUT -p tcp --sport 80
> ```
>
> There is no `-j` or `-g` flag, so the rule just counts matched packets and goes to the following rule.
>
> Later, you can check the values of the counters, with:
>
> ```
> $ iptables -nxvL OUTPUT
> ```
This can be an issue, since access to iptables requires `root` privileges.
An alternative is "Interface-Level Counters":

> Since each container has a virtual Ethernet interface, you might want to check directly the TX and RX counters of this interface.
It just requires some juggling to get the data. Assuming that `$CID` is the ID of the container we want:

```
TASKS=/sys/fs/cgroup/devices/docker/$CID/tasks
PID=$(head -n 1 $TASKS)
mkdir -p /var/run/netns
ln -sf /proc/$PID/ns/net /var/run/netns/$CID
ip netns exec $CID netstat -i
```
This should get us output like this:

```
admin@mail-01.do-ams3.eth.test:~ % sudo ip netns exec $CID netstat -i
Kernel Interface table
Iface   MTU   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500  365190      0      0      0  305317      0      0      0 BMRU
lo    65536       0      0      0      0       0      0      0      0 LRU
```
Which gives us things like:

* `(RX|TX)-OK` - Packets received/sent correctly.
* `(RX|TX)-ERR` - Packets received/sent but with incorrect checksum.
* `(RX|TX)-DRP` - Packets dropped because of full buffer.
* `(RX|TX)-OVR` - Packets dropped due to exceeding TTL or other timing reason.

Now, packets are nice and all but we don't know their sizes, so that doesn't give us actual bandwidth.
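One possible way around that (my assumption, not something settled in the thread): inside the container's network namespace the kernel does expose byte counters under /sys/class/net/<iface>/statistics/, which `netstat -i` omits. A tiny reader, to be run via the same `ip netns exec $CID` juggling as above:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	// cumulative byte counters maintained by the kernel per interface
	for _, name := range []string{"rx_bytes", "tx_bytes"} {
		data, err := os.ReadFile("/sys/class/net/eth0/statistics/" + name)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("eth0 %s: %s\n", name, strings.TrimSpace(string(data)))
	}
}
```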
Problem
As a user with a limited data plan, I want the bandwidth usage to be substantially lower, so that I can use Status on 3G/4G with a limited data plan.
Details
On iOS, the current period under Cellular data shows the following for similar apps:
Note that all apps outside of Status have attachments and images in them.
To calibrate for usage, here are the corresponding numbers from Screen Time for the last 7 days:
Note that in Line and Signal I'm not in any public channels, but in Telegram I'm in several that are a lot more noisy than Status.
Compared to Telegram, Line and Signal, this means we currently consume 10-20x more bandwidth, without attachments. As a user, this is an unacceptable experience.
Implementation
As a somewhat representative user, but one with a limited data plan, I care more about this than about cover traffic/metadata protection.
Acceptance Criteria
Bandwidth usage reduced 10-20x so that it is within a factor of three of comparable apps, like Telegram, Line and Signal.
Notes
In light of the current financial situation, timeline, and growing the core app user base, it might be the case that we partition the problem in two:
a) Continue long-term 'fundamental' research in conjunction with other projects to develop a better alternative (Block.Science/Swarm/Nym/libp2p)
b) bandaid to help with adoption and traction of Status the app, w/o as strong metadata/decentralization guarantees, a la Infura-for-chat (basically what we have already with mailserver)
Side note: I also searched for 'bandwidth' in open issues and couldn't find a relevant one, which is a bit surprising given that it's a very common user complaint, anecdotally. User feedback not making its way into concrete problem descriptions? cc @rachelhamlin @hesterbruikman
Future Steps
Replace Whisper.