Closed jocelynthode closed 8 years ago
There are many writes on ~/.local/share/gajim/omemo_thehost.db for each message sent. In particular I think the fdatasync
syscall is called 2*(number of omemo keys) on this file, for every message sent.
I also don't notice any lag when ~/.local/share/gajim/ is on a tmpfs
.
hm, if you are sure its because of I/O, im at a loss what can be done about it.
you can try to install python-cryptography, which gives you native encryption (fast), so you could rule out that.
I have python-cryptography installed.
Edit: I'll try testing it on a system with ext4, to see if btrfs might be part of the problem or not
so maybe its specific to btrfs?!
i wouldnt call that a global linux issue, i think its specific to your system. plugin is out for 6 months, and until now no one reported that behaviour.
all the plugin does is call standard sqlite operations.
maybe update sqlite? try installing the gajim on another harddisk
I tried gajim + omemo on multiple machines and OSes, and I found that the lag was noticeable both on ArchLinux and Ubuntu 16.04. However it felt significantly worse on ArchLinux + btrfs + LUKS than on Ubuntu + ext4). I also tried running gajim on a portable Win10 installation on a slow USB2 HDD, and there the lag was barely noticeable (~ less than 0.5sec). My testing methodology wasn't very consistent, so don't count too much on these results. I worry that this issue is related to our XMPP server in some way.
I'll try to profile gajim disk and CPU usage in the hope to find more informative results. I may also try to use OMEMO with many clients on a public XMPP server supporting similar XEPs.
JFYI, I use Gajim + OMEMO on Arch Linux on different machines and I never noticed any lag. The setups span from low end Atom CPU to i5, HDD or SSD and I use both ext4 and btrfs (the latter only on SSD though), with LUKS encryption.
your xmpp server has nothing to do with it, the process of sending a message does not depend on a working server.
does the problem really only exist if you write an encrypted message?
Yes the problem only exists when sending encrypted messages
I am experiencing the same problem,
When debug is enabled, it takes about 2 seconds until "gajim.plugin_system.omemo Finished encrypting message" is shown in the log.
if you know how to do this you could run gajim with cProfile, so then you could see which method needs so long
I dont know if i've done it correctly, but the output of this command
python -m cProfile -s time /usr/share/gajim/src/gajim.py
shows:
83 7.381 0.089 7.381 0.089 {method 'commit' of 'sqlite3.Connection' objects}
All other methods seem to be fine.
Edit: Delay is now ~500ms, better than half an hour ago.
169 0.197 0.001 0.197 0.001 {method 'commit' of 'sqlite3.Connection' objects}
you see i have double your calls, but need only a fraction of the time.
either something is wrong with your sql3 verison, or maybe with your filesystem/harddisk
i tested this in virtual machine ubuntu 16.10 with an SSD
My version of sqlite3 is sqlite3.sqlite_version='3.11.0'
. What version were you using?
My filesystem is ext4, I did a fresh install of Xubuntu just yesterday, HDD's SMART report does not show any abnormalities and a friend of mine had the same problems on another device with an SSD (same distribution though). However, in his case, the problem disappeared after a while. I'll try it on Arch tomorrow to see if this is a distro-specific issue.
Sorry I had no time, tested it today. No problems so far on Arch. Issue seems to be distro-specific or an upstream one.
@heythatsunfair Probably not distro-specific as I'm using Arch too. my sqlite version:
3.15.0 2016-10-14 10:20:30 707875582fcba352b4906a595ad89198d84711d8
@lovetox I get this:
22 5.960 0.271 5.960 0.271 {method 'commit' of 'sqlite3.Connection' objects}
It is at the top of the list ordered by internal time
please test this with cprofile
import sqlite3
import os
from os.path import expanduser
home = expanduser("~")
db_path = os.path.join(home, 'test.db')
print(db_path)
conn = sqlite3.connect(db_path, check_same_thread=False)
conn.text_factory = bytes
create_tables = '''
CREATE TABLE IF NOT EXISTS identities (
_id INTEGER PRIMARY KEY AUTOINCREMENT, recipient_id TEXT,
registration_id INTEGER, public_key BLOB, private_key BLOB,
next_prekey_id INTEGER, timestamp INTEGER, trust INTEGER,
shown INTEGER DEFAULT 0);
'''
create_db_sql = """
BEGIN TRANSACTION;
%s
PRAGMA user_version=0;
END TRANSACTION;
""" % (create_tables)
conn.executescript(create_db_sql)
def commit_row(string, int_):
q = """INSERT OR REPLACE INTO identities (recipient_id, registration_id)
VALUES (?, ?) """
c = conn.cursor()
c.execute(q, (string, int_))
conn.commit()
x = 0
while x < 100:
string = 'teststring' + str(x)
commit_row(string, x)
x += 1
i have the following result
with ubuntu 16.10 in a VM
ncalls tottime percall cumtime percall filename:lineno(function)
100 0.067 0.001 0.067 0.001 {method 'commit' of 'sqlite3.Connection' objects}
I ran python -m cProfile -s time test.py
and python2.7 -m cProfile -s time test.py
With python 3.5.2 and sqlite 3.15.0:
100 39.543 0.395 39.543 0.395 {method 'commit' of 'sqlite3.Connection' objects}
With python 2.7.12 and sqlite 3.15.0:
100 36.258 0.363 36.258 0.363 {method 'commit' of 'sqlite3.Connection' objects}
ok so im at a loss here, you see the python script has nothing to do with encryption, its just 100 commits of a string and an int into a sqlite db.
so there has to be something wrong, commiting 100 rows to a db can never take 39 seconds, i guess report a bug with your distribution?
Maybe it is due to some sqlite configuration resulting in differences between our system? I tried running the script with Write-ahead logging enabled to see if it changed anything for me and with that I'm down to:
100 8.372 0.084 8.372 0.084 {method 'commit' of 'sqlite3.Connection' objects}
This is significantly slower in my case although it still big.
http://www.sqlite.org/wal.html
Anyway I'll try investigating further
Update: I also set the PRAGMA synchronous = normal and I get the following results:
100 0.005 0.000 0.005 0.000 {method 'commit' of 'sqlite3.Connection' objects}
Edit: I wonder if some PRAGMA are disabled when omemo writes to the db, because normal messages cause no trouble.
i think you have to set that on every connection to a DB new, default should be FULL
what is your default? in ubuntu its FULL
you basically reduced the writes to your disk, which doesnt really answer the question why it writes so slowly
ok i found out, i just tried this with my non-ssd disk and it took over 10 seconds
By default it seems to be journal_mode=memory and synchronous=FULL.
And yes I am not on a SSD.
so i will add this to the plugin "PRAGMA synchronous=NORMAL;" "PRAGMA journal_mode=MEMORY;"
could you benchmark this without WAL?
With:
conn.cursor().execute("PRAGMA synchronous = NORMAL;")
conn.cursor().execute("PRAGMA journal_mode = MEMORY;")
I still get: 100 12.043 0.120 12.043 0.120 {method 'commit' of 'sqlite3.Connection' objects}
With WAL it gets really fast.
there are other considerations to make then speed. WAL is fast because it doesnt write to the database so often. that means also we could lose data in a crash or corrupt the database.
i will add
"PRAGMA synchronous=NORMAL;"
"PRAGMA journal_mode=MEMORY;"
that more than halfes the time for writing.
btw "PRAGMA journal_mode=WAL;" is a persistent setting, so you only have to execute it once on your DB and it will from then on work in WAL mode. that with the added synchronous=NORMAL i will add should get you what you want :)
also initial 40 seconds is huge, is your harddisk old? or particular slow? i run this on my 3TB WD harddisk, and with default settings it took 10 seconds
Thanks! It might be nice to have a warning somewhere in case people also experiment the slowness.
Yes my harddisk is 3+ years old.
https://github.com/omemo/gajim-omemo/commit/a083f40aa391b7ba9566a71f721ec181ba145ee7
thanks for working this out !
I upgraded right now and everything's working fine!
General information
Steps to reproduce
Expected result
Sending a message should be instantaneous (as it is on Gajim windows). but strangely everytime I send a message, gajim freezes for 2seconds before delivering it which makes it unusable.
It might be related to Disk I/O. when sending an unencrypted message, iotop indicates the I/O is really quick and is at ~0.06% (writing to the db). When sending an encrypted message, the io to the OMEMO db is at 45%+.
Debug output
OMEMO didn't output anything in the logs. Here is a part of the exception that aren't related to OMEMO: