element-hq / element-desktop

A glossy Matrix collaboration client for desktop.
https://element.io
GNU Affero General Public License v3.0

Seshat fails to index older messages in some rooms. #915

Open ara4n opened 4 years ago

ara4n commented 4 years ago

Both @AmandineLP and @neilisfragile have ended up with partial Seshat indexes. @poljar is on the case; I don't think we have a bug for it, so I'm opening this one for tracking purposes.

jryans commented 4 years ago

According to @poljar, this is believed to be fixed in the most recent release of Seshat, so there's nothing further to do here. If it does come back, we can take another look.

michaelsmoody commented 4 years ago

Good afternoon,

I'm testing the nightly from today (April 25 2020), and the index of rooms seems to have stalled (confirmed by trying to search for words that are spread throughout the room history).

Riot is securely caching encrypted messages locally for them to appear in search results:
Not currently indexing messages for any room.
Space used: 2 MB
Indexed messages: 2,269
Indexed rooms: 8 out of 8

Looks like it's similar to vector-im/element-web#13259, so I'm not sure which one to put this under, nor do I have any way to determine where it's stuck that I know of.

I have disabled and re-enabled the search, and it creeps up little by little on the message count each time I do, but in the end, I'm not sure if all messages are indexed. Last time I disabled and re-enabled, it was up to ~6k.

t3chguy commented 4 years ago

Do you have a gap of messages for which you don't have keys by any chance?

michaelsmoody commented 4 years ago

Not to my knowledge. I suppose it's possible there's one or two that have happened from other bugs, the occasional unable to decrypt, but no large gaps. I have my keys going back to day 1. Is there a log anywhere that I can reference?

poljar commented 4 years ago

I don't think this is similar to https://github.com/vector-im/riot-web/issues/13259. There it got stuck in a loop and never finished; this one claims 8 out of 8 is done, so it got to the start of your room history for all 8 rooms.

I'm not saying that messages aren't missing but it's for a different reason.

Since the discrepancy is huge, it's unlikely that you're missing messages because indexing gave up too early, which was the bug this issue was originally closed for.

Could you perhaps disable/enable and send out a rageshake once it starts and once it's done?

You can also watch the developer console, accessible with Ctrl + Shift + I; it will tell you if it's skipping undecryptable messages, for example.

michaelsmoody commented 4 years ago

Out of several thousand (nearly 7k) messages, these are the results. In fairness, I didn't know what was potentially sensitive information here, so I obfuscated all of the values. I do of course have the original as well, assuming that you can guide me about what is sensitive and needs removing.

I will note that disabling/re-enabling does seem to finish. What I suppose I need to do at this point, is start completely fresh (remove the client, delete all local data, add it again, and see if it stops at the ~2k range again, and get THOSE details).

RiotNightlyRageShake-obfuscated.txt

michaelsmoody commented 4 years ago

So, original install was here:

Not currently indexing messages for any room.
Space used: 2 MB
Indexed messages: 2,269
Indexed rooms: 8 out of 8

Disabling and re-enabling multiple times got me here:

Not currently indexing messages for any room.
Space used: 5 MB
Indexed messages: 7,001
Indexed rooms: 8 out of 8

poljar commented 4 years ago

How do you restore your encryption keys when you log in again after you completely wipe your data?

This seems like your client doesn't have all the encryption keys at the time of the first run, but it requests them from other devices once it gets to messages that it can't decrypt.

The messages won't decrypt in time to be added to the index but once you rerun the indexing step the keys will be there and decryption will succeed this time.
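A toy model of that sequence, as a minimal sketch rather than Seshat's or Element's actual code. Every name here (`Event`, `Client`, `crawl`, `receive_requested_keys`) is hypothetical, and key delivery is assumed to happen only after the crawl has already moved on:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Event:
    event_id: str
    session_id: str  # hypothetical id of the key needed to decrypt this event
    body: str


@dataclass
class Client:
    keys: Set[str] = field(default_factory=set)          # sessions decryptable right now
    requested: Set[str] = field(default_factory=set)     # key requests still in flight
    index: Dict[str, str] = field(default_factory=dict)  # event_id -> indexed body

    def crawl(self, rooms: Dict[str, List[Event]]) -> int:
        """Walk every room to the start of its history; return rooms finished."""
        done = 0
        for events in rooms.values():
            for ev in events:
                if ev.session_id in self.keys:
                    self.index[ev.event_id] = ev.body
                else:
                    # Cannot decrypt yet: request the key, skip the event, move on.
                    self.requested.add(ev.session_id)
            done += 1  # counted as done even if some events were skipped
        return done

    def receive_requested_keys(self) -> None:
        """Assumption: requested keys only arrive after the crawl has finished."""
        self.keys |= self.requested
        self.requested.clear()


rooms = {
    "!a": [Event("$1", "s1", "hello"), Event("$2", "s2", "older message")],
    "!b": [Event("$3", "s2", "another old one"), Event("$4", "s1", "hi")],
}

client = Client(keys={"s1"})
rooms_done_first = client.crawl(rooms)
indexed_first = len(client.index)

client.receive_requested_keys()
rooms_done_second = client.crawl(rooms)
indexed_second = len(client.index)

print(f"first run:  {indexed_first} messages, {rooms_done_first} out of {len(rooms)} rooms")
print(f"second run: {indexed_second} messages, {rooms_done_second} out of {len(rooms)} rooms")
```

In this model both runs report every room as done, but only the second run indexes the events whose keys arrived late. That matches the pattern of a message count that grows after each disable/enable cycle.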

None of the data in the rageshake is sensitive besides maybe the room names, though if you send a rageshake using the in-client provided method it will be sent to a private location. The room names are unencrypted and server owners can see them.

jryans commented 4 years ago

@michaelsmoody Are you able to provide more info on how you are restoring keys?

michaelsmoody commented 4 years ago

My apologies, I didn't see your original response (and I was looking for it, probably fell into an email black hole).

I've always used key backup via Riot. The very few instances you see here are almost certainly unique events, and possibly bugs, where I see a message, and then immediately after, a duplicate message (same timestamp) where it states unable to decrypt. I've, thus far, never had an issue with key backup/restore (using server-side backup), or via export/import keys. But since day one, I've used server-side key backup.

tim-seoss commented 3 years ago

I'm having similar problems with encrypted message search. On a relatively new (1 week old) Element Desktop installation, it initially got stuck at:

Element is securely caching encrypted messages locally for them to appear in search results:
Not currently indexing messages for any room.
Space used: 231 KB
Indexed messages: 85
Indexed rooms: 23 out of 23

Disabling and then re-enabling encrypted message search got this to change to:

Not currently indexing messages for any room.
Space used: 6 MB
Indexed messages: 4,200
Indexed rooms: 23 out of 23

but for at least one room, no results are being returned. I have a backup of the Element Desktop state from when it was stuck at 85 messages, if that's of any use, and I can debug things locally too. I have some familiarity with Rust, if that's relevant for the Seshat side of things; I haven't done any JavaScript for a few years, although I can probably muddle through.
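
For anyone comparing status panels like the two above, here is a minimal sketch that parses the pasted text and flags the pattern seen in this thread: every room reported as done, yet a later run indexes more messages. The field labels are copied from the panels as pasted; `parse_status` and `looks_incomplete` are hypothetical helper names, not part of Element or Seshat.

```python
import re
from typing import Dict


def parse_status(text: str) -> Dict[str, int]:
    """Extract the counters from a status panel pasted as plain text."""
    messages = re.search(r"Indexed messages:\s*([\d,]+)", text)
    rooms = re.search(r"Indexed rooms:\s*(\d+) out of (\d+)", text)
    if messages is None or rooms is None:
        raise ValueError("not a recognised status panel")
    return {
        "messages": int(messages.group(1).replace(",", "")),  # "4,200" -> 4200
        "rooms_done": int(rooms.group(1)),
        "rooms_total": int(rooms.group(2)),
    }


def looks_incomplete(before: Dict[str, int], after: Dict[str, int]) -> bool:
    """True if the earlier run claimed all rooms done but a rerun found more messages."""
    all_done = before["rooms_done"] == before["rooms_total"]
    return all_done and after["messages"] > before["messages"]


first_run = """Not currently indexing messages for any room.
Space used: 231 KB
Indexed messages: 85
Indexed rooms: 23 out of 23"""

after_toggle = """Not currently indexing messages for any room.
Space used: 6 MB
Indexed messages: 4,200
Indexed rooms: 23 out of 23"""

before = parse_status(first_run)
after = parse_status(after_toggle)
print(before, after, looks_incomplete(before, after))
```

A `True` result only says the first pass skipped something. It does not say whether the second pass caught everything, which is the open question here.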