arlolra / ctypes-otr

js-ctypes wrapper for libotr
Mozilla Public License 2.0
29 stars 7 forks source link

CONIKS preferences GUI #82

Closed vqhuy closed 8 years ago

vqhuy commented 8 years ago

I used a flag to enable and disable CONIKS manually. However, the OTR preferences panel is modified a little bit. Below is the screenshot of the GUI with

vqhuy commented 8 years ago

@arlolra : Should I use sqlite db or just plain text files as libotr? Below are things that I think would be stored for the CONIKS protocol:

I would like to count @masomel in since she has experience of working with Pidgin plugin.

arlolra commented 8 years ago

Should I use sqlite db or just plain text files as libotr?

I think my preference is for text, but JSON over S-expressions. Was there a reason why you went with a db though?

vqhuy commented 8 years ago

Since these data change frequently, I think.

arlolra commented 8 years ago

Since these data change frequently, I think.

Good point. Let's think about that ...

vqhuy commented 8 years ago

Since these data change frequently, I think.

It looks like this is the only reason why my preference is for sqlite. Especially, the STR of each user could be changed more frequently depending on the server's epoch deadline. Moreover, the STR change's times also vary among accounts. So I think using plaintext storage here may not have advantages over sqlite in the performance.

arlolra commented 8 years ago

the STR of each user

Can you clarify what you mean by this? There's only one STR per epoch. And, the only thing you need to retain is the last epoch where you monitored your own binding, so you can do the catchup routine.

vqhuy commented 8 years ago

the STR of each user

Oops. I mean the authentication path. It is true that there's only one STR for each epoch, corresponding with one keyserver. But there is chance that the user uses different keyservers for different accounts, so we can still have more than one STR need to retain, right?

And, the only thing you need to retain is the last epoch where you monitored your own binding,

Also the last tree hash, isn't it?

arlolra commented 8 years ago

Oops. I mean the authentication path.

Why do you need to retain the auth path though? At each epoch, you need to monitor your own binding. And at each new conversation, you need to lookup your contact's binding in the current epoch. But nothing there needs to be retained, does it?

It is true that there's only one STR for each epoch, corresponding with one keyserver. But there is chance that the user uses different keyservers for different accounts, so we can still have more than one STR need to retain, right?

One STR per identity provider, right. My confusion was that I thought you meant contact, not account. Thanks for clarifying. In practice though, there's just the one CONIKS server we're running. But good to think ahead.

Also the last tree hash, isn't it?

Sorry, yes, I wasn't being precise. You need to do a comparison with, https://github.com/coniks-sys/coniks-go/blob/d328ad1cda8fed377875135ba905417bd1db22a9/merkletree/str.go#L24

From my understanding, we need to serialize to disk at each epoch (per distinct CONIKS server), and on demand per account policy changes. Is that right? And does that seem too often?

vqhuy commented 8 years ago

Why do you need to retain the auth path though?

No, we don't. Sorry for not clarifying. I was wrong in previous comments.

there's just the one CONIKS server we're running.

If so, I also prefer plaintext storage.

From my understanding, we need to serialize to disk at each epoch (per distinct CONIKS server), and on demand per account policy changes

I think you're right.

And does that seem too often?

Not sure but I think it depends on the server's epoch deadline?

arlolra commented 8 years ago

Not sure but I think it depends on the server's epoch deadline?

Yup, so the paper's performance evaluation puts the epochs occurring once per hour. If we use that as a rule of thumb here, what do you think?

vqhuy commented 8 years ago

If we use that as a rule of thumb here, what do you think?

Yeah, reading/writing text file once per hour seems not really frequent. So I will fix this pull to use text file with JSON format for storage and probably use some struct like ConiksState as @masomel did here (https://github.com/coniks-sys/pidgin-otr-fork/blob/coniks-integration/coniks.h#L29-L37).

masomel commented 8 years ago

I would like to count @masomel in since she has experience of working with Pidgin plugin.

Sorry for joining in so late. I could have sworn I responded to this a few days ago, but apparently I didn't hit submit and lost my response >:(

Seems like you two already agreed on this, but I wanted to weigh in anyway and say that it's true that there should only be a single STR and that the auth path does not need to be retained by the client once verified. What does need to be retained for each user, however, are the TBs so that the client can verify in the next epoch that they've been included in the tree (then the TB can be deleted).

I agree with @arlolra, my preference for text files is also JSON over s-expressions. But my suggestion is actually to use binary files to store some of the data on the client. In the case of the latest STR, we're already receiving the data in bytes, and storing this data even as UTF-8 chars in hex would take up twice as much space. The length of most fields is fixed (and known to the client) and the client needs this data in bytes anyway for later verifications so we would save some cycles on converting. The only data that we would probably need to store in plain text is the policy. The TB could similarly also be stored in binary.

The user policies could be stored along with the user keys. IIRC, the Pidgin plugin has a single S-expression file for a public keys (the user's and the buddies'). I think it might make sense to have such a file in JSON per user instead since we're storing more state than libotr per user.

vqhuy commented 8 years ago

I agree with @arlolra, my preference for text files is also JSON over s-expressions. But my suggestion is actually to use binary files to store some of the data on the client. In the case of the latest STR, we're already receiving the data in bytes, and storing this data even as UTF-8 chars in hex would take up twice as much space. The length of most fields is fixed (and known to the client) and the client needs this data in bytes anyway for later verifications so we would save some cycles on converting. The only data that we would probably need to store in plain text is the policy. The TB could similarly also be stored in binary.

Yes, thanks for this advices.

The user policies could be stored along with the user keys. IIRC, the Pidgin plugin has a single S-expression file for a public keys (the user's and the buddies'). I think it might make sense to have such a file in JSON per user instead since we're storing more state than libotr per user.

I imagined that there will have a file named coniks.policies to store users' policies in JSON, along with current otr.fingerprints file which is store buddies' fingerprints. Since otr.fingerprints is already being used by libotr, I think we could leave it as it is, and have a separate file for users' policies.

masomel commented 8 years ago

I imagined that there will have a file named coniks.policies to store users' policies in JSON, along with current otr.fingerprints file which is store buddies' fingerprints. Since otr.fingerprints is already being used by libotr, I think we could leave it as it is, and have a separate file for users' policies.

This seems fine to me. My only concern with this approach is that libotr stores the fingerprints as S-expressions, and I'd rather not start mixing encodings in the client. My other question is about the overhead in searching the appropriate entries in these files vs keeping separate files with all the data for each user. Is this something we should be worried about?

vqhuy commented 8 years ago

My only concern with this approach is that libotr stores the fingerprints as S-expressions, and I'd rather not start mixing encodings in the client.

Hmm, I think since our encodings completely belong to CONIKS module, and it doesn't have any thing related to S-expressions of libotr, so maybe it's still fine?

My other question is about the overhead in searching the appropriate entries in these files vs keeping separate files with all the data for each user. Is this something we should be worried about?

What do you mean "searching the appropriate entries in these files"? What I am going to do is loading all the data for each user at first, and then update them to files each time these data change, so the searching could be done in memory.

masomel commented 8 years ago

Hmm, I think since our encodings completely belong to CONIKS module, and it doesn't have any thing related to S-expressions of libotr, so maybe it's still fine?

I might be misunderstanding something, but why would the CONIKS module never have to deal with the S-expressions files? Doesn't the CONIKS client still have to read and write the otr.fingerprints file?

What do you mean "searching the appropriate entries in these files"? What I am going to do is loading all the data for each user at first, and then update them to files each time these data change, so the searching could be done in memory.

Ah I see. I've been reading through some libotr code, and it seems like they also load all fingerprints into memory through the OtrlUserState struct at the beginning. I remembered incorrectly that libotr would have to open the otr.fingerprints and scan through it the file to find the mapping from the username to the key fingerprint. Talking about performance again, I was thinking it could be more efficient if we didn't have to (1) keep all known fingerprints in memory all the time, and (2) we didn't need to search through memory to find a specific fingerprint, but just go open the appropriate file when needed. There's probably a tradeoff here again, too, between memory overhead and cycles spent on file IO.

vqhuy commented 8 years ago

Doesn't the CONIKS client still have to read and write the otr.fingerprints file?

It does. But I think current Tor Messenger can use CONIKS for fingerprint verification, and leave libotr take care all the rest.

if we didn't have to (1) keep all known fingerprints in memory all the time,

Unfortunately, it is what libotr is doing. So should we do the same?

masomel commented 8 years ago

leave libotr take care all the rest.

You're absolutely right. I keep on thinking that all of these changes would happen in the add-on. wasn't thinking about all of the changes we would have to make in libotr.

Unfortunately, it is what libotr is doing. So should we do the same?

If this is what libotr is doing we definitely shouldn't change it. But if we can make some improvements on the CONIKS side, I think we should.

vqhuy commented 8 years ago

I keep on thinking that all of these changes would happen in the add-on. wasn't thinking about all of the changes we would have to make in libotr.

We may misunderstand each other (again) :D. I mean all these changes still happen in the add-on, and I'm not going to do any change in libotr.

But if we can make some improvements on the CONIKS side, I think we should.

Totally agree. So come back to our issues:

My only concern with this approach is that libotr stores the fingerprints as S-expressions, and I'd rather not start mixing encodings in the client.

I still have a vote for JSON since I think the CONIKS module would never have to deal with the S-expressions files. (oh, I just realized that this reason is a loop, it's not a actual reason).

I was thinking it could be more efficient if we didn't have to (1) keep all known fingerprints in memory all the time, and (2) we didn't need to search through memory to find a specific fingerprint, but just go open the appropriate file when needed.

I just gave a try in 55a2f39101e71e0fcc1ab5ae23d9879c26a966fc. What do you think?

masomel commented 8 years ago

We may misunderstand each other (again) :D. I mean all these changes still happen in the add-on, and I'm not going to do any change in libotr.

Perhaps :) I understood that you're only making changes to the add-on. What I was saying is that the changes I wanted (to not use S-expressions anywhere) would all have to be in libotr, which we're not touching.

I still have a vote for JSON since I think the CONIKS module would never have to deal with the S-expressions files.

I agree that we should use JSON wherever we can. But I thought we said before that the CONIKS module needs to access the otr.fingerprints file, which is an S-expression file. All CONIKS has to do is get the fingerprints from the OtrlUserState struct, right?

masomel commented 8 years ago

Looking at https://github.com/arlolra/ctypes-otr/commit/55a2f39101e71e0fcc1ab5ae23d9879c26a966fc now.

EDIT: I've left several comments in the commit, but I have a few high-level comments. There seem to be some inconsistencies in the naming of functions and objects, so it wasn't always immediately clear to me what the purpose of the function is (or why it's needed if a different module has a function with the same name). With variables, since JS infers the type, it's especially important to give variables meaningful names.

My other comment is that some of the code kind of verbose/awkward at times. I think the reason is that we're using a single policy file for all accounts, while having individual ConiksContext for each account. For example, in coniks.UpdateUserPolicies() you end up having to pass the entire context map to a function that only operates on a single context because you have to save the entire map to disk after the single context update. I would consider storing individual policy files per account so that you can easily map from a policy file to a ConiksContext.

vqhuy commented 8 years ago

EDIT: I've left several comments in the commit, but I have a few high-level comments. There seem to be some inconsistencies in the naming of functions and objects, so it wasn't always immediately clear to me what the purpose of the function is (or why it's needed if a different module has a function with the same name). With variables, since JS infers the type, it's especially important to give variables meaningful names.

Thanks! I'll pay more attention next time.

My other comment is that some of the code kind of verbose/awkward at times. I think the reason is that we're using a single policy file for all accounts, while having individual ConiksContext for each account. For example, in coniks.UpdateUserPolicies() you end up having to pass the entire context map to a function that only operates on a single context because you have to save the entire map to disk after the single context update. I would consider storing individual policy files per account so that you can easily map from a policy file to a ConiksContext.

I'll think about it. I see a issue you pointed here is that we have to save entire map to disk whenever there is a single context change, right? But I think writing entire map content and writing a single element content should not have much difference in I/O (I'm saying about the I/O buffer for reading/writing).

Update: I also fixed naming issue in f9a98f582dc426e6da131151f178754664203345. Thanks for your comments!

masomel commented 8 years ago

Thanks for fixing the naming issues!

I see a issue you pointed here is that we have to save entire map to disk whenever there is a single context change, right?

Right.

But I think writing entire map content and writing a single element content should not have much difference in I/O (I'm saying about the I/O buffer for reading/writing).

Could you please elaborate on this? I'm not sure what you mean exactly.

vqhuy commented 8 years ago

Could you please elaborate on this? I'm not sure what you mean exactly.

Sorry for not being clear. I think your concern is that it would have some performance issues if we write entire the map to the disk, even if we just have a single change. So I understand that the performance is related to the buffer for I/O writing (buffer cache or kind of this ...), I mean there should not have much difference performance here. What do you think?

vqhuy commented 8 years ago

I think this pull is already for another review.

masomel commented 8 years ago

LGTM so far!

vqhuy commented 8 years ago

When I work on the registration function, I think the ConiksContext should also contain account's fingerprint as well. It's easier when we send the registration message (just need to read the fingerprint from _contexts), and when we verify the tb response (comparing the returned value with account's fingerprint). Thus, I think it makes more sense to remove the coniksPolicies and move all the file i/o handler to ConiksContexts (with extra 's'), to reflects exactly what it does. The change is in 0244fa5931b1e53f8eb5b0c27fee6ac9fe2e2220. What do you think? @masomel @arlolra

masomel commented 8 years ago

When I work on the registration function, I think the ConiksContext should also contain account's fingerprint as well.

This makes sense to me, so the CONIKS client then doesn't have to deal with any S-expression files either, right? The only issue I see is that the fingerprint is stored twice now.

This sort of comes back to the discussion we had before about a single contexts file for all accounts vs a context file per account. Do we still expect that the single contexts file will change infrequently?

Thus, I think it makes more sense to remove the coniksPolicies and move all the file i/o handler to ConiksContexts (with extra 's'), to reflects exactly what it does.

I think this also makes sense. I don't know if it will be confusing for a developer to have both the ConiksContext class and the ConiksContexts variable for file IO. But I definitely like the decision you've made here.

vqhuy commented 8 years ago

This sort of comes back to the discussion we had before about a single contexts file for all accounts vs a context file per account. Do we still expect that the single contexts file will change infrequently?

I think I'm not going to store any contacts' fingerprint in our coniks.contexts file, since the contacts' fingerprints are already loaded in OtrlUserState, and all the verifications/comparisons related to lookup function would be performed in memory, without any file IO. Thus, I think the single contexts file still changes infrequently. What do you think?

masomel commented 8 years ago

I think I'm not going to store any contacts' fingerprint in our coniks.contexts file, since the contacts' fingerprints are already loaded in OtrlUserState, and all the verifications/comparisons related to lookup function would be performed in memory, without any file IO. Thus, I think the single contexts file still changes infrequently. What do you think?

I think this is fine!

masomel commented 8 years ago

LGTM. The only question I have is, is it possible to make the distinction between ConiksContext and ConiksContexts more clear?

vqhuy commented 8 years ago

The only question I have is, is it possible to make the distinction between ConiksContext and ConiksContexts more clear?

Is this a naming problem? I'm probably not good at this :(

masomel commented 8 years ago

Is this a naming problem? I'm probably not good at this :(

Good naming isn't easy, so don't worry too much! You said earlier that ConiksContexts has the file IO handlers for the contexts, maybe something like ContextsFile is clearer? I don't know if you then want to rename the loadFromFile to just load and storeToFile to store, so you don't have something like ContextsFile.loadFromFile().

vqhuy commented 8 years ago

I don't know if you then want to rename the loadFromFile to just load and storeToFile to store, so you don't have something like ContextsFile.loadFromFile().

Yup, I changed it in 0373dc448c5f333e29e20f2a8d99cc2faf8d7408. Thank you all for your comments so far!

arlolra commented 8 years ago

There's a bit of an assumption in this patch that all accounts will be created after coniks is enabled (particularly around storing the fingerprint). But let's not fix that now; I like what we have so far.

arlolra commented 8 years ago

Merged in f28786754492e988c92a09e8c66760379b08e4c2

arlolra commented 8 years ago

Nice work Huy! I opened a few tasks for some follow ups.