TokTok / c-toxcore

The future of online communications.
https://tox.chat
GNU General Public License v3.0
2.25k stars 284 forks source link

Make a new portable simple standard tox save format #222

Open iphydf opened 7 years ago

iphydf commented 7 years ago

See #215 for a related discussion.

isotoxin commented 7 years ago

Yet another external dependency? Oh no! Please! Just write simple json writter/reader. It will also be useful for some metadata transfer between clients. And tricky question: how do you provide support this enchancement by other clients? As I know, no one client uses TokTok (Isotoxin uses something own with mixing code from original and toktok)

alexbakker commented 7 years ago

I think this format should provide a simple key-value store where clients can store things like the username, the online status, the status message and a list of friend aliases (see also: #112). Some standard keys would then be defined by the TCS to ensure client interoperability.

sudden6 commented 7 years ago

I'll throw in https://github.com/TokTok/c-toxcore/issues/112#issuecomment-257151226

Zer0-One commented 7 years ago

@isotoxin It seems wrong to try to implement a JSON parser in a library responsible for handling secure communications, as you're just widening attack/bug surface. If, for example, you wanted a JSON config, then better to defer that to a well-maintained library which was meant to handle JSON (and only handle JSON). https://github.com/akheron/jansson/ seems like an ideal candidate for the JSON case; it's just as portable as toxcore, and is even suited for embedded platforms.

sudden6 commented 7 years ago

After reading the docs I agree that the current format is horrible and must die.

To minimize the stuff needed to implement for a minimal client I think the file should be split into two parts:

The basic part would contain the minimum amount of data to transfer friends(to be decided) across clients in a simple format (maybe even binary) and must not need an external library to parse.

The optional part would be appended to the basic part and is JSON (or whatever) encoded. Some keys needed for every client, like aliases, last time online,...

Discuss.

master-passeli commented 7 years ago

I like the idea of JSON format since it is easy to read (for human) and is cross platform safe by it self. Indeed it requires one additional and external library but atleast in some cases clients can use the same library to handle they own part of config handling (optional part of config). Also, i'm having hard time thinking case where handling of cross platform configuration file, possibly containing rich content, is done without external libraries (JSON, msgpack, protobuf, whatever).

I like the sudden6's idea that tox save format could contain two parts. Basic part - Contains essential and all absolutely needed parts of data to get account functional (only part which is really intepretted by toxcore).This basic part can be even versioned so that newer versions of toxcore could have attributes which are not undestood by older versions without problems.

Optional part - Part which can contain client specific information or other information which is not required to get things working. This optional part would be something client can ask from toxcore. Clients could also ask toxcore to store something in this part of config. Any part of optional config should NOT be intepretted by toxcore but core gives it to the client as string. Clients can handle it as they want.

With JSON format, encrypting save file is trivial. Toxcore can encrypt save file by using any commonly used symmetric encryption algorithm with user given password. At the same time, encryption is not needed if user does not want to encrypt his save file and in this case toxcore can skip encryption/decryption phase.

About security of two part idea. If config would contain two separate parts the basic part would be the only one which is really intepretted by toxcore. From this part toxcore reads all attributes to it's internal structures and this part is constructed back when toxcore saves the save file. For optional part of save file toxcore would introduce ONLY get and put functions for client. Those functions are used by clients to get and give they optional config parts. Toxcore should never reveal any JSON library related functions to client side or try to intepret optional parts. However, toxcore should validate JSON syntax of optional part in order to detect case where client is trying to include invalid config section. If i'm right, any invalid section in JSON makes whole JSON structure invalid. This must be detected by toxcore and invalid inclusion must not be alloved. Syntax validation of JSON should be pretty simple, safe and most probably already available by libraries.

Let's think about pros and cons of JSON config file with basic and optional sections: PROS

CONS

After all, I like the idea of JSON format for tox save file. However, implementation of it must be done carefully! More discussion please...

sudden6 commented 7 years ago

inefficient for binary data (friend avatars?)

IIRC avatars aren't stored in the *.tox file right now and I think it would be a really bad idea to store that much data in the config file.

Basic and optional splitting introduces security concerns

Why?

If clients can freely include anything they want to the optional part of config save file may become huge pile of carbage (see pros. -> save simple version).

Yep, that could grow into a huge problem.

toxcore must syntax validate but not intepret optional parts

Also a possible solution would be to let the client handle JSON parsing with a library by it's choice and just init toxcore with an appropriat config struct/function. But I'm not quite sure this is a good approach, as it's not guaranteed that the output files will be compatible.

Another idea is, to scrap the concept of a "Tox Save File" and just provide a "Tox Export File". Instead of saving/loading everything from the save file on every client start, the client would load the data needed for toxcore operation from it's own database/savefile and only provide a way to import/export the data in the specified format for exchange with other clients. PROS

CONS

All in all I think before tackling the how to save stuff, we should first decide on what to save in this file.

zetok commented 7 years ago

@master-passeli

With JSON format, encrypting save file is trivial.

Encrypting/decrypting is trivial regardless of the data format.


@sudden6

Also a possible solution would be to let the client handle JSON parsing with a library by it's choice and just init toxcore with an appropriat config struct/function.

Err, no, not a solution, but doing things the wrong way.

master-passeli commented 7 years ago

I agree that saving avatars in tox save file might be bad idea but avatar was just an example. I can not think now any binary data we would like to save in save file but in future we might have. Future is a bit hard to predict. :)

Basic and optional parts has some security concerns if optional part is something clients can freely add in. That opens few ways how bugy client can set tox core in danger by giving well crafted or bugy data. I belive that this possibility could be minimized by making sure that parts added by client are not intepreted by toxcore and toxcore syntacticaly validates the parts clients want to include.

@sudden6

I exactly mean that clients can use whatever parsing method for they own config part which is saved into and loaded from file by toxcore. For example qTox does not need any external JSON parser because i think Qt has it's own JSON parser already. Also, if client is written with some other language they could have they own native JSON parsers and C parser library just makes things more complicated there.

I don't like the idea where clients implements they own export file. I'm worried that by that way we just get big number of uncompatible export files and huge mess. This should be toxcore's responsibility and feature.

If this kind basic and optional part separation is too risky at any matter, i still give +1 for JSON format. I have seen way too many bt XML configs or piles of binary .