Add Plaintext SDB Format

radareorg / sdb

Simple and fast string based key-value database with support for arrays and json

https://www.radare.org/

MIT License

Add Plaintext SDB Format #219

Closed thestr4ng3r closed 3 years ago

thestr4ng3r commented 4 years ago

See the comments at the beginning of text.c for more info.

Closing issues

Fix #214

thestr4ng3r commented 3 years ago

I'm not super familiar with SDB internals, but from reading the issues, SDB should be able to automatically detect whether it is loading a text format or a "compiled" one. Where is this done?

Currently nowhere. The regular cdb access is embedded into the Sdb structure itself, somehow using mmap instead of loading, while this is just a parser that loads a file into memory, so it's a somewhat different approach. But magic detection could be built on top.
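To make the contrast concrete, here is a generic POSIX sketch of the "map it instead of loading it" approach. It is not sdb's actual code; the helper name map_readonly is hypothetical and only illustrates why a mapped file is used in place, with no parsing step where format detection would naturally happen.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of sdb: map a whole file read-only.
 * Returns NULL on error or for an empty file. The caller releases the
 * mapping with munmap(ptr, *out_len). */
static const char *map_readonly(const char *path, size_t *out_len) {
	int fd = open(path, O_RDONLY);
	if (fd < 0) {
		return NULL;
	}
	struct stat st;
	if (fstat(fd, &st) != 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd); /* the mapping stays valid after the fd is closed */
	if (p == MAP_FAILED) {
		return NULL;
	}
	*out_len = (size_t)st.st_size;
	return (const char *)p;
}
```

With a mapping like this, the bytes on disk are the data structure, which is why a text parser that builds everything in memory is a different code path.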

thestr4ng3r commented 3 years ago

ping @trufae

thestr4ng3r commented 3 years ago

ping

trufae commented 3 years ago

Some questions/thoughts:

* saving an sdb as plaintext is just sdb foo.sdb > foo.txt
* loading sdb foo.txt shouldn't work with your PR, afaik. That should be automatically detected
* public APIs shouldn't involve FILE*

thestr4ng3r commented 3 years ago

* public APIs shouldn't involve FILE*

Should I make it int fds instead?

thestr4ng3r commented 3 years ago

* saving an sdb as plaintext is just sdb foo.sdb > foo.txt

Currently, not exactly. Right now this goes through the sdb_grep_dump() function in main.c, which can emit different formats, also greps, and outputs without escaping and without namespaces. So re-reading that output will not always work.

If we wanted to use this "real" plaintext sdb format, we would have to first build the result in memory, because of the grepping, then dump it out and add some more code for json. Should we do that?

* loading sdb foo.txt shouldn't work with your PR, afaik. That should be automatically detected

How do we want to do that? CDB, afaik, has no magic. We could, however, make the plaintext format enforce that it must always start with a single / line, but that might get unreliable when a user who doesn't know this adds whitespace before it or, even worse, when you have a cdb that happens to start with /.

trufae commented 3 years ago

What I was thinking of as an alternative is to read the first 32 or 64 bytes from the file and run some checks:

* plaintext if everything is printable and contains at least one = and one newline
* it's binary if there are several aligned null bytes
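
The two checks above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, "aligned" is taken to mean 4-byte-aligned words, and the threshold of two NUL-containing words is arbitrary. None of this is defined by sdb.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: the prefix looks like plaintext if every byte is
 * printable ASCII (tab and CR are tolerated) and it contains at least
 * one '=' and one '\n'. Non-ASCII bytes are rejected in the C locale. */
static bool looks_like_text_sdb(const unsigned char *buf, size_t len) {
	bool has_eq = false, has_nl = false;
	for (size_t i = 0; i < len; i++) {
		unsigned char c = buf[i];
		if (c == '=') {
			has_eq = true;
		} else if (c == '\n') {
			has_nl = true;
		} else if (c != '\t' && c != '\r' && !isprint(c)) {
			return false;
		}
	}
	return has_eq && has_nl;
}

/* Hypothetical check: the prefix looks binary if at least two
 * 4-byte-aligned words contain a NUL byte (assumed threshold). */
static bool looks_like_binary_sdb(const unsigned char *buf, size_t len) {
	size_t hits = 0;
	for (size_t i = 0; i + 4 <= len; i += 4) {
		if (!buf[i] || !buf[i + 1] || !buf[i + 2] || !buf[i + 3]) {
			hits++;
		}
	}
	return hits >= 2;
}
```

A caller would read the first 32 or 64 bytes of the file and pass them to both functions. A prefix that satisfies neither check would have to be treated as unknown.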

trufae commented 3 years ago

The problem with adding a header in cdb is that it breaks the whole fun of just being a raw thing you map in memory, and alignments matter. Otherwise, adding a +4 to skip the header will involve many ugly changes. I was thinking of adding a footer instead, but there was no use for it... yet

trufae commented 3 years ago

I'll submit an API proposal based on this idea of isSdb() {return isBinarySdb()||isTextSdb()}

thestr4ng3r commented 3 years ago

What I was thinking of as an alternative is to read the first 32 or 64 bytes from the file and run some checks:

* plaintext if everything is printable and contains at least one = and one newline
* it's binary if there are several aligned null bytes

That would be quite fuzzy and unreliable so more of a thing for just the user interface maybe. But usually it will probably succeed. For checking plaintext, I would also check for /.

So what do we do about the FILE *s? Just replace them with fds?

thestr4ng3r commented 3 years ago

so wat do

thestr4ng3r commented 3 years ago

Now using fds.
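
For readers wondering what a FILE*-free entry point can look like, here is a minimal sketch of reading a whole file descriptor into memory with plain read(). The helper name slurp_fd is hypothetical and is not the function added by this PR; it only illustrates the fd-based style.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not part of sdb: read everything from fd into a
 * NUL-terminated heap buffer. Returns NULL on error; the caller frees
 * the result. If out_len is non-NULL it receives the byte count. */
static char *slurp_fd(int fd, size_t *out_len) {
	size_t cap = 4096, len = 0;
	char *buf = malloc(cap);
	if (!buf) {
		return NULL;
	}
	for (;;) {
		if (len + 1 >= cap) {
			/* keep one spare byte for the terminating NUL */
			cap *= 2;
			char *tmp = realloc(buf, cap);
			if (!tmp) {
				free(buf);
				return NULL;
			}
			buf = tmp;
		}
		ssize_t n = read(fd, buf + len, cap - len - 1);
		if (n < 0) {
			if (errno == EINTR) {
				continue; /* interrupted by a signal, retry */
			}
			free(buf);
			return NULL;
		}
		if (n == 0) {
			break; /* end of file */
		}
		len += (size_t)n;
	}
	buf[len] = '\0';
	if (out_len) {
		*out_len = len;
	}
	return buf;
}
```

Taking an int fd keeps stdio types out of the public interface, and a caller that already holds a FILE* can still pass fileno(fp).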

thestr4ng3r commented 3 years ago

you commit .tmp file

shit. Thanks for noting.

trufae commented 3 years ago

Sorry, I had written my review without submitting it.