Closed thestr4ng3r closed 3 years ago
I'm not super familiar with SDB internals, but from reading the issues, SDB is supposed to detect automatically whether it is loading a text format or a "compiled" one. Where is this done?
Currently nowhere. The regular cdb access is embedded into the Sdb structure itself, somehow using mmap instead of loading, while this is just a parser that loads a file into memory, so it's a somewhat different approach. But a magic detection could be built on top.
ping @trufae
ping
Some questions/thoughts:
* public APIs shouldn't involve FILE*
Should I make it int fds instead?
* saving an sdb as plaintext is just sdb foo.sdb > foo.txt
Currently, not exactly. What this does right now is call the sdb_grep_dump() function in main.c, which can do different formats, also greps, and outputs without escaping and without namespaces. So re-reading that will not always work.
If we wanted to use this "real" plaintext sdb format, we would have to first build the result in memory, because of the grepping, then dump it out and add some more code for json. Should we do that?
* loading sdb foo.txt shouldn't work with your PR, afaik. That should be automatically detected
How do we want to do that? CDB afaik has no magic. We could, however, make the plaintext format enforce that it must always start with a single / line, but that might get unreliable when a user not knowing this adds whitespace before it or, even worse, you have a cdb that happens to start with /.
What i was thinking as an alternative is to read the first 32 or 64 bytes from the file and follow some checks:
The problem with adding a header in cdb is that it breaks the whole fun of it just being a raw thing you map in memory, and alignments matter. Otherwise adding a +4 to skip the header will involve many ugly changes. I was thinking of adding a footer instead, but there was no use for it... yet
I'll submit a proposal API based on this idea of isSdb() {return isBinarySdb()||isTextSdb()}
What i was thinking as an alternative is to read the first 32 or 64 bytes from the file and follow some checks:
* plaintext if everything is printable and contains at least one = and one newline
* it's binary if there are several aligned null bytes
That would be quite fuzzy and unreliable, so more of a thing for just the user interface maybe. But usually it will probably succeed. For checking plaintext, I would also check for /.
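For reference, the two checks quoted above could look roughly like this. It is only a sketch of the idea, not code from this PR: the function names, the choice of 4-byte alignment, and the threshold of four words containing a null byte are all assumptions made for illustration, and it does not include the extra / check.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: "several" aligned null bytes means at least this many
 * 4-byte aligned words that contain a NUL byte. */
#define MIN_NUL_WORDS 4

/* Hypothetical helper: the probe looks like plaintext if every byte is
 * printable (or common whitespace) and it has at least one '=' and one
 * newline. */
static bool looks_like_text_sdb(const unsigned char *buf, size_t len) {
    bool has_eq = false, has_nl = false;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c == '=') {
            has_eq = true;
        } else if (c == '\n') {
            has_nl = true;
        } else if (!isprint(c) && c != '\t' && c != '\r') {
            return false;
        }
    }
    return has_eq && has_nl;
}

/* Hypothetical helper: the probe looks binary if enough 4-byte aligned
 * words contain a NUL byte. */
static bool looks_like_binary_sdb(const unsigned char *buf, size_t len) {
    size_t hits = 0;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i + 4 <= len; i += 4) {
        if (memchr(buf + i, 0, 4) != NULL) {
            hits++;
        }
    }
    return hits >= MIN_NUL_WORDS;
}

/* Mirrors the isSdb() idea: binary or text. */
static bool looks_like_sdb(const unsigned char *buf, size_t len) {
    return looks_like_binary_sdb(buf, len) || looks_like_text_sdb(buf, len);
}
```

The caller would pass in the first 32 or 64 bytes of the file. As said above, this stays a fuzzy heuristic, so it is probably only suitable for the user interface.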
So what do we do about the FILE *s? Just replace them with fds?
so wat do
Now using fds.
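To illustrate what an fd-based entry point means for callers, here is a minimal sketch of reading a whole file from an int fd with plain read(2). The helper name read_all_fd is hypothetical and is not the function added in this PR; it only shows the shape of an API that avoids FILE*.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not part of SDB: read everything from fd into a
 * NUL-terminated heap buffer. Returns NULL on error; the caller frees
 * the result. */
static char *read_all_fd(int fd, size_t *out_len) {
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    assert(out_len != NULL);
    if (!buf) {
        return NULL;
    }
    for (;;) {
        /* Always keep room for at least one byte plus the final NUL. */
        if (len + 1 >= cap) {
            char *nbuf = realloc(buf, cap * 2);
            if (!nbuf) {
                free(buf);
                return NULL;
            }
            buf = nbuf;
            cap *= 2;
        }
        ssize_t n = read(fd, buf + len, cap - len - 1);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, retry */
            }
            free(buf);
            return NULL;
        }
        if (n == 0) {
            break; /* end of file */
        }
        len += (size_t)n;
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

A caller that already has a FILE* can still use such an API through fileno(), while callers working with open() or pipes no longer need to wrap their descriptor in a stream.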
you committed a .tmp file
shit. Thanks for noting.
Sorry i had my review without submitting it
see comments in the beginning of text.c for more info.
Closing issues
Fix #214