Closed markus2330 closed 8 months ago
Note: I edited the top command, since user:/%
is not a valid escaped name.
I also thought about and experimented with some alternative APIs:
Only allow setting key name in keyNew
Key * keyNew (elektraNamespace ns, char * parts, ...);
// system:/
keyNew (KEY_NS_SYSTEM, KEY_END);
// system:/elektra/mountpoints
keyNew (KEY_NS_SYSTEM, "elektra", "mountpoints", KEY_END);
// system:/elektra/mountpoints/%
keyNew (KEY_NS_SYSTEM, "elektra", "mountpoints", "", KEY_END);
Pro:
Con:
KEY_VALUE
, KEY_META
, etc. have to translated into explicit function calls.ksNew(10, keyNew (...), keyNew (...), ..., KS_END)
directly.Use unescaped name and size directly:
Key * keyNew (elektraNamespace ns, char * parts, size_t partsSize);
#define keyNewLiteral (ns, parts) keyNew ((ns), (parts), (sizeof(parts)));
// system:/
keyNew (KEY_NS_SYSTEM, NULL, 0);
keyNew (KEY_NS_SYSTEM, "", 1);
keyNewLiteral (KEY_NS_SYSTEM, "");
// system:/elektra/mountpoints
keyNew (KEY_NS_SYSTEM, "elektra\0mountpoints", 20);
keyNewLiteral (KEY_NS_SYSTEM, "elektra\0mountpoints");
// system:/elektra/mountpoints/%
keyNew (KEY_NS_SYSTEM, "elektra\0mountpoints\0", 21);
keyNewLiteral (KEY_NS_SYSTEM, "elektra\0mountpoints\0");
Pro:
KEY_VALUE
, KEY_META
, etc.keyNew
is more or less a single memcpy
Con:
parts
and partsSize
must match, macro only works with string literals.More complicated API:
Key * keyFromParts (elektraNamespace ns, const char ** parts);
#define elektraKey(ns, ...) keyFromParts ((ns), ((const char *[]){ __VA_ARGS__, NULL }))
KeySet * ksFromArray (size_t alloc, Key ** keys);
#define elektraKeySet(alloc, ...) ksFromArray ((alloc), ((Key *[]){ __VA_ARGS__, NULL }))
See here for example usage (ignore the obvious memory leaks).
Pro:
ksFromArray
allows using a single memcpy + qsort
, which should be faster than calling ksAppendKey
repeatedly (essentially insertion sort)Con:
KEY_VALUE
and KEY_META
are still difficult to implement via macros, for easy calling of elektraKey
in elektraKeySet
Variant of 3
typedef Key MetaKey;
Key * keyCreate (elektraNamespace ns, const char ** parts, const char * value, MetaKey ** metaKeys);
static inline MetaKey * keyCreateMeta (const char ** parts, const char * value) {
return keyCreate(KEY_NS_META, parts, value, NULL);
}
KeySet * ksFromArray (size_t alloc, Key ** keys);
#define elektraArray(type, ...) ((type[]){ __VA_ARGS__, NULL })
See here for example usage (ignore the obvious memory leaks).
Pro:
ksFromArray
allows using a single memcpy + qsort
, which should be faster than calling ksAppendKey
repeatedly (essentially insertion sort)Key
consists of namespace, parts, value and metaKeys and a KeySet
is a container of keys.Con:
keyCreateMeta
helps but it's still very clunkyNote: elektraArray
could be put into an internal header and we recommend to create a separate macro for external code, if e.g. keyCreate (KEY_NS_SYSTEM, (const char *[]){ "elektra", NULL }, NULL, NULL)
is to verbose for the external user.
I don't have strong preference for any of these APIs. My only wish is that keyNew
doesn't use escaped names. The validation, canonicalization and unescaping process is quite expensive and replacing to with a few dumb memcpy
would likely improve performance. Of course, I don't have benchmarks to prove this with any significance, but I think in this case we can go on gut-feeling (complicated analysis vs. memcpy
seems quite obvious).
In the ideal case, we could even move the escaped name stuff into separate library.
Yes, I agree it would be great if the escaped name stuff could be kicked out of the core. I am not sure if it is realistic, though. Aren't there too many clients needing some way to give more than one name part via one char*
? E.g. what about keyAddName
?
Only allow setting key name in keyNew
I like the idea best from the options but why limit it to setting key name? Everything else could be simply moved to the beginning:
keyNew (KEY_VALUE, "5", KEY_META, "8", KEY_NS_SYSTEM, "elektra", "mountpoints", KEY_END);
what about
keyAddName
?
keyAddName
(possibly renamed) could be moved to the new library. I think for most clients (that don't involve user interaction) keyAddBaseName
should be good enough.
The exception of course are non-canonical names e.g. keyAddName (k, "..")
or keyAddName (k, "#10")
:
keyAddBaseName
and elektraWriteArrayNumber
or we could also add a special keyAddArrayPart
that takes an integer.keyAddName (k, "..")
is much better handled by keySetBaseName (k, NULL)
keyAddName (k, "../..")
(or more ..
) can be handled by multiple keySetBaseName
or we could add a new keyRemoveParts (k, n)
that removes n
parts from k
.The only reason I can think of that the escaped name would be needed, is user interactions (printing, user input, etc.). In that case linking against a second library should be acceptable.
Everything else could be simply moved to the beginning
I find this version quite hard to read. If there are a lot of meta keys, the key name will be very far down the invocation of keyNew
. On top of that (especially with our clang-format settings) it might be hard to spot where the key name starts, if it spans across a few lines.
Take for example this invocation from examples/opts.c
:
keyNew (SPEC_BASE_KEY "/emptydirs", KEY_META, "description", "remove empty directories", KEY_META, "opt", "d", KEY_META, "opt/arg", "none", KEY_META, "opt/long", "dir", KEY_END)
That would translate to:
keyNew (KEY_META, "description", "remove empty directories", KEY_META, "opt", "d", KEY_META, "opt/arg", "none", KEY_META, "opt/long", "dir", KEY_NAME, KEY_NS_SPEC, "sw", "org", "erm", "#0", "current", "emptydirs", KEY_END)
(Note: we need a separate KEY_NAME
option otherwise the KEY_NS_*
and KEY_*
values could collide).
As a second wish I would add: I'd prefer to not have a var-args based API. It is not only quite error-prone to use, but it is also pretty clunky to implement and maintain. I also don't know what implications var-args have on ABI compatibility.
keyAddName (possibly renamed) could be moved to the new library. I think for most clients (that don't involve user interaction) keyAddBaseName should be good enough.
For hierarchical storage plugins, I agree. But I am afraid at the moment nearly every other client needs some way to construct a key name from a single string (e.g. we saw such a code recently in qt-gui). I do not see a realistic way how to rewrite all of this. Do you have any idea about how to do the migration?
The only reason I can think of that the escaped name would be needed, is user interactions (printing, user input, etc.). In that case linking against a second library should be acceptable.
I agree that in these cases linking is no problem. I am not so optimistic, that these are the only cases though. E.g.:
keyAddName
.This is only a small set of examples in a quick look-through. It looks like extensive work to fix all of this. And in most cases, the library would be needed. And we already have quite many libraries, one more that would need to get linked virtually anywhere seems to be a bit pointless.
Especially, also all the bindings need it, as they also all provide a way to create a key name from a single string. E.g. the cpp binding up-to-now by definition only exposed core functionality.
I find this version quite hard to read
I agree.
As a second wish I would add: I'd prefer to not have a var-args based API. It is not only quite error-prone to use, but it is also pretty clunky to implement and maintain. I also don't know what implications var-args have on ABI compatibility.
Yes, also bindings cannot interact with the vaargs interface (except of invocations with pre-defined list of parameters). This was a big problem in the beginning but a good solution was found with keyVNew and KEY_FLAGS.
So at the moment we are quite at a sweet spot with vaargs that both:
So going away from the sweet spot only makes sense if:
we saw such a code recently in qt-gui
The Qt-GUI is user-interaction, but yes I see that linking another library just for keyAddName
might not be a good solution. However, we could still move anything that uses escaped names out of kdb.h
(and explicitly not include any headers that use escaped names in kdb.h
) to encourage people to avoid escaped names.
in some storage plugins where "/" should be interpreted as hierarchy, we simply pass the name using keyAddName.
In those cases it would actually be better to do what quickdump
does and copy into a common buffer and then call keyNew
.
even the high-level API uses keyAddName to construct new key names using always the same lookup key
That is a good point, although at least the generated version of the high-level API should be moved to unescaped names, once the remove the escaped name from struct _Key
.
So at the moment we are quite at a sweet spot with vaargs that both:
I think the version with arrays would work much better for bindings.
we have a good plan how to actually do the labor-intensive migration
I don't think there will ever be a good plan for migrating such a core API as keyNew
. The only solution is another huge PR like #3555. The task would be a bit simpler, since finding all invocations of keyNew
is a lot simpler than finding all uses of key names (just rename keyNew
to keyNew2
in kdb.h
and see what doesn't compile/link).
But since we are just dealing with basic function invocations, it should theoretically be possible to write an upgrade script. This series of articles seems to describe how it could be done with clang tooling. In fact just by using clang-query
it's pretty easy to find all invocations of keyNew
.
In those cases it would actually be better to do what quickdump does and copy into a common buffer and then call keyNew.
Why would that be better? If you use keyAddName
you do not have any hassles with memory management.
I think the version with arrays would work much better for bindings.
I do not see how this could improve the current bindings? And there is already a lot of knowledge how to make bindings for the API as-is, I do not see a need for improvement.
I don't think there will ever be a good plan for migrating such a core API as keyNew
About the initial proposal: The name KEY_BASE_NAME
, might better describe what it does and is more consistent with keyAddBaseName
. That it will get added is obvious. But actually adding this feature is very low priority, as it only makes keyNew, that is already a feature-monster, even more bloated.
I like the plan to first fix the leftovers of the keyname overhaul, release that and then get rid of stuff (like key
in _Key
) without adding even more stuff.
Why would that be better? If you use
keyAddName
you do not have any hassles with memory management.
It's C, you always have to deal with memory management. But to give a bit more context:
keyDup (parent); keyAddName; keyDel
or similar sequences. You can instead to a single keyNew
.I do not see how this could improve the current bindings?
It might not improve current bindings, since they are already written. But as you said, bindings for varargs functions are hard to write (sometimes even impossible, e.g. the go bindings needs custom C wrappers). Calling a function that takes an array, will always be possible.
I do not see a need for improvement
There may not be a need, but it would be positive side effect.
I also don't see any positives in using var-args, except shorter code. But shorter code at the expense of type safety and compile time errors, is not something I'm a fan of.
The name
KEY_BASE_NAME
, might better describe what it does and is more consistent withkeyAddBaseName
Could you add some examples for adding multiple key name parts. e.g. what's the replacement for keyNew ("system:/elektra/mountpoints", KEY_END)
?
The name KEY_BASE_NAME
would only make sense to me, if you have to use it before every key name part. So it would be:
keyNew ("system:/", KEY_BASE_NAME, "elektra", KEY_BASE_NAME, "mountpoints", KEY_END);
That it will get added is obvious.
To me and you it might be, but I wouldn't say that's the case in general. It could just as well, be the case that the above like results in system:/mountpoints
, because the second KEY_BASE_NAME
overrides the first (i.e. corresponds to keySetBaseName
).
I like the plan to first fix the leftovers of the keyname overhaul, release that and then get rid of stuff (like key in _Key) without adding even more stuff.
Yes, of course. This issue should be solved pre-1.0, but definitely not for the next release.
It's C, you always have to deal with memory management.
I mean something like:
I also don't see any positives in using var-args, except shorter code.
Yeah, the idea was to have the possibility of (generated) C-Code which you can include and assign to a KeySet
variable easily. I don't know if it was really worth all the hassle but now it is done and works (see also c
plugin). I am not sure if it is a selling point, but it has some elegance.
Could you add some examples for adding multiple key name parts
Yes, I changed it in the top post.
The name KEY_BASE_NAME would only make sense to me, if you have to use it before every key name part.
Yes, exactly, this was the idea.
To me and you it might be, but I wouldn't say that's the case in general.
I added the ADD_
now.
I mean something like:
Yes, in this case the memory management is hidden inside the library.
Sidenote: That line will actually fail horribly, if elektraNi_GetName
returns an invalid escaped name (suffix). The key name of k
won't be changed and the ksAppendKey
a few lines down will just silently replace the parent key in the KeySet. Using a single keyNew
without checking for errors will at least segfault, because of the returned NULL
and not potentially corrupt the data in the KDB.
Yeah, the idea was to have the possibility of (generated) C-Code which you can include and assign to a KeySet variable easily.
Option 4 in https://github.com/ElektraInitiative/libelektra/issues/3529#issuecomment-738738878 fulfils this requirement as well. It's not as compact, but has more type safety. Most of all, if you use the elektraArray
macro, you can never forget sentinel values anywhere.
That line will actually fail horribly
Yes, this is in the currently sad topic of error handling, I hope Stefan (who will work on the API) will find nice guidelines for this.
Segfault is definitely not a good option but failing is also not a good option, as it would make it impossible to fix any errors #1511.
Option 4 in #3529 (comment)
But it uses vaargs in macros, which is even worse for bindings than the current situation.
failing is also not a good option, as it would make it impossible to fix any errors
In this case it is the correct option. If the file contains invalid key names, you either edited the file manually (in which case you can do that again to fix the problem) or the plugin is completely broken and actually produced the invalid key name (in which case there is no chance of fixing the key name anyway).
But it uses vaargs in macros, which is even worse for bindings than the current situation.
True macros are very bad for bindings. But macro is only there to add the NULL
. The API totally works without the macro as well and maybe it's eve better without the macro.
I just think that the array API is much more versatile, since you can construct arrays dynamically while var-args and va_list
are pre-defined at compile-time.
The API totally works without the macro as well and maybe it's eve better without the macro.
This changes everything, then at least the part with the ksNew
replacement is a very interesting option. I like the idea with sorting and dynamically-sized key sets. It would be also interesting for build-in specs.
Can you give an example how it would look like without the macro?
pre-defined at compile-time.
Which would not be a bad thing if the compiler could optimize something but in this case it is actually making it slower as qsort is not possible on va_list
.
Can you give an example how it would look like without the macro?
Basically, ignore the macro and write the array casting bit yourself. I added an example to the showcase (with_macros
vs without_macros
).
The thing I don't like without the macro is, that it is far too easy to forget the NULL
at the end of the array. In e.g. C++ this wouldn't be an issue, since we could use initializer_list
which has a size
field. That's why I would add additional versions of keyCreate
/ksFromArray
that take the sizes as arguments instead of (just) looking for NULL
. These could of course be used from C, but would mainly be intended for bindings.
Basically I would do:
Key * keyCreate (elektraNamespace ns, const char ** parts, const char * value, const MetaKey ** metaKeys) {
return keyCreate2 (ns, parts, 0, value, metaKeys, 0);
}
Key * keyCreate2 (elektraNamespace ns, const char ** parts, size_t partCount, const char * value, const MetaKey ** metaKeys, size_t metaKeyCount);
keyCreate2
takes at most partCount
parts from parts
, if partCount > 0
and always stops at the first NULL
in parts
. (same for metaKeys
)
A similar thing would be done for ksFromArray
.
The thing I don't like without the macro is, that it is far too easy to forget the NULL at the end of the array.
And it doesn't yield any warning/error? Is there a possibility to make a sentinel for that?
If we really would like to add a macro to the API it would need to be called ELEKTRA_CREATE_KEYSET
or similar. But as this type is needed as first argument, the macro is also not ideal.
Am I correct that the API would be C89 but to actually create a KeySet within one expression C99 is needed?
keyCreate2
I am against having a duplicate of a function in the 1.0 API. The *2
approach is at best an ugly hack for mistakes of the past.
And it doesn't yield any warning/error? Is there a possibility to make a sentinel for that?
AFAIK there is no way to tell the compiler to check for NULL
in an array. I wouldn't even know how that would work. We are talking about C after all... The only reason we need the NULL
is that without it nobody knows where the array ends.
If we really would like to add a macro to the API it would need to be called ELEKTRA_CREATE_KEYSET or similar. But as this type is needed as first argument, the macro is also not ideal.
Like I said, at the beginning. We could add the macro into an internal header and recommend users create their own, if they need one.
Am I correct that the API would be C89 but to actually create a KeySet within one expression C99 is needed?
Yes, the (const char *[]){ "elektra", NULL }
syntax is C99+ only. But AFAIK we said C99 is required for some parts already and will be the minimum version for Elektra 1.0. I really don't think anybody still uses a compiler that doesn't have C99 support. At least not for Elektra.
I am against having a duplicate of a function in the 1.0 API. The *2 approach is at best an ugly hack for mistakes of the past.
The name keyCreate2
was just a placeholder. Something like keyCreateInternal
might be more suitable. It better suggests that the API is intended for used internally (e.g. within bindings) and not directly by the user.
Of course we could also just use the keyCreate2
directly as keyCreate
:
Key * keyCreate (elektraNamespace ns, const char ** parts, size_t partCount, const char * value, const MetaKey ** metaKeys, size_t metaKeyCount);
But creating a simple key would be quite a long function call:
// system:/
keyCreate (KEY_NS_SYSTEM, NULL, 0, NULL, NULL, 0);
// system:/elektra
keyCreate (KEY_NS_SYSTEM, (const char *[]){ "elektra", NULL }, 0, NULL, NULL, 0);
keyCreate (KEY_NS_SYSTEM, (const char *[]){ "elektra" }, 1, NULL, NULL, 0);
Maybe the solution is a more nested approach (optional macros for usability, static inline
modifier also optional):
#define keyNew (ns, ...) keyCreate (ns, (const char *[]) { __VA_ARGS__, NULL })
Key * keyCreate (elektraNamespace, const char ** parts);
static inline Key * keyWithValue (Key * k, const char * value) {
keySetString (k, value);
return k;
}
#define keyWithMeta (k, ...) keyWithMetaArray (ns, (Key *[]) { __VA_ARGS__, NULL })
static inline Key * keyWithMetaArray (Key * k, Key * metaKeys) {
ksAppend (keyMeta (k), ksFromArray (0, metaKeys));
return k;
}
#define ksNew (alloc, ...) ksFromArray (ns, (Key *[]) { __VA_ARGS__, NULL })
KeySet * ksFromArray (size_t alloc, Key * keys);
This could be used like this:
KeySet * ks = ksNew (1,
keyWithMeta (
keyWithValue (keyNew (KEY_NS_SYSTEM, "elektra", "mountpoints"), "value"),
keyWithValue (keyNew (KEY_NS_META, "comment"), "comment value"),
keyWithValue (keyNew (KEY_NS_META, "type"), "string")
)
);
To create (ni syntax):
system:/elektra/mountpoints = value
[system:/elektra/mountpoints]
comment = comment value
type = string
It's a bit more verbose for keys with value and metadata, but it is more compact and easier to use for keys with just a name.
And I what to stress again: I'm just throwing ideas out there. The only preferences I have for the new API are "preferably no varargs" and "no escaped names".
In https://github.com/ElektraInitiative/libelektra/pull/3606#issuecomment-753629132 the idea for a keyEmpty()
function came up. @markus2330 responded:
We can add something like this in private headers or in libease. But I do not think it should be in the public core API. The proper name would be keyNewCascading and then people ask why there is no keyNewUser and so on...
I disagree that this would be the "proper" name. The idea behind keyEmpty()
is that it creates the most basic Key
possible, containing no data whatsoever.
I think this should be part of the public API for multiple reasons:
keyDup (k)
would be keyCopy (keyNew ("/", KEY_END), k, ~0)
. Which is much harder to understand. keyCopy (keyEmpty(), k, ~0)
seems a lot more clear ("copy k into an empty key" = "duplicate k").keyNew
is implemented (whichever API we choose), things will just get worse. keyNew ("/", KEY_END)
could become one of:
keyCreate (KEY_NS_CASCADING, NULL, 0, NULL, NULL, 0)
keyNew (KEY_NS_CASCADING)
keyFromParts (KEY_NS_CASCADING, NULL)
keyNew (KEY_NS_CASCADING, KEY_END)
keyNew (KEY_NS_CASCADING, NULL, 0)
"/"
and KEY_NS_CASCADING
are the "correct" names for an empty/dummy Key
. This matters, because this could be optimised. Using the old API keyNew ("/", KEY_END)
would only allocate a new struct _Key
, but no key names, values or metadata. Theoretically we could go even further and make keyNew ("/", KEY_END)
return a static instance. This would of course require changing all code that modifies a key (keySetString
, keySetName
, etc.) to clone this static instance when needed, similar to what we do with KEY_FLAG_MMAP_*
.The idea behind keyEmpty() is that it creates the most basic Key possible, containing no data whatsoever.
But it contains the name of the root of the cascading namespace. This is imho not empty. In general we are prone to introduce too much terminology. I think we should drop the whole idea of "empty keys", we now ensure there is always a name and can be proud of this :wink:
Maybe we can call it keyNull()
: it has a name but the data is null. Not really thrilled about it, though. Additionally, I do not really understand the timing: now you renamed all the occurrences to keyNew ("/", KEY_END)
and you want to rename them again? For me, with the current API keyNew ("/", KEY_END)
feels to be the correct way.
keyEmpty()
is just a name. I'd be fine with another name like keyNull()
as well. (Although I don't quite understand how Null vs. Empty makes a difference).Key
created by keyEmpty()
is as empty as it can be. A Key
cannot have no value and it cannot have no name. The closest you can come is keyNew ("/", KEY_END)
(or maybe from a user POV keyNew ("/", KEY_BINARY, KEY_END)
so that keyValue
returns NULL
).Key
without a name. If I don't care about the name, the correct one to use (for future optimization purposes) is "/"
. To make that more explicit a function like keyEmpty()
would be helpful.keyNew ("/", KEY_END)
immediately. The switch to keyEmpty()
would be made together with the implementation of this issue, which is about a completely new API for keyNew
. So adding keyEmpty()
wouldn't really add any work. (Sidenote: sed 's~keyNew ("/", KEY_END)~keyEmpty()~g'
should work, so it really doesn't matter when we do it)"Null key" is an established term yes. But this actually a reason to not call the function keyNull()
. The point is not creating a key without a value. The point is a simple solution for the situation: "I just want an instance of Key
. I don't care what it contains." Maybe keyBare()
or keyBlank()
? ("bare" or "blank" because the resulting Key
contains the bare minimum bytes, i.e. it is essentially blank)
it gives you two options to do the same thing.
Because it is an alias, I don't see it as too options. The idea is that keyEmpty()
(or whatever the name will be) will always be defined as:
static inline keyEmpty() { return keyNew ("/", KEY_END); }
This line would just be added to kdb.h
and thereby the whole line (including the definition of keyEmpty()
) would become public API.
Yes, there are two options in the sense, that users can write keyEmpty()
or keyNew("/", KEY_END)
. But the point is we guarantee that this will always result in the same behaviour, so it is purely a matter of preference and not a meaningful decision.
The statements "I just want an instance of Key. I don't care what it contains." and to provide the very strong guarantee "that [two statements] will always result in the same behaviour," are very contradicting goals. Together with issues in terminology, I think we should drop the idea to make such an alias.
Where exactly do you see the contradiction?
"I just want an instance of Key. I don't care what it contains." implies a call to keyNew
since that is the only way to create an instance of Key
. There are no other implications since we don't care what the instance contains.
"that [keyEmpty()
and keyNew("/", KEY_END)
] will always result in the same behaviour," is a completely separate concern. First of all, this can only be guaranteed as long as keyNew("/", KEY_END)
is valid. If the keyNew
's signature changes, this guarantee changes too (e.g. replace keyNew("/", KEY_END)
with keyNew(KEY_NS_CASCADING)
).
The guarantee can simply be achieved by adding
static inline keyEmpty() { return keyNew ("/", KEY_END); }
to kdb.h
and never changing this line. (again, unless keyNew
's signature changes)
I really don't see what the huge problem is, but since you seem absolutely convinced that adding keyEmpty()
is a bad idea, I guess we won't do it.
Intuitively it sounds like an empty Key should look something like keyNew("", KEY_END)
(so without the "/"). Because of this it would be good to have keyEmpty()
to make it very clear how to get an empty key.
But re-thinking my own code.. are there any situations where you really need an empty key? Thinking about it I think I could rewrite everything to just create/initialize the Key at the right point and avoid empty Keys completely. Am I overlooking something?
But re-thinking my own code.. are there any situations where you really need an empty key?
I hope you never really need it, as it does not exist anymore. As every key now has a name, only the value can be "empty", we call these keys "null keys", though.
an empty Key should look something like
keyNew("", KEY_END)
That doesn't work since #3555, because ""
is not a valid (escaped) key name anymore.
are there any situations where you really need an empty key?
The most common situation currently is when a Key
is used for error reporting only (e.g. kdbOpen
).
Once we remove escaped names and change keyNew
(that's the whole reason why I started the discussion in this issue), however, the most common situation by far will be what is now solved by keyNew (keyName (k), KEY_END)
. This has to be replaced by keyCopy (keyNew ("/", KEY_END), k, KEY_NAME)
(that's why I originally had the idea in #3606).
At least in the code that I wrote, things like keyNew (keyName (k), KEY_END)
happen quite a bit.
Maybe another solution would be to allow keyCopy (NULL, k, flags)
as and alias for keyCopy (keyNew ("/", KEY_END), k, flags)
.
Maybe another solution would be to allow keyCopy (NULL, k, flags) as and alias for keyCopy (keyNew ("/", KEY_END), k, flags).
I like this much more.
I mark this issue stale as it did not have any activity for one year. I'll close it in two weeks if no further activity occurs. If you want it to be alive again, ping the issue by writing a message here or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart:
@markus2330 is a change like this still in discussion? For some versions automatic refactoring may be possible.
Also, with the new idea of a libelektra-operations
library, the best variant for a minimal core API might be a basic:
// returns a new Key with the name set to "/"
Key * keyCreate (void)
// returns a new KeySet with an initial allocation size of `alloc`
KeySet * ksCreate (size_t alloc);
It would of course mean that creating Keys and KeySets with just libelektra-core
would need more code and couldn't be done inline. But it means that every alternative implementation can easily match this API (no variadic args needed). Additionally they can provide another API on top that is more idiomatic to the language.
For C this means providing a version of keyNew
(either one with variadic arguments or maybe we macros) in libelektra-operations
. As a first step keyNew
would just be moved from core to operations. But after that we may want to think about a new API again to avoid escaped names.
For Rust it would probably be a builder-style interface, or something using macros.
As a bonus this also makes bindings for libelektra-core
easier. Many languages struggle with C variadic arguments, but can easily bind to the function suggested above. In fact, most of our current bindings try to ignore the variadic arguments and just pass a keyname and then call setters for value and metadata.
It would be quite interesting, especially because VAList is a bit iffy in Rust, although I got it to work now I still have some issues with it (such as reading memory from stack which I shouldn't be reading). The simplest way, which I think also would work would be just removing the variadic arguments and only having name as a required parameter, all other properties of the Key should be manipulated by the respective functions (keySetValue()
, ...).
Just passing the name (and separately the namespace) was my idea at first too, since every Key
must have a name. However, this means one of three things needs to be the case:
NULL
at the end. With variadic arguments, the compiler can enforce the sentinel NULL
value, but we wanted to avoid var-args. More importantly, this now means that we need to assemble the unescaped name from the parts. This involves multiple memcpy
s and either multiple malloc
s or multiple iterations of the parts list."foo\0bar"
): The best of a bunch of bad options. We still need a separate argument for the size of the string, but at least it would now be a single malloc
and memcpy
to store the actual key name.So I just concluded, put the absolute bare minimum into libelektra-core
and do something specific for each language.
Note: All of this assumes, that we only store the unescaped name.
PS. In Rust you could still do something like below on top of the basic libelektra-core
interface. This is essentially option 3, but much nicer, because Rust knows the length of a slice.
trait Key {
fn new(ns: Namespace, name: &[u8]) -> Self;
fn set_value(&mut self, value: &[u8]) -> Self;
fn set_meta(&mut self, name: &[u8], value: &[u8]) -> Self;
}
Note: I haven't done that much in Rust, so the code might not work (error handling is definitely missing). But the syntax should be correct enough to get the idea across.
In this case the name is also &[u8]
? I always assumed the name to be a string. Or is this something you want to change as well?
Yes, in Rust I would go for something like that. I really dont like having to pass a length parameter, I assume this is due to the name param no longer being a string? Are there any upsides I am missing in changing name to be a binary value as well?
In this case the name is also
&[u8]
? I always assumed the name to be a string. Or is this something you want to change as well?
This is a consequence of me wanting to avoid the escaped name as much as possible (since unescaping can be expensive). Since the unescaped name contains \0
bytes &[u8]
is probably more correct.
I assume this is due to the name param no longer being a string? Are there any upsides I am missing in changing name to be a binary value as well?
Exactly, with the escaped name we can just pass a zero-terminated char *
. But for the unescaped name this doesn't work. We need to know the length and C this needs another parameter.
The advantage of using the unescaped name directly is that in the ideal case (which would be possible with the Rust API), you just need to copy the name into the struct with a single memcpy
.
For most escaped names the unescaped name isn't too expensive either (*), but when you have something like user:/foo/bar/../../#100/../foo\/bar/#200
instead of just user:/foo\/bar/#__200
, then it gets expensive.
(*) the name still needs to be iterated multiple times and is copied character by character instead of as a single memcpy
.
@markus2330 is a change like this still in discussion? For some versions automatic refactoring may be possible. We pass the unescaped name as as single string with embedded zero bytes (e.g. "foo\0bar"): The best of a bunch of bad options.
:+1:
We still need a separate argument for the size of the string, but at least it would now be a single malloc and memcpy to store the actual key name.
Not if we terminate with two \0
?
So I just concluded, put the absolute bare minimum into libelektra-core and do something specific for each language.
:+1:
Not if we terminate with two
\0
?
Empty key name parts are allowed and appear as \0\0
in the key name. In general a sequence of N \0
bytes will result in N-1 empty parts.
To restate the definition: A key name is an arbitrary sequence of bytes that fulfils exactly three conditions:
\0
.\0
. (This is automatically added by the implementation, so shouldn't be passed in)Note: AFAIK it is not currently implemented like this. I think there is still an issue (didn't find it), for the fact that namespace roots have 3 bytes right now (namespace, and two zero bytes).
That is why I proposed the Rust API new(ns: Namespace, name: &[u8])
. This directly shows that we need a namespace and an arbitrary sequence of bytes. The \0
bytes from 2 and 3 above, are always there and will be added by the implementation of new
, there is no reason why they should be passed in.
Sorry, yes, two \0 is not enough but it needs to be a special magic sequence not allowed otherwise, i.e. @elektra@end
or similar. The same problem exists for keySetName. Let us discuss this today.
Discussion result: The end marker is bad for developer experience (annoying to type), creates bloat in the binary (you need it in every keyNew
call), makes the source code harder to read and adds an unnecessary restriction. We will instead use a separate size argument in the C API. Other languages should use a String or array-like type with a length field, if available.
I mark this stale as it did not have any activity for one year. I'll close it in two weeks if no further activity occurs. If you want it to be alive again, ping by writing a message here or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart:
See #4715 and #4805
I mark this stale as it did not have any activity for one year. I'll close it in two weeks if no further activity occurs. If you want it to be alive again, ping by writing a message here or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart:
I closed this now because it has been inactive for more than one year. If I closed it by mistake, please do not hesitate to reopen it or create a new issue with the remainder of this issue. Thank you for your contributions :sparkling_heart:
If we decide to keep vaargs for keyNew, we should consider adding something like
KEY_ADD_BASE_NAME
for constructions of special names without the need to know about the escaping rules, e.g.:keyNew("system:/", KEY_ADD_BASE_NAME, "elektra", KEY_ADD_BASE_NAME, "mountpoints", KEY_END)
forsystem:/elektra/mountpoints
.