Open epage opened 2 years ago
Comment by kbknapp Thursday Sep 14, 2017 at 14:55 GMT
This is a focus of 3.x so stay tuned. I've been able to get some work done in spurts during lulls in my day job (which have been few and far between this year). Watch the 3x-master branch for details :wink:
Comment by kbknapp Sunday Nov 12, 2017 at 20:07 GMT
Personally, I'd like to move away from storing &str
s inside the Arg
s as the name. This would solve lifetime issues as I want 3.x to be able to remove the Arg|App<'a, 'b>
lifetimes. This could be solved by moving to owned String
s which takes a slight performance hit.
clap has two lifetimes for both Arg
and App
which reference the name, or key of the struct and data related to displaying of the help message or errors. clap uses name
and key
interchangeably.
clap used to only accept &'static str
s because that's what 99% of users use. However, some users do dynamically create args and such, so we moved to &'key str
and &'data str
to refer to the name/key and display data. There reason these two are separate is because tying them to the same lifetime was cumbersome when one wanted to use 'static
for one, but dynamically mess with the other.
clap originally could have used an owned String
to hold both the name and data, however for performance reasons we elected not to. I'm considering moving to String for the name/key as these are typically small, but am reluctant to use String
for the data piece as this could be huge and I have yet to see someone who uses anything other than &'static str
for such data.
I want to get away from using &str
for the name/key as this is preventing several key features like the ability to create no-*
args on the fly.
We could move to Cow<'static, str>
for data piece, but so far all attempts to do this have been a hassle. However, I'm OK with it being a hassle if the end result for users is better.
I'm open to ideas.
Comment by epage Tuesday Jul 06, 2021 at 13:41 GMT
I wonder if kstring
would be a good fit. Internally, its an Enum
with variants for
&'static str
I implemented this for liquid
template engine for generating the data context to pass in when rendering. I had a similar situation where most data was 'static
but not all. I wanted a more ergonomic Cow
. I then also threw in small string optimization to bypass String
where possible, because most of my strings were short.
Comment by pksunkara Tuesday Jul 06, 2021 at 18:56 GMT
@epage Thanks for pointing it out. Do you know if there are any performance changes?
Comment by epage Tuesday Jul 06, 2021 at 21:22 GMT
I just ran some of the benchmarks to double check.
For a rough comparison, it looks like Cow::Borrowed
< KString::from_static
< KString::from_string
(small string, 16 or less) < String::from
< KString::from_string
(large string, 17+)
In terms of memory usage:
---- cow::test::test_size stdout ----
Cow: 32
KStringCow: 32
---- r#ref::test::test_size stdout ----
str: 16
KStringRef: 24
---- string::test::test_size stdout ----
String: 24
Box<str>: 16
Box<Box<str>>: 8
KString: 24
Comment by pksunkara Tuesday Jul 06, 2021 at 21:26 GMT
My bad. Let me rephrase, I am concerned about the performance issues related to running time because of the indirections introduced by kstring or similar such library. Also pure Cow
implementation has that IIUC (https://github.com/clap-rs/clap/pull/2504).
I haven't actually worked with something like this before, so my entire theory can be wrong.
Comment by epage Tuesday Jul 06, 2021 at 21:46 GMT
My first comment was meant to give a rough idea of what runtime performance looks like compared to other options according to kstring
s benchmarks. If there is interest in this direction, I can go implement support and run clap's benchmarks. I didn't want to spend time on something if there wasn't interest.
Comment by pksunkara Tuesday Jul 06, 2021 at 21:55 GMT
You don't have to implement it for clap and run the benchmarks. What I am looking for is a small bench implemented purely on a struct containing a string with large number of iterations doing something which involves accessing the string inside the struct and cloning the struct.
With a lifetime str:
struct Test<'a> {
inner: &'a str;
}
Non-ergonomic with cow:
struct Test<'a> {
inner: Cow<'a, str>;
}
With kstring
Once we have these, if we make sure that the performance is not too bad, we can go ahead with implementation.
If the performance is concerning, we can still implement it but should. provide cargo features to let user pick and choose. I haven't looked into kstring, but hopefully it allows something like this.
Comment by epage Thursday Jul 08, 2021 at 14:35 GMT
https://github.com/cobalt-org/kstring/pull/6 has the code used for the benchmarks as well as the numbers with some basic analysis.
The summary is:
clone
of strings <=16 bytes, kstring
is faster than String
but slower than Cow::Borrowed
clone
of strings >16 bytes, kstring
is slower than Cow::Owned
For clone
there seems to be a 10ns overhead that, if it can be diagnosed and fixed, would close the gap with Cow
. I suspect this is fixable or else we would see this same slowdown on deref.
EDIT: I've not updated my charts above but I have dropped the overhead from 10ns to 5ns by not inlining clone()
. My guess is we were thrashing the icache.
Comment by kbknapp Thursday Jul 08, 2021 at 14:48 GMT
The only "downside" my naive glance at kstring
presents to clap
specifically in regards to the spirit of this issue is that not all strings of the App/Arg/etc structs could be made into kstring
s, i.e. the help/about strings can and often do extend well beyond the 16 bytes mentioned (often into the hundreds or thousands of bytes). So if we still have some lifetimes around for those help/about strings, the signature of the struct ultimately remains unchanged for end users.
Performance is another matter entirely though, and we may well benefit from something other than &str
, even if the struct signature doesn't change.
Comment by epage Thursday Jul 08, 2021 at 14:59 GMT
kstring::KString
overflows from inline strings to the heap, so it can handle any arbitrary string. Its effectively a enum { Static(&'static), Inline(...), Heap(Box<str>) }
.
Comment by pksunkara Friday Jul 09, 2021 at 11:16 GMT
The performance actually looks good to me.
If we were to use this, kstring needs to implement From
for all inputs we want to take. I think &'help str
is missing.
Comment by epage Friday Jul 09, 2021 at 13:52 GMT
kstring::KString
implements From for
&'static str
instead of &str
String
&String
(now, looks like I missed this case before)My assumption is that clap argument types break down as:
&'static
String
or &String
(from std::env:var
, format!
)&String
)&str
If that lines up with others thoughts, then KString
covers the majority cases and makes the optimal case easy. In rare cases, someone will have to reach for KString::from_ref
.
EDIT: If not, I guess expanding the scope of From<&str>
could be put behind a non-default feature flag since it would be expanding what types are expected and only have an impact on performance and only then when going to the heap.
Issue by Rupsbant Thursday Sep 07, 2017 at 13:32 GMT Originally opened as https://github.com/clap-rs/clap/issues/1041
Affected Version of clap
All 2.x. Breaking API change
Expected Behavior Summary
I want to decouple groups of options to several structs. I want to have a pair functions: one that adds arguments to an App with generated names, one that generates a structure with the added arguments.
Actual Behavior Summary
Uncompilable: the types don't allow to create String in a function and return a borrow.
Sample code
Proposed solution
Since clap is dependent on its speed it's probably not an option to force 'String' instead of '&str'. However I think Cow should give enough options.
Alternative for me
Create a struct
AParse
to contain the owned strings, split the functionarguments
in two parts. A can generate anAParse
struct before creating the App object.