Closed MattesWhite closed 4 years ago
Actually, this discussion about `Term<Cow<str>>` has been on my mind lately, and I just pushed some work (39733c4a) in that spirit:

- an alternative to `Cow<str>`, called `MownStr` (enjoy the pun), for "maybe owned string";
- a `MownTerm<'a>` alias for `Term<MownStr<'a>>`;
- a rework of `Resolve` in the process (it is now infallible in most cases);
- `Resolve` can now produce `MownTerm` in addition to `Term<TD2>` (NB: there is no conflict, because `TD2` is bound to `for<'x> From<&'x str>`, which `MownTerm` does not implement).

I think this addresses most of the points you raise above.
Nice work, I'll try to integrate it. Maybe `MownStr` is worth its own crate...?
While refactoring #61 for `MownStr`, I discovered that normalization has the same issue as resolve. Should there be a similar two-way implementation, with an extra one for `MownStr`? I think we already had a similar discussion in #49, where I switched to only `Cow<str>`, which you rejected because it was then no longer possible to use `TermFactory`.
Yep, this also came to my mind. I think it would make sense to have `pub fn normalized(&self, policy: Normalization) -> MownTerm` next to `clone_normalized_with`.
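To make the "two-way" idea concrete, here is a minimal sketch of a borrowing `normalized` next to a factory-based `clone_normalized_with`. It is not sophia's code: `ToyTerm` and `ToyPolicy` are hypothetical stand-ins, ASCII lowercasing stands in for real normalization, and `Cow<str>` is used in place of `MownStr`.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in policy; NOT sophia's `Normalization` enum.
#[derive(Clone, Copy)]
pub enum ToyPolicy {
    Keep,
    Lowercase,
}

/// Hypothetical stand-in for a term holding some text-like data.
pub struct ToyTerm<T: AsRef<str>>(pub T);

impl<T: AsRef<str>> ToyTerm<T> {
    /// Borrows from `self` when nothing changes; allocates only when
    /// the policy actually alters the text.
    pub fn normalized(&self, policy: ToyPolicy) -> ToyTerm<Cow<'_, str>> {
        let s = self.0.as_ref();
        match policy {
            ToyPolicy::Lowercase if s.bytes().any(|b| b.is_ascii_uppercase()) => {
                ToyTerm(Cow::Owned(s.to_ascii_lowercase()))
            }
            _ => ToyTerm(Cow::Borrowed(s)),
        }
    }

    /// Factory-based variant: the caller decides how the target data
    /// is built (e.g. by looking it up in an interning table).
    pub fn clone_normalized_with<TD, F>(&self, policy: ToyPolicy, mut factory: F) -> ToyTerm<TD>
    where
        TD: AsRef<str>,
        F: FnMut(&str) -> TD,
    {
        let n = self.normalized(policy);
        ToyTerm(factory(&*n.0))
    }
}
```

In this sketch the factory variant is built on top of the borrowing one, so the intermediate `String` is only allocated when the text really changes.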
Do you still want to have a `factory` in the mown-version? I think the `_with(factory)` methods are mostly used with `TermFactory`, which is a different approach to avoiding copies than `MownStr`. In addition, we would actually require two factories for `MownStr`: `fn(&'a str) -> MownStr<'a>` and `fn(String) -> MownStr<'_>`.
On further thought, it may even make sense to add those two functions to `TermFactory` and, instead of passing a closure to the `_with(factory)` methods, pass an `impl TermFactory`. When given an owned `String`, a `TermFactory` could reuse the already allocated data to create a new `Rc`, `Arc` or whatever, saving another copy at zero cost. I think it should then also be possible to implement `TermFactory` for `(fn(&str) -> TD, fn(String) -> TD)`.
Note: Requiring that `fn(String) -> TD` is possible would mean that `&str` could not be used as `TD` in such a context.
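A rough sketch of what such a two-function factory could look like follows. The trait `StrFactory` and its methods `from_borrowed` and `from_owned` are hypothetical names chosen for illustration; they are not part of sophia's `TermFactory`. The sketch also shows the implementation for a pair of function pointers mentioned above.

```rust
/// Hypothetical two-way factory: one entry point per kind of input.
pub trait StrFactory<TD> {
    /// Build term data from borrowed text (has to copy or look it up).
    fn from_borrowed(&mut self, s: &str) -> TD;
    /// Build term data from an owned buffer (may take over the buffer).
    fn from_owned(&mut self, s: String) -> TD;
}

/// A pair of plain functions is enough to act as such a factory.
impl<TD> StrFactory<TD> for (fn(&str) -> TD, fn(String) -> TD) {
    fn from_borrowed(&mut self, s: &str) -> TD {
        (self.0)(s)
    }
    fn from_owned(&mut self, s: String) -> TD {
        (self.1)(s)
    }
}

fn box_from_str(s: &str) -> Box<str> {
    Box::from(s) // copies the text into a new allocation
}

fn box_from_string(s: String) -> Box<str> {
    s.into_boxed_str() // takes over the `String`'s buffer
}

/// Example factory producing `Box<str>` term data.
pub fn box_str_factory() -> (fn(&str) -> Box<str>, fn(String) -> Box<str>) {
    (
        box_from_str as fn(&str) -> Box<str>,
        box_from_string as fn(String) -> Box<str>,
    )
}
```

Because `fn(&str) -> TD` has to work for every input lifetime, `TD` cannot borrow from the argument, which is why `&str` itself does not fit as `TD` here.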
> Do you still want to have a factory in the mown-version?
No, of course not! I fixed it in my comment. Thanks.
> it may even make sense (..) instead of passing a closure to `_with(factory)` methods pass an `impl TermFactory`

Well, not currently, because the methods in `TermFactory` rely on the `_with(factory)` methods...
> By passing an owned `String` to a `TermFactory` it could make use of the existing allocated data to create a new `Rc`, `Arc`.
I thought so at first, but actually no: `Rc<str>` (and `Arc<str>`) reallocate the `str` data (together with its ref-counting data) (see this example).

Using `Rc<String>` instead of `Rc<str>` would make this kind of optimization possible, but might introduce additional hidden costs (more memory fragmentation, longer deallocation...). Furthermore, the primary goal of `TermFactory` is to avoid allocations and reuse an existing `TermData` instead, so I'm not sure about the ROI of such an optimization...
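The difference between the two conversions can be observed by comparing buffer addresses. The following is a small sketch using only the standard library; the helper names `into_rc_str` and `into_rc_string` are made up for this illustration.

```rust
use std::rc::Rc;

/// Converts a `String` into `Rc<str>` and also returns the address of
/// the original text buffer, so the caller can compare it afterwards.
pub fn into_rc_str(s: String) -> (usize, Rc<str>) {
    let before = s.as_ptr() as usize;
    // The text is copied into a fresh allocation that also holds the
    // reference counts; the original buffer is then dropped.
    (before, Rc::from(s))
}

/// Same, but wraps the `String` itself: only the small `String` header
/// moves into the `Rc`, the text buffer stays where it was.
pub fn into_rc_string(s: String) -> (usize, Rc<String>) {
    let before = s.as_ptr() as usize;
    (before, Rc::new(s))
}
```

Comparing `before` with `rc.as_ptr() as usize` shows a new address for `Rc<str>` and the unchanged address for `Rc<String>`, at the price of one extra indirection on every access in the latter case.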
Actually, I'm currently trying some refactoring on `TermFactory`, which might address your concerns. Will probably push later this afternoon...
Done: the commit referenced above, as well as aff86514, make it easier to convert terms while limiting the number of (re-)allocations. This relies largely on combining `MownStr`/`MownTerm` with the `map` methods you introduced in `Term`, which were definitely a super good idea :)

Does that fit your needs?
In some situations intermediate `String`s are allocated and then transformed into `TermData`, e.g. normalization and resolving of IRIs.

Issue
I'm currently working on metis for an example for #55. I'd like to use `CowTerm` for my parser. This means that if an absolute IRI is parsed I have `Cow::Borrowed`, and I would like it to remain `Cow::Borrowed` after resolving against the base IRI. On the other hand, when a relative IRI is parsed, the `Cow::Borrowed` should be turned into `Cow::Owned` after resolution. This means that I have to know whether a `&str` comes from the original, to track lifetimes, or whether I get a newly allocated `String`.

Discussion of Solutions
Change `Resolve<Iri>` (and normalization) to:

This, however, prevents passing in a `TermFactory`. So maybe it would be nice to have another `Resolve.resolve_with()`:

This adds complexity and is maybe too much for the single use case of `Cow`, as this is not that important for other `TermData`. Besides, a default implementation should be possible.
What do you think? Is this case important enough to add such complexity?
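For illustration only, the borrowed/owned behaviour described under "Issue" could be sketched as below. This is not sophia's `Resolve` trait: `looks_absolute`, `resolve` and `resolve_with` are hypothetical free functions, the absoluteness check is deliberately crude, and plain concatenation stands in for real RFC 3986 reference resolution.

```rust
use std::borrow::Cow;

/// Crude stand-in check: treats anything starting with `scheme:` as absolute.
fn looks_absolute(iri: &str) -> bool {
    match iri.split_once(':') {
        Some((scheme, _)) => {
            !scheme.is_empty()
                && scheme
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || "+-.".contains(c))
        }
        None => false,
    }
}

/// Absolute IRIs stay borrowed from the input; relative ones are
/// resolved into a newly allocated `String`.
pub fn resolve<'a>(base: &str, iri: &'a str) -> Cow<'a, str> {
    if looks_absolute(iri) {
        Cow::Borrowed(iri)
    } else {
        // Naive concatenation, standing in for real resolution.
        Cow::Owned(format!("{}{}", base, iri))
    }
}

/// Factory-based variant, written on top of `resolve`, in the spirit of
/// a default implementation.
pub fn resolve_with<TD, F>(base: &str, iri: &str, mut factory: F) -> TD
where
    F: FnMut(&str) -> TD,
{
    factory(&*resolve(base, iri))
}
```

With this shape, a parser holding `&str` slices of its input keeps them borrowed for absolute IRIs and only pays for an allocation when a relative IRI is actually resolved.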