Closed mitsuhiko closed 1 year ago
One other thing is that the engine has a few places where that basic model does not work well:
Token::Str
holds a Cow
so that it can store unescaped stringsInstruction::Lookup
and Instruction::GetAttr
also get static strings at times&'source str
and with that change they would need to hold a String
instead since we could not adjacently borrowI think a model that could work is to have a StrCache<'source>
which is used everywhere where currently &'source str
is used to hold the source and then a few StrHandle
given out by the StrCache<'source>
. For as long as also for release builds an internal ID is given out that could be made safe without incurring bounds checks. For the few places where a &'static str
is placed there, I think the handle could be allowed to store references to static strings or specific instructions are used instead. For the the Cow
case that only exists in the Token
but never in Instruction
so that problem is much smaller in comparison. In the instructions it's already converted into a Value
which clones.
I made some small changes internally that address some of these issues. In d53160d3ff7f54e11696d804708a7e3a7e4b5c2b I removed static strings added to instructions for loop internals and in 6e60023395ad3b6fe20aec9b843a65dda8978c54 I removed the Cow
from the string token.
I had back out one of these due to performance regressions.
I have a new way now in which I think I want to go about this. The 'source
lifetime is quite useful for multiple scenarios so it does not need outright removing. What actually however is annoying is that you don't have a convenient way to add owned strings into the environment. This runs against a bunch of lifetime issues, but that is already true for the Source
feature too.
I think what I might be doing for MiniJinja 1.0 and to close off this issue is the following:
source
feature is enabled, the Engine always holds a Source
object internally to replace the template mapEnvironment::add_template
stays and borrowsEnvironment::add_owned_template
(name TDB) is does what today is env.source_mut().add_template(...)
Source
object internally.Since after you can add owned templates there is not a lot of reason to actually expose the Source
object. The main of the Source
object which at that point is internal is to set a loader:
Before:
let mut env = Environment::new();
env.set_source(Source::with_loader(|name| {
if name == "layout.html" {
Ok(Some("...".into()))
} else {
Ok(None)
}
}));
After:
let mut env = Environment::new();
env.set_loader(|name| {
if name == "layout.html" {
Ok(Some("...".into()))
} else {
Ok(None)
}
});
And for owned templates:
Before:
let mut env = Environment::new();
let mut source = Source::new();
source.add_template("index.html", "...").unwrap();
env.set_source(source);
After:
let mut env = Environment::new();
env.add_owned_template("index.html", "...").unwrap();
Today the engine holds
&'source str
all over the place in the instructions and by extension in the templates. This is a design pillar that I originally inherited from the initial API that was inspired by tinytemplate. It's a pretty clever setup but it has the disadvantage that the API cannot deal with dynamic templates well.The solution to this problem so far has been the
source
feature which introduces aSource
type which lets the engine lie about lifetimes through the use ofself-cell
andmemo-map
.There is a alternative that could be evaluated where these borrows to the source string are not taking place through the references but string handles. That would likely require significant changes in the lexer and potentially the parser as well, but the changes to the instructions are probably quite limited.
The way this could work is that the instructions that currently hold an
&'source str
would instead hold something like aStrHandle
which looks something like this:Obviously one complexity with this approach is that at no point the handles must be separated from the instructions they are contained in. It also comes at least through the safe interface with additional cost as Rust will require a lot more bounds checks.
unsafe
would work here, but actually making a safe interface where someone does not change the source string underneath the instructions accidentally could be quite tricky.The benefit of doing this would be that the engine ends up with less dependencies, overall less unsafe code and a nicer API.