chronotope / chrono

Date and time library for Rust
Other
3.22k stars 513 forks source link

Owned DelayedFormat struct #76

Closed 3Hren closed 7 years ago

3Hren commented 8 years ago

Hi!

Sometimes it's required to precompile some strftime pattern known at runtime to be repeatedly consumed in the future in conjunction with timestamp to produce formatted strings.

I see there is DelayedFormat, but it requires lifetime specifier in its Iterator type which makes it (seems like) impossible or quite tricky to save somewhere for further usage.

It would be great if you think about some kind of owned DelayedFormat struct or point me how to adapt current struct for this case.

lifthrasiir commented 8 years ago

DelayedFormat is a temporary object and should not be stored. A recommended way to store the items statically is via a static array of Item<'static>:

    pub fn to_rfc2822(&self) -> String {
        const ITEMS: &'static [Item<'static>] = &[Item::Fixed(Fixed::RFC2822)];
        self.format_with_items(ITEMS.iter().cloned()).to_string()
    }

(Incidentally, this is the actual implementation of DateTime::to_rfc2822.)

3Hren commented 8 years ago

Yes, I see that in the docs. Your example is for static strings, however I need to precompile format and store at runtime. Since we can't store buffer and compiled DelayedFormat in the same struct:

struct Foo {
    pattern: String, // <---------- The 'a is here or the whole struct, no matter, it isn't possible to compile in safe land.
    format: DelayedFormat<'a>, // 'a 
}

The only way is to implement it manually.

For example:

// No lifetime, can put it in the box of trait with no `+'a`.
trait Bar {}

struct Foo {
    pattern: DelayedFormatBuf,
}

impl Foo {
  fn new(patern: String) -> Foo {
    Foo {
      pattern: DelayedFormatBuf::compile(&pattern),
    } 
  }
}

impl Bar for Foo {}

It is useful when you need to keep traits in the Box without lifetime specifier.

lifthrasiir commented 8 years ago

@3Hren Oh, that makes sense. It would be solvable by make Item::{Literal, Space} accept a Cow<'a, str> instead of &'a str. I still doubt the usefulness of it, though, since the parsing of format string is rarely the slowest part of formatting (pluggable Item is mainly provided for allowing different types of format strings). Do you have a significant performance degradation from not having owned Items?

3Hren commented 8 years ago

Well, the following benchmarks shows that there isn't such difference as I thought:

#![feature(test)]

extern crate test;
extern crate chrono;

use std::io::Write;
use test::Bencher;
use chrono::UTC;

#[bench]
fn interpreter(b: &mut Bencher) {
    let now = UTC::now();
    let mut buf = Vec::with_capacity(128);

    let pattern = "%Y-%m-%d %H:%M:%S.%.6f".to_owned();
    b.iter(|| {
        write!(&mut buf, "{}", now.format(&pattern)).unwrap();
        buf.clear();
    });
}

#[bench]
fn compiler(b: &mut Bencher) {
    let now = UTC::now();
    let mut buf = Vec::with_capacity(128);

    let pattern = "%Y-%m-%d %H:%M:%S.%.6f".to_owned();
    let format = now.format(&pattern);

    b.iter(|| {
        write!(&mut buf, "{}", format).unwrap();
        buf.clear();
    });
}
test compiler    ... bench:         843 ns/iter (+/- 58)
test interpreter ... bench:         986 ns/iter (+/- 338)

This is strange, because my manual implementation for C++ gives:

datetime.strftime                   918 ns/iter
datetime.manual[real]               66 ns/iter

I need to investigate this behavior more... You're right, making this struct owned doesn't gives significant performance boost. You are free to close this issue, thanks!

3Hren commented 8 years ago

Ah, I see here that you reparse the given pattern each time iteration occurs, which seems unnecessary, because the specified pattern is immutable and parsing can be performed only once to generate tokens.

Something like this:

pattern(&'a str) -> &[Token<'a>].
enum Token<'a> {
    Literal(&str),
    YearYYYYPlaceholder, 
    YearYYPlaceholder,
    ...
    FractionalSecondsPlaceholder(usize),
}