Closed antscloud closed 11 months ago
This would be a good addition to this crate, a PR for this feature would be great! I have not made a comprehensive parser, but the following snippet might be of help:
use chrono::{offset, DateTime, Duration};

/// Parses CF time + duration strings (seconds/minutes/hours/days since ISODATE)
// TODO: return Result instead
fn get_time_from_str(timestr: &str) -> Option<(DateTime<offset::FixedOffset>, Duration)> {
    use nom::{
        branch::alt,
        bytes::complete::{tag, take, take_till},
        character::complete::{digit1, one_of},
        combinator::{all_consuming, map, map_opt, opt},
        number::complete::double,
        sequence::{pair, separated_pair, terminated, tuple},
        IResult,
    };
    // Parses the unit ("days", "hours", ...) followed by " since ".
    fn duration(input: &str) -> IResult<&str, Duration> {
        let till_space = take_till(|c| c == ' ');
        let dur = map_opt(till_space, |t: &str| match t {
            "days" | "day" | "d" => Some(Duration::days(1)),
            "hours" | "hour" | "h" => Some(Duration::hours(1)),
            "minutes" | "minute" | "min" => Some(Duration::minutes(1)),
            "seconds" | "second" | "sec" | "s" => Some(Duration::seconds(1)),
            _ => None,
        });
        let since = tag(" since ");
        terminated(dur, since)(input)
    }
    // Parses "YYYY-MM-DD hh:mm:ss[.fff]".
    fn ymd_hms(input: &str) -> IResult<&str, chrono::NaiveDateTime> {
        fn u32_num(input: &str) -> IResult<&str, u32> {
            map_opt(digit1, |s: &str| s.parse::<u32>().ok())(input)
        }
        fn ymd(input: &str) -> IResult<&str, chrono::NaiveDate> {
            let i32_parser = map_opt(digit1, |s: &str| s.parse::<i32>().ok());
            map_opt(
                tuple((i32_parser, tag("-"), u32_num, tag("-"), u32_num)),
                |(y, _, m, _, d)| chrono::NaiveDate::from_ymd_opt(y, m, d),
            )(input)
        }
        fn hms(input: &str) -> IResult<&str, chrono::NaiveTime> {
            map_opt(
                tuple((u32_num, tag(":"), u32_num, tag(":"), double)),
                |(hour, _, minute, _, second)| {
                    chrono::NaiveTime::from_hms_nano_opt(
                        hour,
                        minute,
                        second.trunc() as _,
                        (second.fract() * 1_000_000_000.0) as _,
                    )
                },
            )(input)
        }
        map(tuple((ymd, tag(" "), hms)), |(ymd, _, hms)| {
            chrono::NaiveDateTime::new(ymd, hms)
        })(input)
    }
    // Parses an optional " +hhmm" / " -hh:mm" offset; defaults to UTC.
    fn timezone(input: &str) -> IResult<&str, chrono::offset::FixedOffset> {
        fn twonum(input: &str) -> IResult<&str, i32> {
            map_opt(take(2usize), |s: &str| s.parse::<i32>().ok())(input)
        }
        let quad = pair(twonum, twonum);
        let colon_sep = separated_pair(
            map_opt(digit1, |x: &str| x.parse::<i32>().ok()),
            tag(":"),
            twonum,
        );
        let tz = map(
            tuple((tag(" "), opt(one_of("+-")), alt((quad, colon_sep)))),
            |(_, pm, (tz_h, tz_m))| {
                // Offset in seconds: hours * 3600 + minutes * 60.
                if let Some('-') = pm {
                    -(tz_h * 3600 + tz_m * 60)
                } else {
                    tz_h * 3600 + tz_m * 60
                }
            },
        );
        map(opt(tz), |tz| {
            chrono::offset::FixedOffset::east(tz.unwrap_or(0))
        })(input)
    }
    fn parse_line(
        input: &str,
    ) -> IResult<&str, (Duration, chrono::DateTime<chrono::offset::FixedOffset>)> {
        use chrono::offset::TimeZone;
        map_opt(tuple((duration, ymd_hms, timezone)), |(dur, time, tz)| {
            let tz: chrono::offset::FixedOffset = chrono::TimeZone::from_offset(&tz);
            let time = tz.from_local_datetime(&time);
            match time.single() {
                Some(x) => Some((dur, x)),
                _ => None,
            }
        })(input)
    }
    let mut parser = all_consuming(parse_line);
    parser(timestr).ok().map(|(_, x)| (x.1, x.0))
}
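For readers who just want to see the shape of the `units` string without pulling in nom, here is a minimal std-only sketch of the first step (splitting the unit from the reference date). `Unit` and `split_cf_units` are hypothetical names made up for this illustration; the reference date is returned unparsed, and the accepted unit spellings simply mirror the snippet above.

```rust
/// Time unit of a CF `units` attribute (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Unit {
    Days,
    Hours,
    Minutes,
    Seconds,
}

impl Unit {
    /// Length of one unit in seconds.
    pub fn seconds(self) -> i64 {
        match self {
            Unit::Days => 86_400,
            Unit::Hours => 3_600,
            Unit::Minutes => 60,
            Unit::Seconds => 1,
        }
    }
}

/// Splits "<unit> since <reference>" into the unit and the untouched
/// reference string. Returns `None` if the shape or the unit is unknown.
pub fn split_cf_units(units: &str) -> Option<(Unit, &str)> {
    let (unit, reference) = units.trim().split_once(" since ")?;
    let unit = match unit.trim() {
        "days" | "day" | "d" => Unit::Days,
        "hours" | "hour" | "h" => Unit::Hours,
        "minutes" | "minute" | "min" => Unit::Minutes,
        "seconds" | "second" | "sec" | "s" => Unit::Seconds,
        _ => return None,
    };
    Some((unit, reference.trim()))
}
```

Calling `split_cf_units("days since 1900-01-01 00:00:00")` yields the `Days` unit and the string `1900-01-01 00:00:00`, which would then go to a date parser.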
Thank you for your snippet, it'll help :+1:
I am new to Rust, I'll try to write something when I have time. If I do, I'll do a PR
I was thinking, maybe it would be easier, as a first step, to add a binding to either the Python package (as written in CPython) or the C UDunits package. What do you think?
Wrapping Python is not trivial. Udunits might be feasible, but it is a big library and could be a pain to use compared to rolling our own parser. The following is how we could implement the parser if the iso8601 crate exposed its nom parser:
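use nom::{
    branch::alt,
    bytes::complete::tag,
    character::complete::space1,
    combinator::{all_consuming, map},
    sequence::{separated_pair, tuple},
    IResult,
};

pub enum Duration {
    Days,
    Hours,
    Minutes,
    Seconds,
}
fn duration(input: &str) -> IResult<&str, Duration> {
    // Longer spellings come first so that e.g. "days" is not cut short at "d".
    let days = map(alt((tag("days"), tag("day"), tag("d"))), |_: &str| {
        Duration::Days
    });
    let hours = map(alt((tag("hours"), tag("hour"), tag("h"))), |_: &str| {
        Duration::Hours
    });
    let minutes = map(
        alt((tag("minutes"), tag("minute"), tag("min"))),
        |_: &str| Duration::Minutes,
    );
    let seconds = map(
        alt((tag("seconds"), tag("second"), tag("sec"), tag("s"))),
        |_: &str| Duration::Seconds,
    );
    alt((days, hours, minutes, seconds))(input)
}
// `iso8601::datetime` and `DateTime` stand in for a nom-compatible parser
// and its output type, which the iso8601 crate would have to expose.
fn cf_parser(
    input: &str,
) -> IResult<&str, (Duration, DateTime)> {
    let since = tuple((space1, tag("since"), space1));
    all_consuming(separated_pair(duration, since, iso8601::datetime))(input)
}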
Comparing times, however, is the hard part: it depends on the calendar and could add a lot of complexity. I'm not sure how much of that we should leave to the user.
Your snippets work great for parsing!
The major difficulty will be handling the different calendars (in addition to comparing them).
Since both the chrono and time crates use the proleptic Gregorian calendar, it seems we can't use them to handle the other calendars.
For example, with the all_leap calendar we can't define 29 February 2022, since the constructor functions panic on that date.
Even if we found a workaround, we would still need to reimplement some traits, such as the Add trait, for each of these calendars.
Maybe something like this:
use std::ops::Add;

pub struct Date {
    year: u32,
    month: i8,
    day: i8,
}
pub struct Time {
    hour: i8,
    minute: i8,
    second: i8,
}
pub struct DateTime {
    date: Date,
    time: Time,
    offset: time::Duration,
}
pub struct DatetimeNoLeap {
    datetime: DateTime,
}
impl Add for DatetimeNoLeap {
    type Output = DatetimeNoLeap;
    fn add(self, other: DatetimeNoLeap) -> DatetimeNoLeap {
        // Implementation
        todo!()
    }
}
pub struct Datetime360Days {
    datetime: DateTime,
}
pub struct DatetimeJulian {
    datetime: DateTime,
}
pub struct CFDatetime {
    from: DateTime,
    duration: time::Duration,
}
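To make the all_leap problem concrete, here is a minimal std-only sketch of per-calendar date validation. The names `Calendar`, `days_in_month` and `is_valid_date` are hypothetical and only cover four of the CF calendars; this is an illustration under those assumptions, not an API of chrono, time or netcdf.

```rust
/// CF calendars distinguished in this sketch (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Calendar {
    ProlepticGregorian,
    NoLeap,
    AllLeap,
    Day360,
}

/// Gregorian leap-year rule, applied proleptically to every year.
fn is_gregorian_leap(year: i64) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in `month` (1..=12) for the given calendar and year.
pub fn days_in_month(cal: Calendar, year: i64, month: u8) -> Option<u8> {
    if !(1..=12).contains(&month) {
        return None;
    }
    if cal == Calendar::Day360 {
        // Every month has exactly 30 days.
        return Some(30);
    }
    let leap = match cal {
        Calendar::ProlepticGregorian => is_gregorian_leap(year),
        Calendar::AllLeap => true,
        _ => false,
    };
    Some(match month {
        2 => if leap { 29 } else { 28 },
        4 | 6 | 9 | 11 => 30,
        _ => 31,
    })
}

/// True if year-month-day exists in the given calendar.
pub fn is_valid_date(cal: Calendar, year: i64, month: u8, day: u8) -> bool {
    match days_in_month(cal, year, month) {
        Some(n) => day >= 1 && day <= n,
        None => false,
    }
}
```

With this, `is_valid_date(Calendar::AllLeap, 2022, 2, 29)` is true while the same date is rejected for `Calendar::ProlepticGregorian`, which is exactly the case a proleptic Gregorian date type cannot represent.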
We will have to define the calendars ourselves with the correct impl for Add<Duration> and friends. This might be a lot of work, is there a MWP we could strive for?
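As a rough idea of what such an impl could look like, here is a sketch of `Add<Duration>` for the 360_day calendar, where the arithmetic is simplest. `DateTime360Day` is a hypothetical type, `std::time::Duration` is used only to keep the example dependency-free (so only non-negative durations are supported), and sub-second precision is dropped.

```rust
use std::ops::Add;
use std::time::Duration;

/// Date-time in the CF `360_day` calendar (12 months of 30 days).
/// Field order makes the derived ordering chronological.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct DateTime360Day {
    pub year: i64,
    pub month: u8, // 1..=12
    pub day: u8,   // 1..=30
    pub hour: u8,
    pub minute: u8,
    pub second: u8,
}

const SECS_PER_DAY: i64 = 86_400;
const DAYS_PER_YEAR: i64 = 360;

impl DateTime360Day {
    /// Seconds since 0000-01-01 00:00:00 in this calendar.
    fn to_seconds(self) -> i64 {
        let days = self.year * DAYS_PER_YEAR
            + (self.month as i64 - 1) * 30
            + (self.day as i64 - 1);
        days * SECS_PER_DAY
            + self.hour as i64 * 3_600
            + self.minute as i64 * 60
            + self.second as i64
    }

    /// Inverse of `to_seconds`.
    fn from_seconds(total: i64) -> Self {
        let days = total.div_euclid(SECS_PER_DAY);
        let secs = total.rem_euclid(SECS_PER_DAY);
        let day_of_year = days.rem_euclid(DAYS_PER_YEAR);
        DateTime360Day {
            year: days.div_euclid(DAYS_PER_YEAR),
            month: (day_of_year / 30 + 1) as u8,
            day: (day_of_year % 30 + 1) as u8,
            hour: (secs / 3_600) as u8,
            minute: (secs % 3_600 / 60) as u8,
            second: (secs % 60) as u8,
        }
    }
}

impl Add<Duration> for DateTime360Day {
    type Output = DateTime360Day;

    fn add(self, rhs: Duration) -> Self::Output {
        // Whole seconds only; the fractional part of `rhs` is ignored.
        DateTime360Day::from_seconds(self.to_seconds() + rhs.as_secs() as i64)
    }
}
```

Adding one day to 2000-02-30 gives 2000-03-01 in this calendar. Because the fields are declared from most to least significant, the derived `Ord` also gives comparison within a single calendar for free; comparing across calendars is a separate question.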
Definitely feasible, but not easy. I'm sorry, what is an MWP?
Sorry, MWP should have been MVP, minimum viable product. Would be great to see how the user could interpret a time array in a file using CF-conventions to get the correct times
Thank you :+1:
I started trying out the parser snippet and came up with a small working piece of code that converts an array of integers to datetimes. I wrote it with the Julia API in mind, i.e. with high-level API functions decode_cftime and encode_cf_time. Some of the code is unused, because I realized that neither chrono nor time can handle the other calendars.
use chrono::*;
use nom::{
    branch::alt,
    bytes::complete::{tag, take, take_till},
    character::complete::{digit1, one_of},
    combinator::{all_consuming, map, map_opt, opt},
    number::complete::double,
    sequence::{pair, separated_pair, terminated, tuple},
    IResult,
};
fn get_time_from_str(timestr: &str) -> Option<(DateTime<chrono::FixedOffset>, Duration)> {
    fn duration(input: &str) -> IResult<&str, Duration> {
        let till_space = take_till(|c| c == ' ');
        let dur = map_opt(till_space, |t: &str| match t {
            "days" | "day" | "d" => Some(Duration::days(1)),
            "hours" | "hour" | "h" => Some(Duration::hours(1)),
            "minutes" | "minute" | "min" => Some(Duration::minutes(1)),
            "seconds" | "second" | "sec" | "s" => Some(Duration::seconds(1)),
            _ => None,
        });
        let since = tag(" since ");
        terminated(dur, since)(input)
    }
    fn ymd_hms(input: &str) -> IResult<&str, chrono::NaiveDateTime> {
        fn u32_num(input: &str) -> IResult<&str, u32> {
            map_opt(digit1, |s: &str| s.parse::<u32>().ok())(input)
        }
        fn ymd(input: &str) -> IResult<&str, chrono::NaiveDate> {
            let i32_parser = map_opt(digit1, |s: &str| s.parse::<i32>().ok());
            map_opt(
                tuple((i32_parser, tag("-"), u32_num, tag("-"), u32_num)),
                |(y, _, m, _, d)| chrono::NaiveDate::from_ymd_opt(y, m, d),
            )(input)
        }
        fn hms(input: &str) -> IResult<&str, chrono::NaiveTime> {
            map_opt(
                tuple((u32_num, tag(":"), u32_num, tag(":"), double)),
                |(hour, _, minute, _, second)| {
                    chrono::NaiveTime::from_hms_nano_opt(
                        hour,
                        minute,
                        second.trunc() as _,
                        (second.fract() * 1_000_000_000.0) as _,
                    )
                },
            )(input)
        }
        map(tuple((ymd, tag(" "), hms)), |(ymd, _, hms)| {
            chrono::NaiveDateTime::new(ymd, hms)
        })(input)
    }
    fn timezone(input: &str) -> IResult<&str, chrono::FixedOffset> {
        fn twonum(input: &str) -> IResult<&str, i32> {
            map_opt(take(2usize), |s: &str| s.parse::<i32>().ok())(input)
        }
        let quad = pair(twonum, twonum);
        let colon_sep = separated_pair(
            map_opt(digit1, |x: &str| x.parse::<i32>().ok()),
            tag(":"),
            twonum,
        );
        let tz = map(
            tuple((tag(" "), opt(one_of("+-")), alt((quad, colon_sep)))),
            |(_, pm, (tz_h, tz_m))| {
                // Offset in seconds: hours * 3600 + minutes * 60.
                if let Some('-') = pm {
                    -(tz_h * 3600 + tz_m * 60)
                } else {
                    tz_h * 3600 + tz_m * 60
                }
            },
        );
        map(opt(tz), |tz| chrono::FixedOffset::east(tz.unwrap_or(0)))(input)
    }
    fn parse_line(input: &str) -> IResult<&str, (Duration, chrono::DateTime<chrono::FixedOffset>)> {
        map_opt(tuple((duration, ymd_hms, timezone)), |(dur, time, tz)| {
            let tz: chrono::FixedOffset = chrono::TimeZone::from_offset(&tz);
            let time = tz.from_local_datetime(&time);
            match time.single() {
                Some(x) => Some((dur, x)),
                _ => None,
            }
        })(input)
    }
    let mut parser = all_consuming(parse_line);
    parser(timestr).ok().map(|(_, x)| (x.1, x.0))
}
// Maps the CF `calendar` attribute to an enum; unknown names fall back to standard.
fn dispatch_calendar(calendar: &str) -> Calendars {
    match calendar {
        "standard" | "gregorian" => Calendars::CalendarStandard,
        "proleptic_gregorian" => Calendars::CalendarProlepticGregorian,
        "360_day" => Calendars::Calendar360Day,
        "julian" => Calendars::CalendarJulian,
        // The CF conventions spell this calendar "noleap".
        "noleap" => Calendars::CalendarNoLeap,
        "365_day" => Calendars::Calendar365Day,
        "all_leap" => Calendars::CalendarAllLeap,
        "366_day" => Calendars::Calendar366Day,
        _ => Calendars::CalendarStandard,
    }
}
enum Calendars {
    CalendarStandard,
    CalendarProlepticGregorian,
    Calendar360Day,
    CalendarJulian,
    CalendarNoLeap,
    Calendar365Day,
    CalendarAllLeap,
    Calendar366Day,
}
struct CFUnitsDateTime {
    from: DateTime<chrono::FixedOffset>,
    duration: chrono::Duration,
    calendar: Calendars,
}
trait CFDateTimeEncoder {
    fn encode(&self, datetime: CFUnitsDateTime);
}
trait CFDateTimeDecoder {
    fn decode(&self, value: i32) -> DateTime<FixedOffset>;
}
impl CFDateTimeDecoder for CFUnitsDateTime {
    // Note: the calendar is not used yet, chrono always applies proleptic Gregorian rules.
    fn decode(&self, value: i32) -> DateTime<FixedOffset> {
        let ms: f64 = (self.duration.num_milliseconds() as f64) * (value as f64);
        self.from + chrono::Duration::milliseconds(ms as i64)
    }
}
fn decode_cftime(
    input_str: &str,
    time_values: Vec<i32>,
    calendar: Calendars,
) -> Vec<DateTime<FixedOffset>> {
    let (date, dur) = get_time_from_str(input_str).unwrap();
    let cfdatetime = CFUnitsDateTime {
        from: date,
        duration: dur,
        // Use the caller's calendar rather than a hard-coded one.
        calendar,
    };
    time_values
        .into_iter()
        .map(|v| cfdatetime.decode(v))
        .collect()
}
fn main() {
    decode_cftime(
        "days since 1900-01-01 00:00:00",
        (1..10_000).collect(),
        Calendars::CalendarStandard,
    );
}
I guess we need to implement datetime structures for each of the calendars (DatetimeNoLeap, Datetime360Day and so on) with the basic traits (Add, Sub, ...).
We may need three traits on top of the datetime structures:
Then we could provide simple functions to convert an array of numbers to an array of datetimes and back.
What do you think?
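To give an idea of what those conversion functions could look like for one non-Gregorian calendar, here is a minimal std-only sketch for the noleap calendar, restricted to whole days. `DateNoLeap`, `decode_days` and `encode_days` are hypothetical names chosen for this illustration, and time of day and time zones are left out on purpose.

```rust
/// Cumulative days before each month in a 365-day (no-leap) year.
const CUM_DAYS: [i64; 13] = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365];

/// Calendar date in the CF `noleap` calendar (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DateNoLeap {
    pub year: i64,
    pub month: u8, // 1..=12
    pub day: u8,   // 1..=days in month
}

impl DateNoLeap {
    /// Days since 0000-01-01 in the no-leap calendar.
    fn to_ordinal(self) -> i64 {
        self.year * 365 + CUM_DAYS[self.month as usize - 1] + (self.day as i64 - 1)
    }

    /// Inverse of `to_ordinal`.
    fn from_ordinal(ordinal: i64) -> Self {
        let year = ordinal.div_euclid(365);
        let doy = ordinal.rem_euclid(365);
        // Last month whose first day is on or before `doy`;
        // month 1 always matches, so the unwrap cannot fail.
        let month = (1..=12usize)
            .rev()
            .find(|&m| CUM_DAYS[m - 1] <= doy)
            .unwrap();
        DateNoLeap {
            year,
            month: month as u8,
            day: (doy - CUM_DAYS[month - 1] + 1) as u8,
        }
    }
}

/// Decodes "days since <reference>" values into no-leap dates.
pub fn decode_days(reference: DateNoLeap, values: &[i64]) -> Vec<DateNoLeap> {
    let base = reference.to_ordinal();
    values.iter().map(|&v| DateNoLeap::from_ordinal(base + v)).collect()
}

/// Encodes no-leap dates as "days since <reference>" values.
pub fn encode_days(reference: DateNoLeap, dates: &[DateNoLeap]) -> Vec<i64> {
    let base = reference.to_ordinal();
    dates.iter().map(|d| d.to_ordinal() - base).collect()
}
```

In this calendar, one day after 2024-02-28 is 2024-03-01, whereas a proleptic Gregorian type would give 2024-02-29. The same ordinal-based pattern should carry over to the other fixed-length calendars by swapping the month table.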
I am starting to like this design. Maybe we can start creating a PR for this and iterate some designs? Could make it easier to do a code review. I refined the snippet I wrote above, adding tests and removing chrono in the process:
Wow such a snippet in no time :exploding_head:
I can make a PR with this code if you want. What about creating a library (subfolder) with cargo new --lib cftime in the root, in case the code grows too big and needs a separate crate?
I had it laying around, but never got to doing anything with it :D Feel free to create a new crate and make a PR. This makes sense as a separate crate as the utilities are orthogonal to reading and writing netCDF
An implementation of cftime can be found in this crate: https://github.com/antscloud/cftime-rs
Hello, thank you for the work on this project !
By any chance, is there a plan to add CF time attribute reading/parsing to handle the datetime type? I can't find anything in the docs or on crates.io. I am thinking of something like Julia CFtime or Python CFtime.
I believe it would be a great feature for the geophysical field