Open baeseongsu opened 3 weeks ago
For reference, this is the previous (customized) datetime function for the sqlglot package, which was used in the arXiv (v2) paper:
def datetime_fn(arg, fmt=None):
    """
    Custom implementation of datetime function.
    """
    if isinstance(arg, str):
        arg = datetime.datetime.fromisoformat(arg)
    if fmt is None:
        return arg
    if fmt == "start of month":
        return datetime.datetime(arg.year, arg.month, 1)
    elif fmt == "start of year":
        return datetime.datetime(arg.year, 1, 1)
    elif fmt == "start of day":
        return datetime.datetime(arg.year, arg.month, arg.day)
    else:
        number, unit = fmt.split(" ")
        if unit == "day":
            return arg + datetime.timedelta(days=int(number))
        elif unit == "month":
            return arg + datetime.timedelta(days=int(number) * 30)
        elif unit == "year":
            return arg + datetime.timedelta(days=int(number) * 365)
        else:
            raise NotImplementedError(f"Unsupported unit '{unit}'.")
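The "month" and "year" branches above use fixed lengths of 30 and 365 days, so their results can drift from SQLite's own calendar arithmetic. The following is a minimal sketch of that difference using Python's built-in sqlite3 module; the sample date and the variable names are illustrative assumptions, not taken from the repository.

```python
import datetime
import sqlite3

start = datetime.datetime(2024, 1, 31)

# Fixed-length approximation, mirroring the "month" branch of datetime_fn.
approx = start + datetime.timedelta(days=1 * 30)

# SQLite's own handling of the equivalent '+1 month' modifier.
conn = sqlite3.connect(":memory:")
sqlite_text = conn.execute(
    "SELECT datetime(?, '+1 month')", ("2024-01-31",)
).fetchone()[0]
sqlite_result = datetime.datetime.fromisoformat(sqlite_text)

print("30-day approximation:", approx)
print("SQLite '+1 month':   ", sqlite_result)
```

For this input the two results should differ by one day, because SQLite first moves to the same day of the next month and then normalizes the overflow, while the approximation simply adds 30 days.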
Note that in SQLite3, the strftime function has two format specifiers, %J and %j, which differ only in capitalization. The %J specifier returns the Julian day number as a floating-point value. For example, strftime('%J', '2024-08-27') returns 2460549.5, the Julian day number for midnight at the start of that date.
On the other hand, %j returns the day of the year as a three-digit number, ranging from 001 to 366 (taking leap years into account). Because %j resets at every year boundary, it cannot measure a gap that spans two different years. Therefore, to accurately compute the time gap between two different dates, you should use the uppercase format specifier %J.
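The difference between the two specifiers can be checked with Python's built-in sqlite3 module. This is only an illustrative sketch: the helper name day_gap and the sample dates are assumptions and are not part of the repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def day_gap(spec, later, earlier):
    """Subtract two strftime() results computed with the given specifier.

    `spec` is interpolated into the SQL text, so it should only ever be
    a trusted literal such as '%J' or '%j'.
    """
    query = f"SELECT strftime('{spec}', ?) - strftime('{spec}', ?)"
    return conn.execute(query, (later, earlier)).fetchone()[0]


# Two consecutive days on either side of a year boundary.
julian_gap = day_gap("%J", "2024-01-01", "2023-12-31")
day_of_year_gap = day_gap("%j", "2024-01-01", "2023-12-31")

print("gap with %J:", julian_gap)
print("gap with %j:", day_of_year_gap)
```

With %J the gap comes out as one day, while %j subtracts day 365 of 2023 from day 001 of 2024 and yields a large negative number.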
There is a potential error in the code below when the month is 12:
https://github.com/baeseongsu/ehrxqa/blob/724bff13a9a2e430ecd54f32e3ef3789bb7fcdb3/experiment/NeuralSQL/executor/sqlglot/executor/env.py#L238-L242
In particular, this part can raise an error:
The corrected version should be: