Open JaseRandall opened 3 days ago
Ok, been doing more testing and realised I could probably just use:
var value = reader.GetValue(i);
var dateTime = DateTime.Parse(value.ToString());
var timeSpan = new TimeSpan(dateTime.Ticks);
However, I have now found a different issue with this code: (Actually, probably the original issue that led me down a rabbit hole.)
static internal bool TryGetDate(ExcelFormat fmt, double value, DateMode mode, out DateTime dt)
{
dt = DateTime.MinValue;
DateTime epoch = Epoch1904;
// Excel doesn't render negative values as dates.
if (value < 0.0)
return false;
if (mode == DateMode.Mode1900)
{
epoch = Epoch1900;
if (value < 61d)
{
if (value < 1)
{
if (fmt.Kind == FormatKind.Time)
{
dt = DateTime.MinValue.AddDays(value);
return true;
}
// 0 is rendered as 1900-1-0, which is nonsense.
// negative values render as "###"
// so we won't support accessing such values.
return false;
}
if (value >= 60d)
{
// 1900 wasn't a leapyear, but Excel thinks it was
// values in this range are in-expressible as .NET dates
// Excel renders it as 1900-2-29 (not a real day)
return false;
}
value += 1;
}
}
dt = epoch.AddDays(value);
return true;
}
When reading a value out of Excel such as 2:58:48 with cell formating of [h]:mm:ss, it gets converted to a DateTime value of 1/01/0001 2:58:48 AM but when converting a value of 720:00:00 with a cell format of [h]:mm:ss, it gets converted to 30/01/1900 12:00:00 AM.
It's because it passes through epoch.AddDays(value);
without looking at the fmt to see that it's a [h]:mm:ss (Time) format and therefore should be converted via this code: dt = DateTime.MinValue.AddDays(value);
which would correctly convert it to: 31/01/0001 12:00:00 AM which is correctly converted into a TimeSpan of 720 hours.
If you want a timespan, then you should ask for one, and use GetTimeSpan(int ordinal)
instead of GetValue(int ordinal)
. GetValue
uses the provided schema to determine what type of value to return, and with the default schema everything is treated as a string. I think you're right that there could be an improvement with GetValue: it should return a timespan formatted value when the format of the numeric is identified as a time value. However, I think you should just call GetTimeSpan, or provide a schema if you know the "shape" of the data ahead of time.
I think it makes sense to fix this behavior. When accessing a value formatted as a time of day, it should be returning a string that represents the timespan. I've implemented this fix to be included in 0.4.16. Please try the prerelease build and verify that it works the way you expected it to: https://www.nuget.org/packages/Sylvan.Data.Excel/0.4.26-b0001
Thanks! but it's not yet quite how I would like it.
Correctly returning as 00:17:43
Incorrectly returning as 1900-01-02T00:17:43
If I didn't know the schema in advance, I would prefer to receive values that reflect the display value rather than a DateTime value that would appear if the data had of been formatted as a DateTime format.
For some context, my use case is to read an excel file into a DataTable then display in a DataGrid. I understand there might be a question of why not just open it in Excel to view but Excel has quirks that result in it modifying data so my preference is a custom data viewer but that relies on values being read in a manner that is an accurate representation of the values.
(The quirks in Excel include the automatic conversion of large numbers to scientific notation, truncates number type values with more than 15 digits to 15 digits and in some cases changes dd/mm/yyyy format values to mm/dd/yyyy values with no ability to change back via display formatting.)
I have found an issue and a potential solution:
I have an excel worksheet with a value of 0, formatted as Custom "[h]:mm:ss" displaying "0:00:00". When I use the code below, the value is returned as "0001-01-01", which is obviously not the desired value.
Within ExcelDataReader, you have:
which leads to:
So, for case FormatKind.Time, what we really need to do is:
As can be seen, we will also need to add a
string ParsingFormat { get; set; }
property to the ExcelFormat class and overload the constructors like:we would also need to modify the following code in the static constructor of ExcelFormat:
So, the question becomes, is there a workaround that you can suggest without modifying the ExcelFormat code or would the proposed solution fit and therefore should I work out and write some tests and submit a pull request?