CompuMasterGmbH / CompuMaster.Data.XlsEpplus

Read datatables from/write to Microsoft Excel files (XLSX)
Other
0 stars 0 forks source link

DateTime detection, better results on reading data #1

Open JochenHWezel opened 8 years ago

JochenHWezel commented 8 years ago

DateTime detection routine might be improved more culture independent if comparing Range.Value vs. Range.Value2 as described at https://fastexcel.wordpress.com/2011/11/30/text-vs-value-vs-value2-slow-text-and-how-to-avoid-it/

Range.Value

This mostly gets the underlying value from the cell.

But if the cell is formatted as a date or currency then Range.Value converts the underlying value into a VBA date or currency variable before passing it to the VBA variable used in the assignment statement. Since VBA currency only has 4 decimal places this can cause a loss of accuracy. Suppose cell G4 has a value of 123.456789 formatted as currency. Then Range(“g4”).Value will retrieve a value of 123.4568 rather than the actual value! If you assign this to a Variant you get a variant with a subtype of currency, but if you assign it to a Double then the value first gets converted to currency datatype and truncated, and then the currency datatype gets converted to a double.

Maybe using .Value to retrieve cells formatted as dates into a variant is useful because then you can use IsDate() to detect that the cell was a date.

Range.Value is an efficient way of transferring a range of cell values to VBA because you can assign the Range to a variant variable and the you get a variant containing a 2-dimensional array of variants. This is much more efficient that looping on the cells one-by-one.

.Value is (unfortunately) the default property of the Range object.

Range.Value2

This works the same way as Range.Value, except that it does not check the cell format and convert to Date or Currency. And thats probably why its faster than .Value when retrieveing numbers.

jochenwezel commented 8 years ago

It had to be considered at method

 Private Shared Function LookupDotNetType([object] As Object) As VariantType
            If [object] Is Nothing Then
                Return VariantType.Empty
            Else
                Select Case [object].GetType
                    Case GetType(String)
                        Return VariantType.String
                    Case GetType(Double)
                        Return VariantType.Double
                    Case GetType(Boolean)
                        Return VariantType.Boolean
                    Case GetType(DateTime)
                        Return VariantType.Date
                    Case Else
                        Return VariantType.Object
                End Select
            End If
        End Function

The current implementation never comes up to the line of Return VariantType.Date. Unfortunately, Epplus library doesn't provide Value2 at this point to do comparison and datetime detection as suggested.

If Value2 would be provided, the method interface of LookupDotNetType would need a 2nd parameter for the value2 and a little bit of additional coding.

Maybe it's worthy to file a feature request at the Epplus project for supporting detection of datetime and currency cells as described.