JoshClose / CsvHelper

Library to help reading and writing CSV files
http://joshclose.github.io/CsvHelper/
Other
4.71k stars 1.06k forks source link

.TypeConverterOption.NullValues not composable with TypeConverter<T> #1314

Open jzabroski opened 5 years ago

jzabroski commented 5 years ago
 Map(m => m.FeeInPercent)
                .Name("Loan Origination Fee (in percent)")
                .TypeConverterOption.NullValues("NA", " NA", "NA ", " NA ")
                .TypeConverter<PercentSignDecimalTypeConverter>();

This doesn't work because the moment you use a TypeConverter, you cant apply TypeConverterOption stuff and have it work.

If this isn't planned to be supported, the builder syntax should disallow this as a logical sequence of build events.

JoshClose commented 4 years ago

The reason this happens is because the type converter is the place that those null values are looked at. Those values you set in NullValues is passed into the type converter. Value types don't look at them at all because they can't be null. I believe the only two converters that are for reference types are NullableConverter and StringConverter. Both of them have this code in it:

foreach( var nullValue in memberMapData.TypeConverterOptions.NullValues )
{
    if( text == nullValue )
    {
        return null;
    }
}

It might be more appropriate to put a virtual protected method on DefaultConverter so a custom converter can just use that.

Do you think that would be a good solution for this?

jzabroski commented 4 years ago

@JoshClose I am not clear on what the pseudo-code for that solution would look like. Also, there is no DefaultConverter symbol anywhere in CsvHelper, so I'm assuming you're referrring to DefaultTypeConverter, but the methods on that class are already virtual, so I'm not sure what you're proposing.

I also am not convinced you're hitting on the root problem here. Here are some principles I think that may serve the usability of this library very well:

  1. NullValues should only concern itself with the range of values that should be regarded as null.
  2. .TypeConverter<T>() and .TypeConverterOptions.NullValues(params string[] nullValues) should be composable.
  3. NullValues has nothing to do with the target type. It has everything to do with the source type. It should therefore always operate before the concrete TypeConverter is called.
  4. If you really want to do it your way, consider forking NullValues into NullInputValues and NullOutputValues. This eliminates the combinatorial explosion in lines of code a developer has to write to massage CSV data into a null format, and makes the massaging logic more discoverable by putting it in the ClassMap (I love ClassMap - such a brilliant idea and not sure if you're aware, but you've inspired me and other authors, such as the author of OpenSpreadsheet, to do the same kind of fluent interface.)

This is desireable for several reasons:

  1. NullValues participates in TypeConverterOptionsCache
  2. The fluent syntax should ideally read as a sentence: "First transform null values, then, with the result, apply my type converter." Similar to the T-SQL idiom: SET @ShortDate = CAST(NULLIF(dateValue, '1/1/1900') AS DATE); - This idiom first determines if dateValue "is null", and then does the type conversion.
JoshClose commented 4 years ago

What I mean is adding this method to the DefaultTypeConverter.

protected virtual bool IsNull(string text, MemberMapData memberMapData)
{
    if (string.IsNullOrEmpty(text))
    {
        return true;
    }

    foreach (var nullValue in memberMapData.TypeConverterOptions.NullValues)
    {
        if (text == nullValue)
        {
            return true;
        }
    }

    return false;
}

Then an implementation could just call IsNull(text, memberMapData) if they wanted to check for null.

I see your point though. Having the null check happen before the type converter makes sense. Does the null value then get passed into the type converter? I think it would have to. This might get you into the weird situation of passing a null into a value type and it not being able to convert it. The error message would say the text is null instead of "NA" then.

JoshClose commented 4 years ago

Maybe the DefaultTypeConverter could check for null and return default(memberMapData.Member.MemberType()) too.

jzabroski commented 4 years ago

The more I think about it, I think you should re-write the way DefaultTypeConverter works. Here is how I would approach it if I never knew CsvHelper existed, based on my experience years ago writing WPF Data Binding ValueConverters.

public class UniversalTypeConverter : ITypeConverter
{
    /// <summary>
    /// Converts the string to an object.
    /// </summary>
    /// <param name="text">The string to convert to an object.</param>
    /// <param name="row">The <see cref="IReaderRow"/> for the current record.</param>
    /// <param name="memberMapData">The <see cref="MemberMapData"/> for the member being created.</param>
    /// <returns>The object created from the string.</returns>
    public virtual object ConvertFromString(string text, IReaderRow row, MemberMapData memberMapData)
    {
        var targetType = memberMapData.Member?.MemberType();
        // obtain the conveter for the target type
        TypeConverter converter = TypeDescriptor.GetConverter(targetType);

        try
        {
            // determine if the supplied value is of a suitable type
            if (converter.CanConvertFrom(typeof(string))
            {
                // return the converted value
                return converter.ConvertFrom(text);
            }
            else
            {
                // throw the exception returned by the TypeConverter
                throw converter.GetConvertFromException(text);
            }
        }
        catch (Exception ex)
        {
            var message =
            $"The conversion cannot be performed.\r\n" +
            $"    Text: '{text}'\r\n" +
            $"    MemberType: {memberMapData.Member?.MemberType().FullName}\r\n" +
            $"    TypeConverter: '{memberMapData.TypeConverter?.GetType().FullName}'";
            throw new TypeConverterException(this, memberMapData, text, (ReadingContext)row.Context, message);
        }
    }

    /// <summary>
    /// Converts the object to a string.
    /// </summary>
    /// <param name="value">The object to convert to a string.</param>
    /// <param name="row">The <see cref="IWriterRow"/> for the current record.</param>
    /// <param name="memberMapData">The <see cref="MemberMapData"/> for the member being written.</param>
    /// <returns>The string representation of the object.</returns>
    public virtual string ConvertToString(object value, IWriterRow row, MemberMapData memberMapData)
    {
        if (value == null)
        {
            return string.Empty;
        }

        if (value is IFormattable formattable)
        {
            var format = memberMapData.TypeConverterOptions.Formats?.FirstOrDefault();
            return formattable.ToString(format, memberMapData.TypeConverterOptions.CultureInfo);
        }

        return value.ToString();
    }
}
JoshClose commented 4 years ago

Are you suggesting the UniversalTypeConverter replace most of the existing converters or just having TypeDescriptor used as a backup?

JoshClose commented 4 years ago

What is converter.GetConvertFromException(text)? I'm not seeing that on System.ComponentModel.TypeDescriptor. Is it an extension method that I would write?

jzabroski commented 4 years ago

https://docs.microsoft.com/en-us/dotnet/api/system.componentmodel.typeconverter.getconvertfromexception?view=netcore-3.0

jzabroski commented 4 years ago

I'll follow up in more detail when I'm not busy. I think UniversalTypeConverter can probably replace the defaultTypeConverter. Not sure on best way to let people opt-in without breaking them - not thinking at that level yet.

JoshClose commented 4 years ago

There are things that would be missing like formats, number styles, etc. I don't see a way to use those when doing a System.ComponentModel.TypeConverter.