Closed BlackMael closed 8 years ago
There is no way to do this. With a CSV file, there is no difference; it's just an empty field.
If you have control over the export, you could write a special value that means null, and when reading, look for that value and make it null.
Shouldn't a field with no content represent a null? I don't see why field 1 and 2 in my example above is not the perfect example of how a null string is differentiated from an empty string. In a similar manner as indicating embedded quotes and line feeds. Basically for each field parsed, I need to know if the raw field was quoted. Then if the field value is an empty string and it wasn't quoted then it is null. If it was quoted, then it is an empty string. Seems logical to me at least. Using a magic string like "NULL" or the like just seems like a bad idea.
No, a field with no content is just an empty string. Having quotes around a field doesn't mean anything. It just says that there may be things in the field like a quote, newline, or delimiter. Quotes are optional.
Changing this behavior would be changing how the CSV format works. It's a text file and should be treated as such. If a field has special meaning, you will need to determine how to convert the content.
I understand what you're saying, but it doesn't make sense to do based on the CSV format.
This may be a config option that I could incorporate in the future. Something like TreatEmptyFieldsAsNull
. When this is turned on, it would do what you're asking.
Thank you for speedy feedback Josh I am tempted to fork and implement it myself. I started having a nose around the code. But just at this moment I'm strapped for time with project deadlines. Maybe in a week or too I'll have a crack at it. It shouldn't too onerous.
If you want a little more verbose way of doing this inside of the existing framework, you can always create a new String ITypeConverter
that inherits from the default CsvHelper.TypeConversion.StringConverter
, and overrides the ConvertFromString
method as such:
public override object ConvertFromString(CsvHelper.TypeConversion.TypeConverterOptions options, string text)
{
if (string.IsNullOrEmpty(text))
return null;
else
return base.ConvertFromString(options, text);
}
Then explicitly specify that you want to use this overriden type converter for every field in which a null value is acceptable.
This feature would be great to have. I've used other csv readers and have had this behavior previously. I've started using your library recently to replace my old readers so would preferably like to have this null feature available.
Do you have any plans to implement this?
This feature would also help me out also.
I second a "treat empties as nulls" option. This makes records much more manageable with null coalescence, propagation, etc.
I know it's not the quite fix asked for, but you can write a generic string => null handling type converter, or just use a call to ConvertUsing in your class maps. They're both fairly easy to use for such situations, I've found.
On Sat, Nov 14, 2015 at 1:05 PM, Daniel Rasmussen notifications@github.com wrote:
I second a "treat empties as nulls" option. This makes records much more manageable with null coalescence, propagation, etc.
— Reply to this email directly or view it on GitHub https://github.com/JoshClose/CsvHelper/issues/252#issuecomment-156729411 .
Version 3.0 will interpret null
and NULL
as null's and set the property with a null. You are able to change what values are used for null in the configuration.
What will Version 3.0 do if you actually want to store the string "null"?
It will output string.Empty
. If you want to have it output something else, you could create a custom type converter that you use for that field. If it's null, conver to "null"
, otherwise get the default type converter for that type and return the result of that instead.
Having a config for a custom null value when writing would be a nice feature. It would probably just use the first in the list of null values actually. If you want to create a new issue for writing custom values for null, I'll add it to the 3.0 release.
Hi @JoshClose thanks for responding and apologies for the delay on my end.
The context I'm coming from is using CsvHelper to pack up some tables in a database into files which we later unpack into another datastore. Recently, we got burned by the fact that null string values weren't round-tripping: they unpacked as the empty string. In hindsight this should have been obvious given the nature of CSV, but it tripped us up.
Changing null be be stored as null
rather than empty doesn't seem unreasonable, but it still has the problem that not all values will round-trip, since I'm assuming that the string "null"
will be encoded the same as the value null
. Being able to choose a custom null placeholder doesn't solve this problem: there can always be collisions.
The way we solved our specific problem was to introduce a wrapper type NullableString
with a custom type converter such that all non-null values were wrapped with quotes to distinguish them from null values which are encoded as un-quoted null
.
As far a general library behavior, my personal preference would be:
DateTime
, int
, etc. store null as empty. This seems most true to the language-agnostic CSV way of storing missing data (empty cell).null
). Obviously people would still have the option to use more robust methods like the one we did as well.I'd be happy to open a separate issue if you were interested in pursuing this direction, but I understand if you want to stick with the library's existing approach.
Version 3.0 will interpret null and NULL as null's and set the property with a null. You are able to change what values are used for null in the configuration.
Sorry to open such an old issue. But, my NULL string values are coming in as "null" rather than an actual null. Is there a setting I need to have it ? I'm using v 7.1.1?
You can do per field:
Map(m => m.Name).TypeConverterOption.NullValues("null");
Or for a whole type:
csv.Configuration.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add("null");
No, a field with no content is just an empty string. Having quotes around a field doesn't mean anything. It just says that there may be things in the field like a quote, newline, or delimiter. Quotes are optional.
Changing this behavior would be changing how the CSV format works. It's a text file and should be treated as such. If a field has special meaning, you will need to determine how to convert the content.
I understand what you're saying, but it doesn't make sense to do based on the CSV format.
This may be a config option that I could incorporate in the future. Something like
TreatEmptyFieldsAsNull
. When this is turned on, it would do what you're asking.
Hey Josh Did you add this feature?? This is really needed, more people are running into this problem.
@jainrishabh67 just did this with his post above.
csv.Configuration.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add(string.Empty);
(or just 2 double quotes)
This will turn empty strings into null values in your object.
You can set whatever strings you want to be considered null
as @chadthompsonallscripts demonstrated.
I think an empty string should always be treated as null if the destination type isn't string
This setting doesn't seem to have any effect... In my CSV I have a DateTime column that can be null. I did this:
config.TypeConverterOptionsCache.GetOptions<DateTime>().NullValues.Add(string.Empty);
When I try to load the data from a CsvDataReader into a DataTable, I get this error:
String '' was not recognized as a valid DateTime.Couldn't store <> in End Column. Expected type is DateTime.
I believe that's related to this bug. https://github.com/JoshClose/CsvHelper/issues/1680
@thomaslevesque @JoshClose I don't know the internals of CsvHelper, but (conceptually) the error message makes sense to me, given that a DateTime
can never be null
.
I would expect this to work, though (assuming your DTO uses a Nullable<DateTime>
):
config.TypeConverterOptionsCache.GetOptions<DateTime?>().NullValues.Add(string.Empty);
@augustoproiete Yes, you're right. It would need to be DateTime?
for it to work. Using options when auto mapping is broken right now though. About to commit a new package that fixes it.
Fixed in version 23.3.1 in NuGet. https://www.nuget.org/packages/CsvHelper/21.3.1
I believe the "init" values underneath prevent it from being used in older projects that haven't adopted the new language features. The nuget executes, but not able to use the
config.TypeConverterOptionsCache.GetOptions<DateTime?>().NullValues.Add(string.Empty);
as it gives in the compiler:
[CS1545] Property, indexer, or event 'CsvContext.TypeConverterOptionsCache' is not supported by the language; try directly calling accessor methods 'CsvContext.get_TypeConverterOptionsCache()' or 'CsvContext.set_TypeConverterOptionsCache(?)'
Those options in the suggestions also leave a compiler time error. Just as an FYI (I'll be using the workaround using an explicit Converter for each of the String properties.
TypeConverterOptionsCache
is on CsvContext
now and not a part of CsvConfiguration
anymore.
Right, I am using it from the CsvContext:
public RecordMap(CsvContext context)
{
context.TypeConverterOptionsCache.GetOptions<string().NullValues.Add(string.Empty); }
Unfortunately my project is current at C# 7.3, so I'm just passing it on :) and Great work on this! (want to make sure you know how much time it does save)
I think you just need to update Visual Studio because newer MSBuild versions will handle that. See this issue https://github.com/JoshClose/CsvHelper/issues/1645 for details.
Thanks! I'm glad you find it useful.
Awesome, that did the trick!!! Thank you for the assist!
Thanks for the updates @JoshClose!
Unfortunately it's still not working for me. Here's my code:
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = "|",
HasHeaderRecord = true
};
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Start", typeof(DateTime));
var endColumn = table.Columns.Add("End", typeof(DateTime));
endColumn.AllowDBNull = true;
using var reader = new StreamReader(@"C:\tmp\test.csv");
using var csvReader = new CsvReader(reader, config);
csvReader.Context.TypeConverterOptionsCache.GetOptions<DateTime?>().NullValues.Add(string.Empty);
using var dataReader = new CsvDataReader(csvReader);
table.Load(dataReader);
I'm still getting ArgumentException
:
String '' was not recognized as a valid DateTime.Couldn't store <> in End Column. Expected type is DateTime.
Am I missing something?
var endColumn = table.Columns.Add("End", typeof(DateTime));
should be
var endColumn = table.Columns.Add("End", typeof(DateTime?));
You're setting the global null options for DateTime?
not DateTime
. DateTime
also can't be null, so anywhere it could be, you should use DateTime?
. If the column in the DB is nullable, so should your type in C#.
var endColumn = table.Columns.Add("End", typeof(DateTime));
should be
var endColumn = table.Columns.Add("End", typeof(DateTime?));
Unfortunately, that doesn't work. DataTable doesn't support nullable types (DataTable dates back to the very beginning of .NET, before nullable value types were even a thing...)
You're setting the global null options for DateTime? not DateTime. DateTime also can't be null, so anywhere it could be, you should use DateTime?. If the column in the DB is nullable, so should your type in C#.
I tried setting the options for DateTime?
, DateTime
, and both. None of it works.
Ok... so this is a strange thing. For this to work, you need to add the null values to type string
because CsvDataReader
only deals in strings when loading a data table. There is no way in the data reader to know what type each of the fields are.
This should work for you:
csvReader.Context.TypeConverterOptionsCache.GetOptions<string>().NullValues.Add(string.Empty);
This isn't very intuitive. I should add something about this to the CsvDataReader
documentation.
The data table only calls int GetValues(object[] values)
.
@JoshClose thanks, it works!
Is this not supposed to work?
csvReader.Context.TypeConverterOptionsCache.GetOptions<ulong?>().NullValues.Add("0");
I still get null
tho.
@dgroh What is your goal? Are you trying to have "0"
come through as null
?
@JoshClose I think other way around. We have a ulong?
which is coming empty in our CSV file. In case of empty, we want it to be replaced by "0" (zero). Because we have several models we can not create a type converter for each model. We want to make it global. It means, everywhere where ulong?
arrives empty should be "0" (zero).
Set a default through an attribute or mapping.
private class Foo
{
public int Id { get; set; }
[Default(0)]
public ulong? Number { get; set; }
}
or
private class FooMap : ClassMap<Foo>
{
public FooMap()
{
Map(m => m.Id);
Map(m => m.Number).Default(0);
}
}
I do extends StringConverter.
// I use assembly .net7.0
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.TypeConversion;
using System;
using System.Reflection;
public class UnquotedEmptyValueToNullConverter : StringConverter
{
public override object? ConvertFromString(string? text, IReaderRow row, MemberMapData memberMapData)
{
// CsvParser.Field[] fs = row.Context.Parser.Fields;
// if(fs[memberMapData.Index].Length == 0 ) return null;
var fields = typeof(CsvParser)
.GetField("fields", BindingFlags.NonPublic | BindingFlags.Instance)!
.GetValue(row.Context.Parser)
as Array;
var length = fields!.GetValue(0)!.GetType()
.GetField("Length", BindingFlags.Public | BindingFlags.Instance)!
.GetValue(fields.GetValue(memberMapData.Index));
if( (int)length == 0 ) return null;
return base.ConvertFromString(text, row, memberMapData);
}
}
How To Use at Main.cs
using CsvHelperExtension.TypeConversion;
void Main()
{
using( var reader = new StreamReader("path/to/file.csv") )
using( var csv = new CsvReader(reader, CultureInfo.InvariantCulture) )
{
// Register globally.
csv.Context.TypeConverterCache.AddConverter<UnquotedEmptyValueToNullConverter>(new UnquotedEmptyValueToNullConverter());
// Below example ...
// conv.Context.RegisterClassMap<FooMap>();
// var list = csv.GetRecords<Foo>().ToList();
}
Use as Attributes.cs
public class Foo
{
public int Id { get; set; }
public string Name { get; set; }
[TypeConverter( typeof(UnquotedEmptyValueToNullConverter) )]
public string? Remarks { get; set; }
}
Use as Mapping.cs
public class FooMap : ClassMap<Foo>
{
public FooMap()
{
Map(m => m.Id);
Map(m => m.Name);
Map(m => m.Json).TypeConverter<UnquotedEmptyValueToNullConverter>();
}
}
Good solution @kurokoSin. But when I tried that I get the length of the wrong field. I needed to use memberMapData.Index -1
to get the correct field. V31
thank you for the advice @Piscatore . When I implemented it, it was V30. The memberMapData.Index may have shifted due to the version update.
I am having issues trying to read in CSV documents where a string field can be null.
For example a record looks like this... ABC,,"",DEF,"GHI"
Five string fields should be read as... 0: "ABC" 1: null 2: "" 3: "DEF" 4: "GHI"
But CsvReader reads field 1 as "" but then I no longer can tell if it was originally supposed to be a null or an empty string.
Other value fields seem to handle nulls fine, basically "" is null. But for strings "" is not null. How do I preserve truly null strings?
I need to have string fields with zero length which are not quoted to pass through as null, not an empty string.