JoshClose / CsvHelper

Library to help reading and writing CSV files
http://joshclose.github.io/CsvHelper/
Other
4.76k stars 1.07k forks source link

Consider only exposing necessary properties from the parser and serialiser interfaces. #596

Closed christophano closed 7 years ago

christophano commented 8 years ago

As we discussed on gitter - could we consider a way to only require those properties that are used by the CsvReader/CsvWriter internals as part of the ICsvParser/ICsvSerializer interfaces?

Example from version 2.x - ICsvParser requires you to implement the properties CharPosition and BytePosition. These aren't valid for my ExcelParser, so I simply return -1 without any issues. However, v3 has introduced the TextReader property, which is less simple to work around. Returning null or throwing a NotImplementedException are possibilities, but I'd rather we could avoid that.

My suggestion would be to continue exposing properties from the individual parsers and make them accessible to the api user by exposing them via a type argument in the CsvReader/CsvWriter classes.

For example:

public class CsvReader<TParser> where TParser : ICsvParser
{
    public CsvReader(TParser parser)
    {
        Parser = parser;
    }
    public TParser Parser { get; }
    // rest of the implementation
}

which can be used like this:

using (var reader = new CsvReader<ExcelParser>(new ExcelParser("path/to/workbook")))
{
    // access a property specific to ExcelParser
    var workbook = reader.Parser.Workbook;
}

and for ease of use from the core library:

public class CsvReader : CsvReader<CsvParser>
{
    public CsvReader(CsvParser parser) : base(parser) { }
}

which will continue to be used as it is now:

using (var reader = new CsvReader(new CsvParser(File.Open("path\to\file"))))
{
    // access property specific to CsvParser
    var textReader = reader.Parser.TextReader;
}

What are your thoughts?

christophano commented 7 years ago

Sorry, I haven't updated this issue. Here's an example: ICsvParser is used (primarily) by the CsvReader class. The only members CsvReader needs ICsvParser to have is int Row { get; } and string[] Read();; which leaves TextReader TextReader { get; }, ICsvParserConfiguration Configuration { get; }, long CharPosition { get; }, long BytePosition { get; }, int RawRow { get; } and string RawRecord { get; } as specific to the implementation. Perhaps they could be left to the implementation, or moved to some sort of intermediate interface?

Something like;

public interface IParser : IDisposable
{
    int Row { get; }

    string[] Read();
}

public interface ICsvParser : IParser
{
    TextReader TextReader { get; }

    ICsvParserConfiguration Configuration { get; }

    long CharPosition { get; }

    long BytePosition { get; }

    int RawRow { get; }

    string RawRecord { get; }
}

then I could have;

public interface IExcelParser : Parser
{
    XLWorkbook Workbook { get; }

    int FieldCount { get; }
}

which would allow us to each expose the properties we want, while not having to implement any that are unnecessary.

Obviously this would be quite a bit of work, but I'm happy to start it on a branch, if anyone agrees it is worthwhile.

JoshClose commented 7 years ago

I just did some major changes. I moved out all the state information into a context object that is shared. I think this will make your goal a lot easier. You'll have to let me know how you want to approach this now.

public interface ICsvReader : ICsvReaderRow, IDisposable
{
    ICsvParser Parser { get; }

    bool ReadHeader();

    bool Read();

    IEnumerable<T> GetRecords<T>();

    IEnumerable<object> GetRecords( Type type );
}

public interface ICsvParser : IDisposable
{
    IParserContext Context { get; }

    ICsvParserConfiguration Configuration { get; }

    IFieldReader FieldReader { get; }

    string[] Read();
}

public interface IFieldReader : IDisposable
{
    IFieldReaderContext Context { get; }

    int GetChar();

    string GetField();

    void AppendField();

    void SetFieldStart( int offset = 0 );

    void SetFieldEnd( int offset = 0 );

    void SetRawRecordEnd( int offset );
}
JoshClose commented 7 years ago

I just renamed the interfaces to remove the "Csv" from them.

JoshClose commented 7 years ago

Let me know if everything looks good to you now.

christophano commented 7 years ago

Sorry I've not responded until now. I'm taking a look at this now. While I prefered my implementation (of course!) this looks like it may cover things.

JoshClose commented 7 years ago

I had to make a lot of changes because of other reasons. ;)

JoshClose commented 7 years ago

If you have any other suggestions on changes I should make regarding this, let me know. I want to finalize the breaking changes so I can release 3.0.

christophano commented 7 years ago

Looking at the new changes (and I've only started re-implementing the parser so far), the new context interfaces still require me to implement a lot of csv specific items. There are more interfaces now, but in some cases (e.g. IParserContext and IReaderContext see #762) the interfaces must be implemented by the same object.

Note: I'm more than happy to work with this, but I am worried about potential problems whenever I have to return null or a default value for a property that I can't use or don't need.