jtablesaw / tablesaw

Java dataframe and visualization library
https://jtablesaw.github.io/tablesaw/
Apache License 2.0

Line breaks are not recognised on a Windows computer #1255

Open tischi opened 6 months ago

tischi commented 6 months ago

Hi,

We have a seemingly Windows-specific issue where the line breaks in a CSV file are not recognised.

I am not sure how to debug this efficiently, because I do not have a Windows computer with a development environment.

Is there anything general I could try to make this more robust across operating systems?

This is my current code, which opens the tables correctly on a Mac:

final InputStream inputStream = IOHelper.getInputStream( path );
// https://jtablesaw.github.io/tablesaw/userguide/importing_data.html
CsvReadOptions.Builder builder = CsvReadOptions.builder( inputStream )
    .separator( separator )
    .missingValueIndicator( "na", "none", "nan" )
    .sample( numSamples > 0 )
    .sampleSize( numSamples )
    .columnTypesPartial( nameToType );
final Table rows = Table.read().usingOptions( builder );

And here is the table that creates the problems on Windows:

test-crop-8bit-ds2.csv
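One general thing that could be tried is to normalise the line endings before the stream reaches the CSV reader. The following is only a minimal sketch using the Java standard library (Java 9 or later for `readAllBytes`); the class name `LineEndingNormalizer` is hypothetical, and it assumes the file is UTF-8 and small enough to hold in memory. It also rewrites any carriage return inside a quoted field, which may or may not be acceptable for a given table.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of tablesaw.
class LineEndingNormalizer {

    /** Replaces CRLF (Windows) and lone CR (old Mac) line endings with LF. */
    static String normalize(String text) {
        // CRLF must be handled first so it does not turn into two LFs.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    /**
     * Reads the whole stream into memory (assumed UTF-8) and returns a new
     * stream whose content uses LF-only line endings.
     */
    static InputStream normalize(InputStream in) throws IOException {
        String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        return new ByteArrayInputStream(
                normalize(text).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "x,y\r\n1,2\r\n3,4\r\n".getBytes(StandardCharsets.UTF_8);
        InputStream cleaned = normalize(new ByteArrayInputStream(raw));
        // Print with escapes so the line endings are visible.
        String result = new String(cleaned.readAllBytes(), StandardCharsets.UTF_8);
        System.out.println(result.replace("\n", "\\n"));
    }
}
```

In the snippet above, the stream returned by `LineEndingNormalizer.normalize( inputStream )` could be handed to `CsvReadOptions.builder( ... )` in place of `inputStream`. Whether this resolves the problem depends on the line endings actually being the cause, which has not been confirmed here.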

tischi commented 6 months ago

It may help with debugging that this file, iMGL_AAV2MOI105_6h_DAPI_IBA1_well6_image2_C2.csv, does not have this issue.
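Since one file fails and the other does not, a quick way to compare them without a Windows machine might be to count which kinds of line endings each file contains. This is a small standalone sketch, not tablesaw code; the class name `LineEndingCounter` is hypothetical and the file paths are supplied as command-line arguments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper, not part of tablesaw.
class LineEndingCounter {

    /** Returns {CRLF count, lone CR count, lone LF count} for the given bytes. */
    static int[] count(byte[] data) {
        int crlf = 0, cr = 0, lf = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\r') {
                if (i + 1 < data.length && data[i + 1] == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF pair
                } else {
                    cr++;
                }
            } else if (data[i] == '\n') {
                lf++;
            }
        }
        return new int[] {crlf, cr, lf};
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to a CSV file to inspect.
        for (String path : args) {
            int[] c = count(Files.readAllBytes(Paths.get(path)));
            System.out.printf("%s: CRLF=%d CR=%d LF=%d%n", path, c[0], c[1], c[2]);
        }
    }
}
```

If the two files report different kinds of line endings (for example, only lone CR in one and only LF in the other), that would point towards line-ending handling; identical counts would suggest looking elsewhere, such as at how the file is opened on Windows.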