CSBiology / FSharpAux

Auxiliary functions and data structures for the F# programming language
MIT License
23 stars 16 forks source link

FSharpAux.IO.SeqIO.Seq.CSV: Incorrect collection formatting #13

Open ZimmerD opened 4 years ago

ZimmerD commented 4 years ago

Description

The FSharpAux.IO.SeqIO.Seq.CSV function with the parameter flatten = false, does not perform as intended.

Repro steps

  1. Define a Record type with a field of type array.
  2. FSharpAux.IO.SeqIO.Seq.CSV function with the parameter flatten = false, using an instance of your type with an array of Length > 20.
  3. Look at the string.

Expected behavior

Nicely formatted output.

Actual behavior

The array is spread accross mutliple lines if it exceeds a certain length. This happens because "sprintf "%A" x" is used to format collections. This introduces line-brakes into the string.

Known workarounds

FSharpAux.IO.SeqIO.Seq.CSV calls FSharpAux.IO.SeqIO.Seq.CSVwith. Simply replace the parameter strFunction with a version that does the formatting correctly.

ZimmerD commented 4 years ago

In this gist you can find a proposed solution.

It fixes the bug but also introduces two little changes:

  1. I emitted the "[|" and "|]" marks at the beginning and ending of e.g. arrays. I think when parsing collections they do no real favor and can therefore be ommited.
  2. In lines 35-38 I assure that the inner separator is different from the outer one. Otherwise selecting ";" as a separator could break the parsing again. We could go for something more explicit but this would change the function signatures.

Both points are of course open for discussion!

kMutagene commented 4 years ago

@Joott , Can you have a lookat this?

caroott commented 4 years ago

The fix for the flattening of the array looks good to me. Where I'm not exactly sure how to handle it is the "backup" separator. Currently there is a list with commonly used separators and the first one that isn't the separator used for writing is selected. As far as i can see it, only the first two separators in this list are able to be selected, so the third one is obsolete. If this is the course we want to take maybe we should pick two separators only. Another option would be to add an optional parameter for the backup separator, but this would require some rewriting of the functions.