Chris00 / ocaml-csv

CSV library for OCaml

Bom #37

Closed Chimrod closed 1 year ago

Chimrod commented 2 years ago

Here is a pull request for issue #36.

This pull request skips the first bytes of a file if they match a known BOM pattern (currently UTF-8 and UTF-16 are checked). Without this check, those bytes are treated as part of the first row.

The check is done in a pure function, check_bom, which returns the number of bytes to ignore. If we want to go further, we could also return the associated encoding to inform the caller of the file encoding (not implemented here). I’ve plugged the call into the function fill_in_buf_or_Eof, which is where the data is read and stored.
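
To illustrate the idea, here is a minimal sketch of such a check. It is not the code from this pull request: the signature `string -> int` and the helpers `has_prefix` and `strip_bom` are assumptions made for the example, and the real function works on the library's internal input buffer.

```ocaml
(* Illustrative sketch only; names and signatures are assumptions. *)

(* [has_prefix prefix s] is true when [s] starts with [prefix]. *)
let has_prefix (prefix : string) (s : string) : bool =
  let n = String.length prefix in
  String.length s >= n && String.sub s 0 n = prefix

(* Return the number of leading bytes to skip: 3 for a UTF-8 BOM,
   2 for a UTF-16 BOM (either byte order), 0 when no BOM is found. *)
let check_bom (s : string) : int =
  if has_prefix "\xEF\xBB\xBF" s then 3      (* UTF-8 *)
  else if has_prefix "\xFE\xFF" s then 2     (* UTF-16, big-endian *)
  else if has_prefix "\xFF\xFE" s then 2     (* UTF-16, little-endian *)
  else 0

(* Hypothetical helper: drop the BOM, if any, from the start of [s]. *)
let strip_bom (s : string) : string =
  let n = check_bom s in
  String.sub s n (String.length s - n)
```

With this sketch, `strip_bom "\xEF\xBB\xBFa,b\n"` evaluates to `"a,b\n"`, so the first field is read as `a` rather than as `a` preceded by three stray bytes.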

I’ve added a test which fails without the code (see the issue for that).

Chimrod commented 2 years ago

Hello, do you have any update on this pull request?

Chimrod commented 2 years ago

Hi! Thanks for your review and feedback. I will check all the points and answer them quickly.

SGrondin commented 1 year ago

@Chimrod I've rebased your commits and finished the PR in #41

Thank you for your contribution!