Closed Chimrod closed 1 year ago
Hello, do you have any update on this pull request?
Hi! Thanks for your review and feedback. I will check all the points and answer them quickly.
@Chimrod I've rebased your commits and finished the PR in #41
Thank you for your contribution!
Here is a pull request for issue #36.
This change skips the first bytes of a file if they match a known BOM pattern (currently UTF-8 and UTF-16 are checked). Without this check, those bytes are treated as part of the first row.
The check is done in a pure function, `check_bom`, which returns the number of bytes to ignore. If we want to go further, it could also return the associated encoding, to inform the caller of the file's encoding (not implemented here). I've plugged the call into the function `fill_in_buf_or_Eof`, which is where the data is read and stored.
I've added a test which fails without this code (see the issue for details).
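For readers who want a concrete picture of the check described above, here is a minimal sketch of what such a pure function could look like. It is not the code from this pull request: the signature (a `Bytes.t` buffer plus the number of valid bytes in it) and the helper `strip_bom` are assumptions made for illustration. Only the name `check_bom` and the "returns the number of bytes to ignore" contract come from the description.

```ocaml
(* Hypothetical sketch, not the PR's actual implementation.
   [check_bom buf len] inspects the first [len] valid bytes of [buf]
   and returns how many leading bytes form a known BOM (0 if none). *)
let check_bom (buf : Bytes.t) (len : int) : int =
  if len >= 3
     && Bytes.get buf 0 = '\xEF'
     && Bytes.get buf 1 = '\xBB'
     && Bytes.get buf 2 = '\xBF'
  then 3 (* UTF-8 BOM: EF BB BF *)
  else if len >= 2
          && ((Bytes.get buf 0 = '\xFE' && Bytes.get buf 1 = '\xFF')
              || (Bytes.get buf 0 = '\xFF' && Bytes.get buf 1 = '\xFE'))
  then 2 (* UTF-16 BOM: FE FF (big endian) or FF FE (little endian) *)
  else 0

(* Illustrative helper (an assumption, not part of the library):
   drop a leading BOM from an in-memory string. *)
let strip_bom (s : string) : string =
  let skip = check_bom (Bytes.of_string s) (String.length s) in
  String.sub s skip (String.length s - skip)
```

With this sketch, `strip_bom "\xEF\xBB\xBFa,b"` yields `"a,b"`, so the first CSV field no longer starts with the BOM bytes. Because the function only reports a byte count, the caller stays in charge of how the buffer offset is advanced.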