LibreCat / Catmandu

Catmandu - a data processing toolkit
https://librecat.org

Catmandu::Importer::YAML fails on UTF8: Catmandu::Importer encoding broken #46

Closed nichtich closed 9 years ago

nichtich commented 10 years ago

With YAML::XS, which is preferred by YAML::Any, the YAML importer fails when importing UTF-8 YAML files:

$ echo "umlaut: Ü" | catmandu convert YAML
YAML::XS::Load Error: The problem:

    invalid trailing UTF-8 octet

was found at document: 0

This behaviour of YAML::XS is documented (see https://rt.cpan.org/Public/Bug/Display.html?id=54683) and won't be changed. I suppose it can be fixed by passing an encoding parameter, but that parameter is not documented, and because of a bug in Catmandu::Importer or Catmandu::App::convert it is not passed to the file handle anyway. The following should work (it does when hard-coding "raw" in Catmandu::Importer!), but it does not:

$ echo "umlaut: Ü" | catmandu convert YAML --encoding :raw

Even if it worked, the default encoding setting (:utf8) is annoying, at least for YAML. I have not tested with JSON, and I won't invest more work in fixes that don't get released anyway :-(
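
To illustrate why the :utf8 layer can trigger this error, here is a minimal sketch in Python (not Catmandu or YAML::XS code). It assumes that the parser insists on valid UTF-8 octets and that an already-decoded string reaches it as one byte per character. Both the helper `parser_accepts` and that one-byte-per-character assumption are hypothetical stand-ins for illustration only.

```python
# Hypothetical model of the failure; this is not Catmandu or YAML::XS code.

# The octets actually stored in a UTF-8 encoded YAML file.
utf8_octets = "umlaut: Ü\n".encode("utf-8")


def parser_accepts(octets: bytes) -> bool:
    """Stand-in for a parser that only accepts valid UTF-8 octets."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# With a :raw layer the octets would reach the parser unchanged.
raw_input = utf8_octets

# With a :utf8 layer the handle decodes to characters first. ASSUMPTION:
# the parser then sees one byte per character (Latin-1), so "Ü" arrives
# as the lone byte 0xDC, which is not valid UTF-8 on its own.
decoded_input = utf8_octets.decode("utf-8").encode("latin-1")

print(parser_accepts(raw_input))
print(parser_accepts(decoded_input))
```

Under these assumptions the raw octets pass the check while the decoded-then-flattened input does not, which would be consistent with the "invalid trailing UTF-8 octet" message above.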

nichtich commented 10 years ago

Part of the problem was already reported half a year ago: https://github.com/LibreCat/Catmandu/issues/24

phochste commented 10 years ago

Available in the dev branch, to be released soon.

nichtich commented 10 years ago

Still (or again?) broken in Catmandu version 0.9204. Please don't close unless covered by a unit test and released on CPAN.

vpeil commented 9 years ago

Fixed in release 0.9205.