The following script tries to fetch and decode a JSON document, but fails:
#!/usr/bin/env perl
use strict;
use warnings;
use v5.10.0;
use Firefox::Marionette;
use Data::Dumper;
my $fm = Firefox::Marionette->new;
$fm->go("https://eprel.ec.europa.eu/api/products/dishwashers2019/543834");
say Dumper($fm->json);
#my $json = $fm->strip; utf8::encode($json); $json = JSON::XS::decode_json($json); say Dumper($json); # workaround for the problem
__END__
Output is:
malformed UTF-8 character in JSON string, at character offset 193 (before "\x{fffd}","postalCod...") at /opt/perl-5.30.3/lib/site_perl/5.30.3/Firefox/Marionette.pm line 6391.
The problem seems to be that the document content is available as characters, whereas JSON::XS requires its input as octets. Decoding therefore works well unless there are "wide characters" in the input. Explicitly transforming the characters into UTF-8 octets, either with utf8::encode or with another function (Encode::str2bytes would also work), fixes the problem, and this step should probably be built into the json() method.
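To illustrate the character/octet distinction without a browser, here is a minimal sketch. It makes two assumptions: the string in $chars is a made-up stand-in for what $fm->strip returns, and the core module JSON::PP is used in place of JSON::XS because it offers the same decode_json interface. It is not the code inside Firefox::Marionette.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP ();    # core module, used here as a stand-in for JSON::XS

# Hypothetical document text as a *character* string (not octets).
# The \x{...} escapes produce code points above 255, i.e. "wide characters".
my $chars = qq({"city":"\x{141}\x{f3}d\x{17a}"});

# Option 1: turn the characters into UTF-8 octets first, then call
# decode_json, which expects UTF-8 encoded input.
my $octets = $chars;
utf8::encode($octets);    # in-place conversion: characters -> UTF-8 octets
my $from_octets = JSON::PP::decode_json($octets);

# Option 2: use a decoder object with the utf8 option left off, so that
# decode() accepts a character string directly.
my $from_chars = JSON::PP->new->decode($chars);
```

Both options should yield the same data structure. The first one mirrors the workaround in the commented-out line of the script above.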
I've decided to put the encoding into the strip method as it assumes UTF-8 encoding as well. Thanks for all the comments and bug reports. Most appreciated. I'm planning on a new release in a week or so.