atk4 / api

Implementation of RestAPI for Agile Data
https://agiletoolkit.org/
14 stars 12 forks source link

error: charset malformed JSON... this happen with columns fields from db with specials characters... any fix? #8

Open rapgithub opened 5 years ago

rapgithub commented 5 years ago

I have a table with a column text with French and Spanish characters and the JSON returned from the API gives me error

Fatal error: Uncaught InvalidArgumentException: Unable to encode data to JSON in Zend\Diactoros\Response\JsonResponse: Malformed UTF-8 characters, possibly incorrectly encoded in /Applications/MAMP/htdocs/www/cpanel_tripsbro/vendor/zendframework/zend-diactoros/src/Response/JsonResponse.php:164 Stack trace: #0 

I think the issue it is the charset problem... my table is utf-8 charset in the db and header and html charset is ok but the issue it is from the API returning malformed Json when the column has accents in the field to show...

is there a way to set charset when using the line below? $db = new \atk4\data\Persistence_SQL connection here?

or where?

thanks

rapgithub commented 5 years ago

I found a line that seems to solve this issue of Malformed JSON when the field contains special UTF-8 chars... if I change the Persistence_SQL.php and add the line below it fix the issue for the API JSON but for the Model edit in ATK4 it creates the problem with these fields showing wrong encoding...

change to Persistence_SQL.php added this line to fix the API but it affects the Model...

        case 'text':
            $v = utf8_encode($v);
            break;

Anyone knows how to fix it in the API only without affecting the ATK4 Model edit mode for those fields? thanks

DarkSide666 commented 5 years ago

Can you please try adding | JSON_UNESCAPED_UNICODE constant in these two places?

Please report back if this helps you and I will include this fix in repo code then.

rapgithub commented 5 years ago

First of all I installed the API using composer and checking the link you sent me it is a different API.php file than the one I installed via composer.. mine was having 368 lines...

I have added too these line JSON_UNESCAPED_UNICODE you mentioned me before and it did not work either... I tried with your version of API.php and added the JSON_UNESCAPED_UNICODE and still the problem persist...

Error again the same:

Fatal error: Uncaught InvalidArgumentException: Unable to encode data to JSON in Zend\Diactoros\Response\JsonResponse: Malformed UTF-8 characters, possibly incorrectly encoded in /vendor/zendframework/zend-diactoros/src/Response/JsonResponse.php:164 Stack trace: #0 

can you test it first with text fields in my case those text fields are:

$this->addField('description_en', ['type' => 'text']); $this->addField('description_it', ['type' => 'text']); $this->addField('description_fr', ['type' => 'text']); $this->addField('description_de', ['type' => 'text']);

I have English not problem, I have italian, French and German and I have the problem.. the table that has these columns is collation utf8_general_ci and every field is defined with collation utf8_general_ci try to create a text column in db and enter French German or italian characters you will see the problem then...

In ATK4 in edit mode the field are rendered OK and the character are well saved in the db... showing perfect but in the API the JSON gives this error...

I put back also the Persistence_SQL.php to its normal code taking out the lines I have added and the problem it is the same:

Fatal error: Uncaught InvalidArgumentException: Unable to encode data to JSON in Zend\Diactoros\Response\JsonResponse: Malformed UTF-8 characters, possibly incorrectly encoded in /vendor/zendframework/zend-diactoros/src/Response/JsonResponse.php:164 Stack trace: #0 

adding this | JSON_UNESCAPED_UNICODE in those lines:

https://github.com/atk4/api/blob/develop/src/Api.php#L240 https://github.com/atk4/api/blob/develop/src/Api.php#L429

does not fix the problem yet... thanks

rapgithub commented 5 years ago

can you verify if your Persistence_SQL.php it is like mine here?

 public function _typecastLoadField(Field $f, $value)
    {
        // LOB fields return resource stream
        if (is_resource($value)) {
            $value = stream_get_contents($value);
        }

        // work only on copied value not real one !!!
        $v = is_object($value) ? clone $value : $value;

        switch ($f->type) {
        case 'integer':
            $v = (int) $v;
            break;
        case 'float':
            $v = (float) $v;
            break;
        case 'money':
            $v = round($v, 4);
            break;
        case 'boolean':
            if (isset($f->enum) && is_array($f->enum)) {
                if (isset($f->enum[0]) && $v == $f->enum[0]) {
                    $v = false;
                } elseif (isset($f->enum[1]) && $v == $f->enum[1]) {
                    $v = true;
                } else {
                    $v = null;
                }
            } else {
                $v = (bool) $v;
            }
            break;
        case 'date':
        case 'datetime':
        case 'time':
            $dt_class = isset($f->dateTimeClass) ? $f->dateTimeClass : 'DateTime';
            $tz_class = isset($f->dateTimeZoneClass) ? $f->dateTimeZoneClass : 'DateTimeZone';

            if (is_numeric($v)) {
                $v = new $dt_class('@'.$v);
            } elseif (is_string($v)) {
                // ! symbol in date format is essential here to remove time part of DateTime - don't remove, this is not a bug
                $format = ['date' => '+!Y-m-d', 'datetime' => '+!Y-m-d H:i:s', 'time' => '+!H:i:s'];
                $format = $f->persist_format ?: $format[$f->type];

                // datetime only - set from persisting timezone
                if ($f->type == 'datetime' && isset($f->persist_timezone)) {
                    $v = $dt_class::createFromFormat($format, $v, new $tz_class($f->persist_timezone));
                    if ($v === false) {
                        throw new Exception(['Incorrectly formatted datetime', 'format' => $format, 'value' => $value, 'field' => $f]);
                    }
                    $v->setTimeZone(new $tz_class(date_default_timezone_get()));
                } else {
                    $v = $dt_class::createFromFormat($format, $v);
                    if ($v === false) {
                        throw new Exception(['Incorrectly formatted date/time', 'format' => $format, 'value' => $value, 'field' => $f]);
                    }
                }

                // need to cast here because DateTime::createFromFormat returns DateTime object not $dt_class
                // this is what Carbon::instance(DateTime $dt) method does for example
                if ($dt_class != 'DateTime') {
                    $v = new $dt_class($v->format('Y-m-d H:i:s.u'), $v->getTimeZone());
                }
            }
            break;
        case 'array':
            // don't decode if we already use some kind of serialization
            $v = $f->serialize ? $v : json_decode($v, true);
            break;
        case 'object':
            // don't decode if we already use some kind of serialization
            $v = $f->serialize ? $v : json_decode($v, false);
            break;
        }

        return $v;
    }

I added those lines I mentioned firstly because without them the problem exists at least for the API trying to create the JSON.

I do not understand why 'TEXT' type does not exist in the type definition inside _typecastLoadField function

this works for the API if I add it but then stop working will in ATK4 in edit mode when editing this text fields with special characters...

        case 'text':
            $d = mb_detect_encoding($v);
            if ($d=='ASCII') { $v=utf8_encode($v); } else { $v=utf8_encode($v); };
            if ($d=='UTF-8') { $v=$v; };
            break;

or this too

        case 'text':
       $v=utf8_encode($v);
            break;

but then in ATK4 modifying the Persistence_SQL.php in those lines adding this lines of code above in edit mode shows then strange symbols and this is logical because in ATK4 the fields are will rendered all the time but in the API those fields does not render will if the have special characters...

so until now I do not find still the correct solution when using the API with ATK4...

DarkSide666 commented 5 years ago

OK I will make a testcase on my local machine and test this better in next few days and get back to you.

Btw, how your composer.json looks like? Should be "atk4/api": "dev-develop", and same for atk4/data. I am testing on develop versions. Also sometimes composer have a bug and not update. Then you can try to delete your vendor folder and then run composer update once again. Then it should pull last commit from develop branches.

rapgithub commented 5 years ago

thanks for having the time to test it! I wait...

for the compose.json I have this on the root of ATK4/composer.json folder: I do not see why atk4/data is not listed there ...

{ "minimum-stability": "dev", "prefer-stable": false, "require":{ "atk4/ui": "^1.2", "atk4/api": "^0.1.0", "atk4/mastercrud": "^1.0", "atk4/login": "dev-master", "atk4/filestore": "dev-master" }

}

DarkSide666 commented 5 years ago

So here is fully featured test case - no problem found. Please try this out and see if it works for you.

<?php
require_once 'vendor/autoload.php';

// Config
$config = [
    'db' => 'mysql://root@localhost/sampledb;charset=utf8',
    'uri' => '.'.$_SERVER['SCRIPT_NAME'],
];
/*
// No need to add character set and collation because these are default settings anyway
CREATE TABLE test_utf8 (
    id int unsigned not null auto_increment,
    desc_en varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    desc_it varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    desc_fr varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    desc_de varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

this is white cat
questo è gatto bianco
c'est chat blanc
Das ist weiße Katze
*/

// Initialize App
$app = new \atk4\ui\App(['API UTF8 test', 'always_run' => false]); // always_run=false to not render App on shutdown
$app->initLayout('Centered');
$app->init();

// Connect DB
$db = $app->dbConnect($config['db']);

// Header
$app->add(['Header', 'subHeader'=>'https://github.com/atk4/api/issues/8'])
    ->set('Testing utf8');

// Model
class TestModel extends \atk4\data\Model {
    public $table = 'test_utf8';
    public function init() {
        parent::init();
        $this->getElement($this->id_field)->ui['visible'] = true; // to see it in CRUD
        $this->addField('desc_en', ['type' => 'string']);
        $this->addField('desc_it', ['type' => 'string']);
        $this->addField('desc_fr', ['type' => 'string']);
        $this->addField('desc_de', ['type' => 'string']);
    }
}
$m = new TestModel($db);
$m->tryLoadAny();

// CRUD test - works perfectly
$c = $app->add('CRUD');
$c->setModel($m);

// API test
$api = $app->add(new \atk4\api\Api());
$api->rest('/test_utf8', new TestModel($db));

/*
    By default GET converts utf8 entities to \u notation
    See https://stackoverflow.com/a/16498304/1466341

    http://localhost/utf8.php/test_utf8/1 returns
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
    If you add JSON_UNESCAPED_UNICODE in Api->successResponse, then it will not convert characters to \u notation,
    but French character ' still will be converted to \u0027 because of this contant JSON_HEX_APOS we have in api.
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo è gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist weiße Katze"
}
*/

$endpoint = $config['uri'].'/test_utf8/';
$c->addModalAction('GET', 'GET request', function($p, $id) use ($m, $endpoint){
    $p->add('Console')->set(function ($console) use ($id, $m, $endpoint, $id) {

        // Make GET request
        $console->debug('Making GET request... ');
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_URL => $endpoint.$id,
        ]);
        $result = curl_exec($curl);
        curl_close($curl);

        // Response
        $console->debug('Response:');
        $console->output($result);

        // Decode JSON
        $console->debug('Decoded JSON:');
        $data = json_decode($result, true);
        $console->output(var_export($data, true));
        $before = $data;

        // Set values in model
        $console->debug('Set received values in model:');
        $m->set($data);
        $fields = $m->get();
        unset($fields['field_sql']); // new bug in atk4/data? what the hell is field_sql ???
        $console->output(var_export($fields, true));

        // POST request
        $console->debug('Making POST request...');
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_URL => $endpoint.$id,
            CURLOPT_POST => 1,
            CURLOPT_POSTFIELDS => $fields,
        ]);
        $result = curl_exec($curl);
        curl_close($curl);

        // Response
        $console->debug('Response:');
        $console->output($result);

        // Make GET request again
        $console->debug('Making GET request again...');
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_URL => $endpoint.$id,
        ]);
        $result = curl_exec($curl);
        curl_close($curl);

        // Response
        $console->debug('Response:');
        $console->output($result);

        // Decode JSON
        $console->debug('Decoded JSON:');
        $data = json_decode($result, true);
        $console->output(var_export($data, true));
        $after = $data;

        // Compare old and new values
        if ($before == $after) {
            $console->info('Old and new values are equal');
        } else {
            $console->error('Old and new values are different');
        }
    });
});

// Render App
// If API URL was called, then we don't go here and App will not be rendered becauseof always_run=false
$app->run();

Output:

Making GET request... 
Response:
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
Decoded JSON:
array (
  'id' => '1',
  'desc_en' => 'this is white cat',
  'desc_it' => 'questo è gatto bianco',
  'desc_fr' => 'c\'est chat blanc',
  'desc_de' => 'Das ist weiße Katze',
)
Set received values in model:
array (
  'id' => '1',
  'desc_en' => 'this is white cat',
  'desc_it' => 'questo è gatto bianco',
  'desc_fr' => 'c\'est chat blanc',
  'desc_de' => 'Das ist weiße Katze',
)
Making POST request...
Response:
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
Making GET request again...
Response:
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
Decoded JSON:
array (
  'id' => '1',
  'desc_en' => 'this is white cat',
  'desc_it' => 'questo è gatto bianco',
  'desc_fr' => 'c\'est chat blanc',
  'desc_de' => 'Das ist weiße Katze',
)

Old and new values are equal