staabm / phpstan-dba

PHPStan based SQL static analysis and type inference for the database access layer
https://staabm.github.io/archive.html#phpstan-dba
MIT License
258 stars 17 forks source link

phpstan-dba not picking up array shape on doctrine/dbal result #394

Closed Brammm closed 1 year ago

Brammm commented 2 years ago

I ran into a PHPStan error on the following code: https://phpstan.org/r/0a244b72-7cea-4746-8d69-26b17871fd62

After installing this package, I was still getting the same error, but noticed the cache file also wasn't being generated, so I figured something must not be getting recognized. After poking around a little bit, I changed my fetchData method to use the connection to immediately fetch:

private function fetchData(AccessTokenEntityInterface $accessTokenEntity): array
{
    $data = $this->connection->fetchAssociative(
        <<<SQL
        SELECT
            u.id,
            u.name,
            u.email,
            u.timezone,
            u.language,
            u.company_id
        FROM user u
        WHERE u.email = :email
        SQL,
        [
            'email' => 'john@example.com', // in actual code, this is calling a method on $accessTokenEntity
        ]
    );

    if (! $data) {
        throw new UserNotFoundException();
    }

    return $data;
}

Now I did notice the cache file got generated, but I'm still getting the original error.

<?php return array (
  'schemaVersion' => 'v9-put-null-when-valid',
  'schemaHash' => '56eb246db59f605bc0f4aa5c9242e8bb',
  'records' => 
  array (
    'SELECT
    u.id,
    u.name,
    u.email,
    u.timezone,
    u.language,
    u.company_id
FROM user u
WHERE u.email = \'1970-01-01\'' => 
    array (
      'error' => NULL,
    ),
  ),
  'runtimeConfig' => 
  array (
    'errorMode' => 'default',
    'debugMode' => false,
    'stringifyTypes' => false,
  ),
);

I'm guessing this package does not get triggered when fetching from the Statement result?

Secondly, am I doing something wrong that my DB isn't getting properly analyzed? I find it a little strange that a date is being used as the example email.

staabm commented 2 years ago

thanks for the report.

since you are running phpstan-dba with a custom api it seems (not PDO, Doctrine, mysqli or similar), you need to configure the rules for your API, see https://github.com/staabm/phpstan-dba/blob/main/docs/rules.md

Secondly, am I doing something wrong that my DB isn't getting properly analyzed? I find it a little strange that a date is being used as the example email.

how is your database schema looking for the table involved?

Brammm commented 2 years ago

Just to confirm, I am using doctrine/dbal.

This is my bootstrap file:

<?php

use staabm\PHPStanDba\DbSchema\SchemaHasherMysql;
use staabm\PHPStanDba\QueryReflection\PdoMysqlQueryReflector;
use staabm\PHPStanDba\QueryReflection\RuntimeConfiguration;
use staabm\PHPStanDba\QueryReflection\QueryReflection;
use staabm\PHPStanDba\QueryReflection\ReplayAndRecordingQueryReflector;
use staabm\PHPStanDba\QueryReflection\ReflectionCache;

require_once __DIR__ . '/vendor/autoload.php';

$params = parse_url(getenv('DB_URL_LOGIN'));
$dsn    = sprintf('mysql:dbname=%s;host=%s', ltrim($params['path'], '/'), $params['host']);
$pdo    = new PDO($dsn, $params['user'], $params['pass']);

QueryReflection::setupReflector(
    new ReplayAndRecordingQueryReflector(
        ReflectionCache::create(
            __DIR__.'/.phpstan-dba.cache'
        ),
        new PdoMysqlQueryReflector($pdo),
        new SchemaHasherMysql($pdo)
    ),
    new RuntimeConfiguration()
);

The connection property in the earlier code example is a doctrine/dbal connection.

DB schema is

CREATE TABLE `user` (
  `id` char(36) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL COMMENT '(DC2Type:uuid)',
  `company_id` char(36) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT '(DC2Type:uuid)',
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `email` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `timezone` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `language` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `require_sso` tinyint(1) NOT NULL,
  `require_two_factor` tinyint(1) NOT NULL,
  `two_factor_code` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `password` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `password_never_expires` tinyint(1) NOT NULL,
  `password_edited_on` datetime DEFAULT NULL COMMENT '(DC2Type:datetime_immutable)',
  `last_login` datetime DEFAULT NULL COMMENT '(DC2Type:datetime_immutable)',
  `super_admin` tinyint(1) NOT NULL,
  `disabled_at` datetime DEFAULT NULL COMMENT '(DC2Type:datetime_immutable)',
  `deleted_at` datetime DEFAULT NULL COMMENT '(DC2Type:datetime_immutable)',
  `active_session` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `UNIQ_8D93D649E7927C74` (`email`),
  KEY `IDX_8D93D649979B1AD6` (`company_id`),
  CONSTRAINT `FK_8D93D649979B1AD6` FOREIGN KEY (`company_id`) REFERENCES `company` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

I got recommended this package by @ondrejmirtes, but maybe I was expecting something it's not meant to do. My main issue was that I was code hinting the shape of the array returned by fetchData:

    /**
     * @return array{
     *     id: string,
     *     name: string,
     *     email:string,
     *     timezone: string,
     *     language: string,
     *     company_id: string,
     *     roles: string
     * }
     */
    private function fetchData(AccessTokenEntityInterface $accessTokenEntity): array;

My hope was that this package would infer that shape from the SQL query and get rid of the "nah, $data ain't that shape, it's non-empty-array<string, mixed>" (which is the typehint from fetchAssociative).

staabm commented 2 years ago

My hope was that this package would infer that shape from the SQL query and get rid of the "nah, $data ain't that shape, it's non-empty-array<string, mixed>" (which is the typehint from fetchAssociative).

thats exactly the use-case and should work for doctrine DBAL and mysql.

you could try enabling debugMode to see whether there are some other problems

when your objects are properly typed to the doctrine/dbal interfaces it should work. see this unit test for example. fetchAssociative is also supported.

maybe you can bootstrap a small example reproducing repo, so I can have a look what is missing exactly in your case.

staabm commented 2 years ago

just did a bit more research... even runnig executeQuery from a Statement should work, see unit tests on that case.

if you can provide a repository with reproducible minimal failling test-case I would love to fiddle it out.

noemi-salaun commented 2 years ago

Hello, I get the same error with PHPStan 1.8.8 and phpstan-dba 0.2.48.

/**
 * @return array{id: int, name: string}
 */
private function fetchPlayer(): array
{
    $player = $this->connection->fetchAssociative('SELECT id, name FROM players WHERE id = ?', [1]);
    if (!$player) {
        throw new \RuntimeException('Player not found');
    }
    return $player;
}

Method MyClass::fetchPlayer() should return array{id: int, name: string} but returns non-empty-array<string, mixed>.

See cached result below ```php [ 'SELECT id, name FROM players WHERE id = \'1\'' => [ 'result' => [ 5 => PHPStan\Type\Constant\ConstantArrayType::__set_state( [ 'keyType' => PHPStan\Type\UnionType::__set_state( [ 'sortedTypes' => true, 'types' => [ 0 => PHPStan\Type\Constant\ConstantIntegerType::__set_state( [ 'value' => 0, ] ), 1 => PHPStan\Type\Constant\ConstantIntegerType::__set_state( [ 'value' => 1, ] ), 2 => PHPStan\Type\Constant\ConstantStringType::__set_state( [ 'objectType' => null, 'value' => 'id', 'isClassString' => false, ] ), 3 => PHPStan\Type\Constant\ConstantStringType::__set_state( [ 'objectType' => null, 'value' => 'name', 'isClassString' => false, ] ), ], ] ), 'itemType' => PHPStan\Type\UnionType::__set_state( [ 'sortedTypes' => false, 'types' => [ 0 => PHPStan\Type\IntegerRangeType::__set_state( [ 'min' => 0, 'max' => 4294967295, ] ), 1 => PHPStan\Type\StringType::__set_state( [ ] ), ], ] ), 'allArrays' => null, 'nextAutoIndexes' => [ 0 => 2, ], 'keyTypes' => [ 0 => PHPStan\Type\Constant\ConstantStringType::__set_state([ 'objectType' => null, 'value' => 'id', 'isClassString' => false, ]), 1 => PHPStan\Type\Constant\ConstantIntegerType::__set_state([ 'value' => 0, ]), 2 => PHPStan\Type\Constant\ConstantStringType::__set_state([ 'objectType' => null, 'value' => 'name', 'isClassString' => false, ]), 3 => PHPStan\Type\Constant\ConstantIntegerType::__set_state([ 'value' => 1, ]), ], 'valueTypes' => [ 0 => PHPStan\Type\IntegerRangeType::__set_state([ 'min' => 0, 'max' => 4294967295, ]), 1 => PHPStan\Type\IntegerRangeType::__set_state([ 'min' => 0, 'max' => 4294967295, ]), 2 => PHPStan\Type\StringType::__set_state([]), 3 => PHPStan\Type\StringType::__set_state([]), ], 'optionalKeys' => [], ] ), ], ], ]; ```
staabm commented 2 years ago

thank you. if it is reproducible, could you provide a failling test-case?

noemi-salaun commented 2 years ago

I'm not sure I fully understand how it works. Depending on in what class I put my query, sometime I got all the informations in the cache (like the result I put in my previous comment) and sometime it only contains

[
    'SELECT * FROM conditions WHERE id = \'1\'' => 
    array (
      'error' => NULL,
    )
]

Is it because my table does not contains data for ID 1, so the result is empty?

noemi-salaun commented 2 years ago

In some case, I think the wrong placeholder is used like here

 if (null !== $sequence['conditionId']) {
            $condition = $this->dbal->fetchAssociative(
                'SELECT * FROM conditions WHERE id = ?',
                [$sequence['conditionId']]
            );
}

In the cache, I get

    'SELECT * FROM conditions WHERE id = NULL' => 
    array (
      'error' => NULL,
    ),

but its not accurate, conditionId cannot be null inside the if

staabm commented 2 years ago

whats exact the type of $sequence in your example before the IF? you can use \PHPStan\dumpType($sequence); to make phpstan report it

whats exact the type of $condition after fetchAssociative?

staabm commented 2 years ago

@noemi-salaun

I tried reproducing your case in https://github.com/staabm/phpstan-dba/pull/446

maybe you can fiddle with it, until it looks more like your real world case (and actually fails).

staabm commented 2 years ago

@Brammm inspired by PR #446 could you try to send a reproducing PR for your initial problem?

noemi-salaun commented 2 years ago

@staabm

$sequence is the result of a previous query with the column id being an unsigned int autoincremented primary key and the column conditionId being a nullable unsigned int

public function cloneSequence(int $sequenceId): int
{
    $sequence = $this->dbal->fetchAssociative('SELECT * FROM dialog_sequences WHERE id = ?', [$sequenceId]);
    // ...
}

Dumped type: array<string, mixed>

I tried narrowing the type of the parameter. It seems that the mixed type causes the issue.

    private function foo(mixed $conditionId): void
    {
        if (null !== $conditionId) {
            $condition = $this->dbal->fetchAssociative(
                'SELECT id, argument3 FROM conditions WHERE id = ?',
                [$conditionId]
            );
        }
        // ...
   }

In the phpstan-dba.cache I got

<? [
    'SELECT id, argument3 FROM conditions WHERE id = NULL' => 
    array (
      'error' => NULL,
    ),
]
staabm commented 2 years ago

even with mixed its not reproducible for me: https://github.com/staabm/phpstan-dba/pull/446

could you try to build a repo with a minimal reproducer?

noemi-salaun commented 2 years ago

I try editing your test case, but it's marked as skipped by phpunit and I cant find what is causing the skip You can try on your side with this change :

    public function bug394(mixed $conditionId)
    {
        if ($conditionId !== null) {
            $query = 'SELECT email, adaid FROM ada WHERE adaid = ?';
            $fetchResult = $this->conn->fetchAssociative($query, [$conditionId]);
            assertType('array{email: string, adaid: int<-32768, 32767>}|false', $fetchResult);
        }
    }
noemi-salaun commented 1 year ago

@staabm Here you go https://github.com/noemi-salaun/phpstan-dba-bug-394

The issue seems to come from ReplayAndRecordingQueryReflector. The cache generated is wrong with this reflector. The cache generated with RecordingQueryReflector is ok.

BONUS : You can switch the type of the parameter $id of method fetch1 or fetch2 from int to mixed and see the cache being like this (notice the id = NULL instead of id = '1') :

'SELECT id, name, \'request2\' FROM my_table WHERE id = NULL'
noemi-salaun commented 1 year ago

In ReplayAndRecordingQueryReflector::getResultType() the process never go into the catch(CacheNotPopulatedException) because the ReplayQueryReflector::getResultType() checks if the cache has the result before trying to access it. We can see this comment // queries with errors don't have a cached result type. But in my case, the cache does not have my result, not because of error, but because its the first run and the cache does not exist yet.

I wanted to make a PR, but I can't get XDebug to work with PHPStan. I tried with --xdebug and everything, but it does nothing at all. The breakpoints works in the bootstrap, but do not works inside PHPStan analyse() method. I can't even var_dump variables. All I managed to do is file_put_content to succeed in having an output :cry: So with this very limited power, I'd rather let you do the fix :rofl:

staabm commented 1 year ago

thx for the investigation. I am fine with dropping ReflectionCache->hasResultType as long as we have a unit test for the problem you are describing

staabm commented 1 year ago

I think this one was fixed with recent releases. can you confirm?

noemi-salaun commented 1 year ago

I just tried it and it looks good

staabm commented 1 year ago

thx