dmaicher / doctrine-test-bundle

Symfony bundle to isolate your app's doctrine database tests and improve the test performance
MIT License
1.08k stars 61 forks source link

Consistency issues with different connections (eg read/write replicas) #289

Closed Tristan971 closed 5 months ago

Tristan971 commented 7 months ago

The issue is similar to #284, but easier to catch in a more common setup, like so (Symfony 6.4):

doctrine:
  dbal:
    default_connection: default
    connections:
      default:
        url: '%env(resolve:DATABASE_URL)%'
        ...
        replicas:
          ro:
            url: '%env(resolve:DATABASE_RO_URL)%'

In this case, you might end up with invisible writes if they're stuck within a transaction on the default connection, but you're doing a read with the readonly version (default.ro is its name with this config)

Indeed, this yields 2 different static connections, since the hash in https://github.com/dmaicher/doctrine-test-bundle/blob/f10de294e41570d027a301554a609c394d40e669/src/DAMA/DoctrineTestBundle/Doctrine/DBAL/StaticDriver.php#L31 will have different values for each.

eg:

image

image

This isn't surprising, to be fair, and I don't think the project can do much for it, but hopefully this saves someone some time.

The solution being to make sure your tests use a single connection per database (at least within contexts where you expect to see writes done by tests). Unfortunately Symfony doesn't make this easy.

In principle, you could add config/test/doctrine.yaml with:

doctrine:
  dbal:
    connections:
      default:
        use_savepoints: true
        replicas: ~

But in practice, during configuration parsing, Symfony maps ~ to [] and then performs an array merge instead of replacement, and thus your default values with a replica remain active...

So you pretty much have to remove replicas from your default doctrine.yaml and configure it explicitly for all your environments that don't rely on this bundle.

I'm not entirely sure how this can be worked around, to be honest.

About the only way to work around it that I can see is to have the StaticDriver somehow force Doctrine to return a connection to the primary specifically, maybe checking for when $connection is instanceof \Doctrine\DBAL\Connections\PrimaryReadReplicaConnection and calling ensureConnectedToPrimary.

dmaicher commented 7 months ago

Indeed this seems tricky to handle for this bundle :confused:

The StaticDriver is operating on a lower level and does not know anything about PrimaryReadReplicaConnection.

Possibly one could decorate Doctrine\Bundle\DoctrineBundle\ConnectionFactory::connect and call ensureConnectedToPrimary :thinking:

Or we find a way of ensuring the same connection hash is used for primary and the read replica inside StaticDriver.

For now I would also suggest to configure a connection without replicas for the test environment then.

dmaicher commented 6 months ago

@Tristan971 @prohalexey could you maybe test https://github.com/dmaicher/doctrine-test-bundle/compare/connection_key?expand=1 ?

Its some quick proof of concept to allow forcing usage of a specific key for keeping the connection in the static array.

Not sure yet how to expose that (via config maybe) but for this PoC I quickly enabled defining parameters for that.

See https://github.com/dmaicher/doctrine-test-bundle/compare/connection_key?expand=1#diff-2582698d058918d461f16ca8976229516f88622d0e77f46db9e8141394d1f176R29-R31

This also allows to define/overwrite the key that is used per replica.

So for testing this PoC with a config like this

doctrine:
  dbal:
    default_connection: default
    connections:
      default:
        url: '%env(resolve:DATABASE_URL)%'
        ...
        replicas:
          ro:
            url: '%env(resolve:DATABASE_RO_URL)%'

you would need to define Symfony parameters to force using the same key for both

parameters:
    dama.connection_key.default: some_custom_key
    dama.connection_key.default.ro: some_custom_key

WDYT? Would this solve your problem?

prohalexey commented 6 months ago

@dmaicher I think yes, if i could set the same keys for all connections.

BTW json_encode does not garauntee the order of elements in the array, so even if connections have the same connection params, they may produce different sha1 hashes(due to sorting). By default, you have to ksort connection params before json_encode.

dmaicher commented 6 months ago

@prohalexey

BTW json_encode does not garauntee the order of elements in the array,

Do you have a source for that? I tested it and at least on PHP 8.3 this always generates exactly the same json.

prohalexey commented 6 months ago

Do you have a source for that?

The first part is https://www.rfc-editor.org/rfc/rfc7159.html#section-1

An object is an unordered collection of zero or more name/value pairs, where a name is a string and a value is a string, number, boolean, null, object, or array.

The second part is that keys in php array could be in the different order, but with the same values(values loaded from configuration), so it will produce the different string.

dmaicher commented 6 months ago

The first part is https://www.rfc-editor.org/rfc/rfc7159.html#section-1

Ok but that is completely unrelated to the php implementation of json_encode. I'm pretty certain it will deterministically produce the same output string for the same input array.

The second part is that keys in php array could be in the different order, but with the same values(values loaded from configuration), so it will produce the different string.

How should this happen though? Also certain here that the order of keys will not change really. Its build deterministically from the Bundle configuration.

Anyway this is off-topic. I don't see an issue with the implementation. If you encounter any problems with changing hashes then please open a new issue.

dmaicher commented 6 months ago

@Tristan971 @prohalexey do you think it would make sense to always (by default) use the same underlying driver connection for the primary and the read replicas? I can't really think of a scenario where you would need different connections and this would result in having those read inconsistencies due to non-committed transactions... :thinking:

Tristan971 commented 6 months ago

Well, there's a reasonable argument that if someone is testing that something uses the primary/secondary specifically, then this side-effect/"bug" was actually a feature.

But in my opinion that's not very likely, so as long as it's documented in release notes then someone writing tests that are this specific will figure it out.

zilionis commented 5 months ago

I just tested this solution, with connection keys. Looks good for me as well. Waiting for release :)

dmaicher commented 5 months ago

@zilionis what do you think about https://github.com/dmaicher/doctrine-test-bundle/issues/289#issuecomment-2097775046? Any opinion on using the same key by default for primary and read replicas?

dmaicher commented 5 months ago

You can find a WIP PR here that suggests to make this configurable and to use the same key for primary and replicas by default.

Maybe you could give it a try.

So for a DBAL config like

doctrine:
  dbal:
    default_connection: default
    connections:
      default:
        url: '%env(resolve:DATABASE_URL)%'
        ...
        replicas:
          ro:
            url: '%env(resolve:DATABASE_RO_URL)%'

by default the connection key default would be used for the primary and read replica.

If needed this could be changed like

dama_doctrine_test:
  connection_keys:
    default: custom_key #similar to default config this would use "custom_key" as key for primary and replicas
dama_doctrine_test:
  connection_keys:
    default: 
      primary: custom_key_primary
      replicas:
        ro: custom_key_read_only_replica

Does this make sense to you?

zilionis commented 5 months ago

@dmaicher Hey, I tested PR, for me looks good. I am waiting for release :)

dmaicher commented 5 months ago

released: https://github.com/dmaicher/doctrine-test-bundle/releases/tag/v8.1.0

Thanks for the input :+1: