Apparent significant performance degradation in v1.5.9 vs v1.5.8

josephschorr commented 1 year ago

SpiceDB (https://github.com/authzed/spicedb) makes use of the emulator for our Spanner datastore tests. We attempted to move to emulator v1.5.9 to enable testing of our use of changefeeds, but ran into immediate CI timeouts after ~10m.

After some investigation, it appears that v1.5.9 of the emulator is quite a bit slower than v1.5.8 for transactions containing somewhat large (~100s) of row insertions; I suspect this is due to the recent code necessary for book keeping for emulation of change feeds.

For reference, a test against v1.5.8 that executes a transaction with 256 rows inserted:

PASS | SpannerDatastore/TestSimple/256 (4.98s)

vs against v1.5.9:

PASS | SpannerDatastore/TestSimple/256 (35.51s)

I've found that the source of the slowdown is the first insertion transaction, which takes ~30s on average to add the 256 rows.

This is not blocking as we have moved back to v1.5.8 and can always increase the CI timeout, but we figured we'd mention it in case it becomes an issue for others

For reference, our Spanner-calling code can be found at https://github.com/authzed/spicedb/tree/main/internal/datastore/spanner

ChangyuLi28 commented 1 year ago

Hi Joseph, thanks for reporting the issue! We were wondering if there was any change stream in the schema when you observed the slowdown?

josephschorr commented 1 year ago

Yes, on the initial runs. I disabled it on following runs to see if it impacted the performance, and didn't see a major change, but I didn't test thoroughly. If you like, I can try that

ChangyuLi28 commented 1 year ago

Thank you for the timely reply. By "disabling" the change stream in the following runs, could you specify if you created and then dropped the change stream from the schema or you didn't create a change stream? And there was still no significant performance gain after that?

josephschorr commented 1 year ago

@ChangyuLi28 I simply didn't create the change stream at all. I'll run another test now to see if I can reproduce

josephschorr commented 1 year ago

Okay, I was incorrect: with change feed never instantiated, it is only ~1.5x slower. I must have deleted it in the previous test

Results:

v1.5.8
    --- PASS: TestSpannerDatastore/TestSimple (4.75s)
        --- PASS: TestSpannerDatastore/TestSimple/1 (0.11s)
        --- PASS: TestSpannerDatastore/TestSimple/2 (0.12s)
        --- PASS: TestSpannerDatastore/TestSimple/4 (0.18s)
        --- PASS: TestSpannerDatastore/TestSimple/32 (0.37s)
        --- PASS: TestSpannerDatastore/TestSimple/256 (3.97s)

v1.5.9 with change feed created:
    --- PASS: TestSpannerDatastore/TestSimple (34.58s)
        --- PASS: TestSpannerDatastore/TestSimple/1 (0.12s)
        --- PASS: TestSpannerDatastore/TestSimple/2 (0.11s)
        --- PASS: TestSpannerDatastore/TestSimple/4 (0.16s)
        --- PASS: TestSpannerDatastore/TestSimple/32 (0.58s)
        --- PASS: TestSpannerDatastore/TestSimple/256 (33.61s)

v1.5.9 with no change feed:
    --- PASS: TestSpannerDatastore/TestSimple (6.58s)
        --- PASS: TestSpannerDatastore/TestSimple/1 (0.13s)
        --- PASS: TestSpannerDatastore/TestSimple/2 (0.13s)
        --- PASS: TestSpannerDatastore/TestSimple/4 (0.20s)
        --- PASS: TestSpannerDatastore/TestSimple/32 (0.69s)
        --- PASS: TestSpannerDatastore/TestSimple/256 (5.43s)

ChangyuLi28 commented 1 year ago

Thanks for the additional information! We are looking into this issue. Another question to help us narrow things down is when you have change streams in schema, what was the size of the data in the insertion transaction?

josephschorr commented 1 year ago

@ChangyuLi28 In the above test that takes ~30s, we are inserting at least 256 rows in a single transaction, like so:

("test/resource","resource0","reader","test/user","user0","...")
("test/resource","resource1","reader","test/user","user1","...")
("test/resource","resource2","reader","test/user","user2","...")
("test/resource","resource3","reader","test/user","user3","...")
("test/resource","resource4","reader","test/user","user4","...")
("test/resource","resource5","reader","test/user","user5","...")
("test/resource","resource6","reader","test/user","user6","...")
("test/resource","resource7","reader","test/user","user7","...")
... additional similar rows ...

If you'd like to be able to reproduce yourself, checkout the SpiceDB repository and run this command from the internal/datastore/spanner directory:

go test ./... -tags ci,docker -run TestSpannerDatastore/TestSimple -v

ChangyuLi28 commented 1 year ago

Hi Joseph, an improvement has been released in the most recent version. Thank you for identifying this issue and providing all the context! Let us know if there's any other issues.

josephschorr commented 1 year ago

Thanks @ChangyuLi28! Seems to work well now

GoogleCloudPlatform / cloud-spanner-emulator

Apparent significant performance degradation in v1.5.9 vs v1.5.8 #128