Closed josephschorr closed 1 year ago
Hi Joseph, thanks for reporting the issue! We were wondering if there was any change stream in the schema when you observed the slowdown?
Yes, on the initial runs. I disabled it on following runs to see if it impacted the performance, and didn't see a major change, but I didn't test thoroughly. If you like, I can try that
Thank you for the timely reply. By "disabling" the change stream in the following runs, could you specify if you created and then dropped the change stream from the schema or you didn't create a change stream? And there was still no significant performance gain after that?
@ChangyuLi28 I simply didn't create the change stream at all. I'll run another test now to see if I can reproduce
Okay, I was incorrect: with change feed never instantiated, it is only ~1.5x slower. I must have deleted it in the previous test
Results:
v1.5.8
--- PASS: TestSpannerDatastore/TestSimple (4.75s)
--- PASS: TestSpannerDatastore/TestSimple/1 (0.11s)
--- PASS: TestSpannerDatastore/TestSimple/2 (0.12s)
--- PASS: TestSpannerDatastore/TestSimple/4 (0.18s)
--- PASS: TestSpannerDatastore/TestSimple/32 (0.37s)
--- PASS: TestSpannerDatastore/TestSimple/256 (3.97s)
v1.5.9 with change feed created:
--- PASS: TestSpannerDatastore/TestSimple (34.58s)
--- PASS: TestSpannerDatastore/TestSimple/1 (0.12s)
--- PASS: TestSpannerDatastore/TestSimple/2 (0.11s)
--- PASS: TestSpannerDatastore/TestSimple/4 (0.16s)
--- PASS: TestSpannerDatastore/TestSimple/32 (0.58s)
--- PASS: TestSpannerDatastore/TestSimple/256 (33.61s)
v1.5.9 with no change feed:
--- PASS: TestSpannerDatastore/TestSimple (6.58s)
--- PASS: TestSpannerDatastore/TestSimple/1 (0.13s)
--- PASS: TestSpannerDatastore/TestSimple/2 (0.13s)
--- PASS: TestSpannerDatastore/TestSimple/4 (0.20s)
--- PASS: TestSpannerDatastore/TestSimple/32 (0.69s)
--- PASS: TestSpannerDatastore/TestSimple/256 (5.43s)
Thanks for the additional information! We are looking into this issue. Another question to help us narrow things down is when you have change streams in schema, what was the size of the data in the insertion transaction?
@ChangyuLi28 In the above test that takes ~30s, we are inserting at least 256 rows in a single transaction, like so:
("test/resource","resource0","reader","test/user","user0","...")
("test/resource","resource1","reader","test/user","user1","...")
("test/resource","resource2","reader","test/user","user2","...")
("test/resource","resource3","reader","test/user","user3","...")
("test/resource","resource4","reader","test/user","user4","...")
("test/resource","resource5","reader","test/user","user5","...")
("test/resource","resource6","reader","test/user","user6","...")
("test/resource","resource7","reader","test/user","user7","...")
... additional similar rows ...
If you'd like to be able to reproduce yourself, checkout the SpiceDB repository and run this command from the internal/datastore/spanner
directory:
go test ./... -tags ci,docker -run TestSpannerDatastore/TestSimple -v
Hi Joseph, an improvement has been released in the most recent version. Thank you for identifying this issue and providing all the context! Let us know if there's any other issues.
Thanks @ChangyuLi28! Seems to work well now
Possibly related to: https://github.com/GoogleCloudPlatform/cloud-spanner-emulator/issues/74#issuecomment-1677972732
SpiceDB (https://github.com/authzed/spicedb) makes use of the emulator for our Spanner datastore tests. We attempted to move to emulator v1.5.9 to enable testing of our use of changefeeds, but ran into immediate CI timeouts after ~10m.
After some investigation, it appears that v1.5.9 of the emulator is quite a bit slower than v1.5.8 for transactions containing somewhat large (~100s) of row insertions; I suspect this is due to the recent code necessary for book keeping for emulation of change feeds.
For reference, a test against v1.5.8 that executes a transaction with 256 rows inserted:
vs against v1.5.9:
I've found that the source of the slowdown is the first insertion transaction, which takes ~30s on average to add the 256 rows.
This is not blocking as we have moved back to v1.5.8 and can always increase the CI timeout, but we figured we'd mention it in case it becomes an issue for others
For reference, our Spanner-calling code can be found at https://github.com/authzed/spicedb/tree/main/internal/datastore/spanner