Closed · didac-adria closed this issue 6 months ago
Hey @didac-adria,
It sounds like switching to the Redis ingest was the right decision for the size of your application :+1:
We just tried to run `optimize table pulse_aggregates` on one of our high-traffic applications, and we experienced the same error. This is because the database needs to perform a "recreate and analyze" while new data is still coming into the table with potentially conflicting keys. The optimize process doesn't know how to resolve new conflicts the way we do when upserting data. We were able to solve this by temporarily stopping the `pulse:work` command to prevent new data from entering the table during the optimize process. The Redis ingest will continue to collect a backlog of entries during this time, so no data will be lost. Once the optimize completed successfully, we restarted the `pulse:work` command to store the backlog of entries (which uses an upsert query to resolve conflicts) before resuming normal operation.
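For anyone hitting the same thing, a rough sketch of that sequence might look like this. It assumes the worker runs under Supervisor with the program name `pulse-worker` and that the `mysql` CLI can reach your database — both are assumptions, so adjust to your own process manager and connection setup:

```shell
# 1. Stop the Pulse worker so no new rows reach pulse_aggregates.
#    ("pulse-worker" is an assumed Supervisor program name.)
supervisorctl stop pulse-worker

# 2. Rebuild the table. New entries keep accumulating safely in Redis meanwhile.
mysql your_database -e "OPTIMIZE TABLE pulse_aggregates"

# 3. Restart the worker; it drains the Redis backlog via upsert queries.
supervisorctl start pulse-worker
```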
With regard to the virtual/stored column, we actually followed PlanetScale's approach from https://planetscale.com/blog/generated-hash-columns-in-mysql. It's our understanding that while the column itself is virtual, the computed value is still stored in the index for that column. You are welcome to update your Pulse migration file to use `stored` on that column if you wish, and we'd be curious to know whether you experience any benefits if you decide to do that.
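For an existing table, a hypothetical sketch of that change could look like the following. Note that MySQL cannot switch a generated column from `VIRTUAL` to `STORED` in place, so the column (and the composite index that covers it) would have to be dropped and recreated; the column type, generation expression, and index column order below are all assumptions — copy the exact definitions from your Pulse migration before running anything like this:

```shell
# Hypothetical sketch only: recreate key_hash as a STORED generated column.
# Verify the exact expression and types against your Pulse migration first.
mysql your_database <<'SQL'
ALTER TABLE pulse_aggregates
  DROP INDEX pulse_aggregates_bucket_period_type_aggregate_key_hash_unique,
  DROP COLUMN key_hash;
ALTER TABLE pulse_aggregates
  ADD COLUMN key_hash BINARY(16)
    GENERATED ALWAYS AS (UNHEX(MD5(`key`))) STORED,
  ADD UNIQUE INDEX pulse_aggregates_bucket_period_type_aggregate_key_hash_unique
    (bucket, period, type, aggregate, key_hash);
SQL
```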
Hope that helps!
Hey @jessarcher !
Thank you so much for looking into my issue. Your response was really helpful. I did what you suggested and it worked immediately. Thank you for shedding some light on this.
Pulse Version
beta14
Laravel Version
10.47
PHP Version
8.2
Livewire Version
3.4.8
Database Driver & Version
We run on PlanetScale
Description
When running the `optimize table pulse_aggregates` MySQL command on the table, it fails because of an index, specifically the `pulse_aggregates_bucket_period_type_aggregate_key_hash_unique` index. The error that pops up is: `Duplicate entry '1710328680-360-Threads_connected-count-NULL' for key 'pulse_aggregates.pulse_aggregates_bucket_period_type_aggregate_key_hash_unique'`
Investigating with PlanetScale support, it looks like the issue may come from the fact that the `key_hash` column is virtual instead of stored. This could allow duplicate values to be recorded without the unique index raising its hand to say this is not allowed. Thus, we may have duplicate records, which makes the command fail.
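To confirm whether such duplicates actually exist before optimizing, a query along these lines should surface colliding rows. This is an untested sketch — the grouped columns are simply read off the unique index name, and `your_database` is a placeholder:

```shell
# Group by the columns of the unique index and report any duplicates.
mysql your_database -e "
  SELECT bucket, period, type, aggregate, HEX(key_hash) AS key_hash, COUNT(*) AS n
  FROM pulse_aggregates
  GROUP BY bucket, period, type, aggregate, key_hash
  HAVING n > 1;"
```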
Steps To Reproduce
The full story is that we enabled Pulse with the database ingest, and the database started to grow very fast. When we noticed, we discovered a bug that was preventing Pulse from emptying the DB. That bug was reported here and solved by you guys. Before it was resolved, we needed an urgent workaround: deleting DB records ourselves regularly. Finally, we decided to stop using the database ingest and switched to the Redis ingest, so most of the records in the pulse_aggregates table were deleted. But the database didn't shrink, because the file was already as big as when we had all the records in the table. So now we want to run the optimize command to reclaim the consumed space.
So I guess that if you put some records in the pulse_aggregates table, run a query to delete them, and then start using the Redis ingest, you may be able to reproduce it, but I'm not sure.
I hope that helps.