Closed: ptrsmrn closed this issue 6 months ago
For the purpose of reproducing issues, it would also be good to be able to replay the queries from the log.
For that I believe that if the log were of CQL commands (with `USING TIMESTAMP`), we could just use cqlsh to replay them.
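A minimal sketch of that idea, assuming a hypothetical log format of `(timestamp, statement)` pairs and plain `INSERT` statements (for `INSERT`, the `USING TIMESTAMP` clause follows the `VALUES` clause in CQL):

```python
# Sketch: turn a (timestamp, statement) query log into a cqlsh-replayable
# script, pinning each write to its original write time via USING TIMESTAMP.
# The log format here is hypothetical; gemini does not currently emit it.

def to_replay_cql(log_records):
    """log_records: iterable of (microsecond_timestamp, cql_statement)."""
    lines = []
    for ts, stmt in sorted(log_records, key=lambda r: r[0]):
        stmt = stmt.rstrip().rstrip(";")
        # For INSERT, the USING TIMESTAMP clause goes after the VALUES clause.
        lines.append(f"{stmt} USING TIMESTAMP {ts};")
    return "\n".join(lines)

log = [
    (1700000000000001, "INSERT INTO ks.t (pk, ck, v) VALUES (1, 0, 'a');"),
    (1700000000000000, "INSERT INTO ks.t (pk, ck, v) VALUES (1, 1, 'b')"),
]
print(to_replay_cql(log))
```

The output file could then be fed to `cqlsh -f replay.cql`, so that replayed writes keep their original timestamps instead of being treated as newer than the data already in the cluster.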
@ptrsmrn, @bhalevy, I have a different perspective on the issue, let's discuss it:
- The issue is 100% valid and I have been thinking about how to solve it for quite a long time.
- I don't think logging SQL statements is the best solution, since it is going to produce tons of data, most of which is not useful. Say a test runs at a modest 500 rq/s: that generates 1,800,000 records over an hour, and the smallest SCT test is a 4h test, so you are going to get 7,200,000 records, which in the very best case translates to 196 MB of data, which is hard to store, retrieve and analyze.
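The arithmetic above can be checked quickly (the per-record size is implied by the 196 MB figure, not stated in the thread):

```python
# Back-of-the-envelope check of the log-volume estimate above.
rate = 500                     # requests per second
per_hour = rate * 3600         # records per hour
total = per_hour * 4           # smallest SCT test is 4 hours
implied_bytes = 196e6 / total  # record size implied by the 196 MB figure
print(per_hour, total, round(implied_bytes, 1))  # 1800000 7200000 27.2
```

So the "very best case" assumes roughly 27 bytes per logged statement; real CQL statements are usually much longer, which is the point of the size concern.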
196 MB is not hard to store.
As for analyzing the log, leave it to the engineer who's working on the issue. One can use other tools like grep or write scripts to automate parts of the analysis. But we need the raw data to start with. Hiding stuff based on what we know today doesn't help, since you will need that data for analyzing future problems that are as yet unknown, so I would suggest to "expect the unexpected".
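The kind of throwaway analysis script alluded to here might look like this (the log line format is hypothetical, and `lines_for_pk` is an illustrative helper, not an existing tool):

```python
import re

# Sketch: pull every logged statement touching a given partition key
# out of a raw query log, grep-style. The log format is made up.
LOG = """\
2024-01-01T00:00:00 INSERT INTO ks.t (pk, ck, v) VALUES (1, 0, 'a')
2024-01-01T00:00:01 INSERT INTO ks.t (pk, ck, v) VALUES (2, 0, 'b')
2024-01-01T00:00:02 DELETE FROM ks.t WHERE pk = 1
"""

def lines_for_pk(log_text, pk):
    # Match either an INSERT whose first VALUES column is pk,
    # or a WHERE clause restricting on that pk.
    pat = re.compile(rf"VALUES \({pk},|pk = {pk}\b")
    return [line for line in log_text.splitlines() if pat.search(line)]

for line in lines_for_pk(LOG, 1):
    print(line)
```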
Instead of logging (or in parallel to it) I would suggest the following:
- Every query is remembered by gemini (keeping them in memory is not an option, we can go with sqlite), in such a manner that it can be found by pk/ck
- Test-wide queries, like schema changes, are stored separately
- When an error happens, gemini pulls all the queries that were executed for the given pk/ck together with the test-wide queries and logs/prints them
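A minimal sketch of that scheme using Python's `sqlite3` (the table names and the error-dump function are illustrative, not gemini's actual internals):

```python
import sqlite3

# Sketch: remember every query keyed by pk/ck; keep test-wide queries
# (schema changes etc.) separately; on error, dump both for the failing key.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE queries (pk TEXT, ck TEXT, stmt TEXT)")
db.execute("CREATE TABLE test_wide (stmt TEXT)")

def remember(pk, ck, stmt):
    db.execute("INSERT INTO queries VALUES (?, ?, ?)", (pk, ck, stmt))

def remember_test_wide(stmt):
    db.execute("INSERT INTO test_wide VALUES (?)", (stmt,))

def dump_for_error(pk, ck):
    """Everything relevant to a failing pk/ck: test-wide statements first,
    then the per-key history, both in insertion order (rowid)."""
    wide = [r[0] for r in db.execute("SELECT stmt FROM test_wide ORDER BY rowid")]
    keyed = [r[0] for r in db.execute(
        "SELECT stmt FROM queries WHERE pk = ? AND ck = ? ORDER BY rowid",
        (pk, ck))]
    return wide + keyed

remember_test_wide("ALTER TABLE ks.t ADD col int")
remember("1", "2", "INSERT INTO ks.t (pk, ck) VALUES (1, 2)")
remember("9", "9", "INSERT INTO ks.t (pk, ck) VALUES (9, 9)")
print(dump_for_error("1", "2"))
```

The payoff is that a failure report stays tiny: only the handful of statements that could have affected the failing row, plus the schema changes, ever reach the engineer.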
The reproducibility problem should be addressed by setting seeds. Unfortunately that does not work properly yet, see #370, but once that issue is fixed you should be able to reproduce the exact CQL flow by just setting `--schema-seed` and `--seed`.
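The property the seeds are meant to guarantee can be stated in one line: the same seed must yield the same statement stream. A toy illustration (gemini itself is written in Go; `generate_ops` here is purely hypothetical):

```python
import random

def generate_ops(seed, n=5):
    """Deterministically derive a sequence of (op, pk) pairs from a seed --
    the property that --seed / --schema-seed are meant to guarantee."""
    rng = random.Random(seed)
    return [(rng.choice(["insert", "update", "delete"]), rng.randrange(100))
            for _ in range(n)]

# Same seed, same run: reproducing a failure is rerunning with the same seed.
assert generate_ops(42) == generate_ops(42)
print(generate_ops(42))
```

This is also why #370 matters: if any part of the generator draws from an unseeded or shared randomness source, the determinism above silently breaks.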
@dkropachev Saving logs only for failed tests makes sense, given their size. How the gemini framework stores them internally, in a file or in sqlite, is not a concern for devs as long as they don't need to query sqlite themselves, so gemini logging them to the screen/a file or anything alike, as you suggested, is IMO good - essentially I'd just make these extra logs easily available for devs, so we don't hear further complaints ;) A note regarding the file size: best not to produce files so big that they are hard to open or are bloated with "spam", because that would make the debugging process harder - but that's just wishful thinking, as I understand each test produces tons of queries.
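One way to reconcile "save only on failure" with bounded file size is a ring buffer that is flushed only when a test fails. A sketch (the size limit is an assumed knob, not an existing gemini option):

```python
from collections import deque

class FailureLog:
    """Keep only the most recent N statements in memory; write them out
    only when a test fails, so passing runs produce no log files at all."""
    def __init__(self, limit=10000):
        self.buf = deque(maxlen=limit)  # old entries are evicted automatically

    def record(self, stmt):
        self.buf.append(stmt)

    def flush_on_failure(self):
        return list(self.buf)

failure_log = FailureLog(limit=3)
for i in range(5):
    failure_log.record(f"stmt-{i}")
print(failure_log.flush_on_failure())  # only the last 3 survive
```

The trade-off is losing history older than the window; combined with the per-pk/ck store suggested above, the window only needs to cover the statements since the last consistency check.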
Given:
- https://github.com/scylladb/scylla-enterprise/issues/3642#issuecomment-1849642826
- https://github.com/scylladb/scylla-enterprise/issues/3642#issuecomment-1838697317
- https://github.com/scylladb/scylla-enterprise/issues/3642#issuecomment-1838820807

can we improve logging to the point that the cited issue and similar issues are (easily) debuggable? That would include:
cc @mykaul @nuivall