umanwizard opened this issue 4 years ago
Could we do something like keep a Bloom filter of hashes of all messages received, and warn/increment a metric every time we get a hit? Could we get a reasonable signal-to-noise ratio by taking various unique indexes into account?
You could, but you’d still be consuming memory proportional to each element. I think it’d be simpler to just keep around a set of all the primary keys you’ve seen, in that case.
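The set-of-primary-keys approach can be sketched in a few lines. This is a hypothetical helper, not Materialize code; the function and its arguments are illustrative only:

```python
def find_duplicate_keys(messages, key_fn):
    """Report every message whose primary key was already seen.

    `messages` is any iterable of decoded records; `key_fn` extracts
    the primary key. Memory grows linearly with the number of
    distinct keys, since each key is stored exactly once.
    """
    seen = set()
    duplicates = []
    for msg in messages:
        key = key_fn(msg)
        if key in seen:
            duplicates.append(key)
        else:
            seen.add(key)
    return duplicates

# Two inserts for id=1 indicate an upstream duplicate.
records = [{"id": 1}, {"id": 2}, {"id": 1}]
print(find_duplicate_keys(records, lambda r: r["id"]))  # [1]
```

Unlike a Bloom filter, this gives exact answers (no false positives), at the cost of storing every key verbatim.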
> you’d still be consuming memory proportional to each element

Isn't the proportion logarithmic rather than linear, though?
I'm not an expert on bloom filters, but my understanding is that they use memory (linearly) proportional to the number of elements in the set.
Yeah, in retrospect I don't actually know how that could possibly work out mathematically. Looking it up, they take a pretty small amount of space, but possibly several bits to a byte or two per element to get down to a false-positive rate that is worth having.
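For the record, the standard sizing formula confirms this: an optimally sized Bloom filter uses m = -n·ln(p)/(ln 2)² bits for n elements at false-positive rate p, i.e. linear in n with a small constant. A quick check (illustrative snippet, not part of any proposal here):

```python
import math

def bloom_bits_per_element(false_positive_rate):
    # Optimal Bloom filter size is m = -n * ln(p) / (ln 2)^2 bits,
    # so bits *per element* is -ln(p) / (ln 2)^2: a constant that
    # depends only on the target false-positive rate p, not on n.
    return -math.log(false_positive_rate) / (math.log(2) ** 2)

for p in (0.01, 0.001):
    print(f"p={p}: {bloom_bits_per_element(p):.1f} bits/element")
# p=0.01:  ~9.6 bits/element (about 1.2 bytes)
# p=0.001: ~14.4 bits/element (about 1.8 bytes)
```

So memory is linear, not logarithmic, but the per-element constant is small compared to storing full keys.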
I'm +1 on the idea that we make Materialize good for this. Not that we necessarily automatically do it for folks, but we did eventually largely use MZ to diagnose the problem. There were some false starts, where we used SELECT and so only got point-in-time answers, and where we had to dance around MZ's optimizer wrt primary keys, but I think in principle many of the issues could have been tracked down by someone with a recipe book of MZ queries to issue.
@frankmcsherry IMO this will be much easier to debug with Materialize if/when nested records ever land. Because then your downstream Debezium-flavored view will just be some relational transform over your "avro decode" and "kafka ingest" views, which the user here would have been able to write queries against directly, had those views existed.
Can we introduce something like MySQL's CHECK TABLE, which is a manually triggered way to validate a table? Our equivalent would be a new command like CHECK SOURCE, which reads all data from a source and checks for invalid multiplicities and other issues that would impede view maintenance by Materialize.
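To make the "invalid multiplicities" check concrete, here is a rough sketch of the kind of validation a hypothetical CHECK SOURCE could run. The `(key, diff)` framing and the function name are assumptions for illustration, not an actual design:

```python
from collections import Counter

def check_multiplicities(updates):
    """Validate a stream of (key, diff) update pairs.

    In an envelope like Debezium's, an insert contributes diff = +1
    and a delete contributes diff = -1. If a key's running
    multiplicity ever goes negative, a retraction arrived for a
    record that was never inserted -- exactly the kind of
    inconsistency that breaks incremental view maintenance.
    """
    counts = Counter()
    errors = []
    for key, diff in updates:
        counts[key] += diff
        if counts[key] < 0:
            errors.append(key)
    return errors

# A delete without a matching prior insert is flagged.
print(check_multiplicities([("a", 1), ("a", -1), ("a", -1)]))  # ['a']
```

A real implementation would also have to decide where to snapshot the source and how to report errors back to the user, which this sketch ignores.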
When working with a user who observed incorrectly duplicated records as well as crashes in Materialize, we ultimately discovered that they were suffering from #3026. This involved a few hours of confusion and manual debugging, which is a poor user experience.
Ideally, we would come up with a way to make this process much smoother.