MaterializeInc / materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
https://materialize.com

0dt/m1: make envd startup fast enough to be imperceptible #28722

Open jkosh44 opened 1 month ago

jkosh44 commented 1 month ago

Feature request

During an upgrade, all time spent on startup contributes directly to downtime. We need to reduce startup times to reduce downtime.

### Work items
- [ ] https://github.com/MaterializeInc/cloud/issues/10024
- [ ] https://github.com/MaterializeInc/materialize/issues/29357
- [ ] https://github.com/MaterializeInc/materialize/pull/29226
- [ ] https://github.com/MaterializeInc/materialize/pull/29302
- [ ] https://github.com/MaterializeInc/materialize/pull/29378
- [ ] https://github.com/MaterializeInc/materialize/pull/29381
- [ ] https://github.com/MaterializeInc/materialize/pull/29387
- [ ] https://github.com/MaterializeInc/materialize/pull/29403
- [ ] https://github.com/MaterializeInc/materialize/pull/29081
- [ ] https://github.com/MaterializeInc/materialize/pull/29127
- [ ] https://github.com/MaterializeInc/materialize/pull/29210
- [ ] https://github.com/MaterializeInc/materialize/pull/28208
- [ ] https://github.com/MaterializeInc/materialize/pull/28612
- [ ] https://github.com/MaterializeInc/materialize/pull/28868
- [ ] https://github.com/MaterializeInc/materialize/pull/28978
- [ ] https://github.com/MaterializeInc/materialize/pull/28981
- [ ] https://github.com/MaterializeInc/materialize/pull/28984
- [ ] https://github.com/MaterializeInc/materialize/pull/29410
jkosh44 commented 1 month ago

Here are the logs from the staging env that took the longest to start up, at 38 seconds (old format): staging_logs.txt

Each log line appears twice because the first set is from read-only mode and the second set is from read-write mode.

The prod sandbox started up in 6 seconds, so it's probably less interesting than the attached staging logs.

jkosh44 commented 1 month ago

The following hacky python snippet will sort the logs by time (you may have to play around with the `sep` value and the arg to `main`):

```python
import re
from typing import Optional
from dateutil.parser import parse

order = [
    r'startup: envd init: beginning',
    r'startup: envd init: preamble beginning',
    r'startup: envd init: preamble complete in',
    r'startup: envd init: serving beginning',
    r'startup: envd serve: beginning',
    r'startup: envd serve: preamble beginning',
    r'startup: envd serve: preamble complete in',
    r'startup: envd serve: catalog init beginning',
    r'startup: envd serve: catalog init complete in',
    r'startup: envd serve: system parameter sync beginning',
    r'startup: envd serve: system parameter sync complete in',
    r'startup: envd serve: preflight checks beginning',
    r'startup: envd serve: preflight checks complete in',
    r'startup: envd serve: durable catalog open beginning',
    r'startup: envd serve: durable catalog open complete in',
    r'startup: envd serve: coordinator init beginning',
    r'startup: coordinator init: beginning',
    r'startup: coordinator init: preamble beginning',
    r'startup: coordinator init: preamble complete in',
    r'startup: coordinator init: timestamp oracle init beginning',
    r'startup: coordinator init: timestamp oracle init complete in',
    r'startup: coordinator init: catalog open beginning',
    r'startup: coordinator init: catalog open complete in',
    r'startup: coordinator init: coordinator thread start beginning',
    r'startup: controller init: beginning',
    r'startup: controller init: complete in',
    r'startup: coordinator init: bootstrap beginning',
    r'startup: coordinator init: bootstrap: preamble beginning',
    r'startup: coordinator init: bootstrap: preamble complete in',
    r'startup: coordinator init: bootstrap: storage collections init beginning',
    r'startup: coordinator init: bootstrap: storage collections init complete in',
    r'startup: coordinator init: bootstrap: optimize dataflow plans beginning',
    r'startup: coordinator init: bootstrap: optimize dataflow plans complete in',
    r'startup: coordinator init: bootstrap: dataflow as-of bootstrapping beginning',
    r'startup: coordinator init: bootstrap: dataflow as-of bootstrapping complete in',
    r'startup: coordinator init: bootstrap: postamble beginning',
    r'startup: coordinator init: bootstrap: postamble complete in',
    r'startup: coordinator init: bootstrap: generate builtin updates beginning',
    r'startup: coordinator init: bootstrap: generate builtin updates complete in',
    r'startup: coordinator init: bootstrap: generate secret cleanup beginning',
    r'startup: coordinator init: bootstrap: generate secret cleanup complete in',
    r'startup: coordinator init: bootstrap: concurrently update builtin tables and cleanup secrets beginning',
    r'startup: coordinator init: bootstrap: concurrently update builtin tables and cleanup secrets complete in',
    r'startup: coordinator init: bootstrap complete in',
    r'startup: coord serve: storage usage prune beginning',
    r'startup: coord serve: storage usage prune complete in',
    r'startup: coordinator init: coordinator thread start complete in',
    r'startup: coordinator init: complete in',
    r'startup: envd serve: coordinator init complete in',
    r'startup: envd serve: postamble beginning',
    r'startup: envd serve: postamble complete in',
    r'startup: envd serve: complete in',
    r'startup: envd init: serving complete in',
    r'startup: envd init: complete in',
]

def main(filename: str):
    """Sort the startup logs in `filename` in place."""
    logs = sort_logs(filename)
    with open(filename, "w") as file:
        file.write("\n".join(logs))

def sort_logs(filename: str) -> list[str]:
    # Keep only lines that begin with a timestamp and match one of the
    # known startup messages, then sort by (timestamp, message order).
    with open(filename, "r") as file:
        lines = [
            line.strip()
            for line in file
            if re.match(r"\d+-\d+-\d+ \d+:\d+:\d+\.\d+", line)
            and log_idx(line) is not None
        ]
    lines.sort(key=sort_log)
    return lines

def sort_log(line: str):
    # The timestamp and the message body are separated by `sep`; you may
    # have to adjust it for your log format.
    sep = "\t: "
    timestamp = line.split(sep, 1)[0]
    return (parse(timestamp), log_idx(line))

def log_idx(line: str) -> Optional[int]:
    # Index of the first startup message template that matches, else None.
    for idx, template in enumerate(order):
        if re.search(template, line):
            return idx
    return None

if __name__ == "__main__":
    main("/home/joe/Downloads/us-prod-startup-0896c.txt")
```
jkosh44 commented 1 month ago

The startup process is broken down into the following categories, each with its own timings.

Note: these exact sections were introduced in https://github.com/MaterializeInc/materialize/pull/28978

jkosh44 commented 4 weeks ago

Attached are the startup timings in prod for the customer with the largest catalog (old format).

There are two major differences from staging, due to 0dt being enabled in staging but not in prod:

prod_logs.txt

jkosh44 commented 3 weeks ago

`mz_storage_usage_by_shard` is currently a builtin table backed by the storage usage collection in the durable catalog. The storage usage collection takes up 99% of the space in the durable catalog. The `mz_storage_usage_by_shard` table has the following columns:

In order to have correct billing logic, we need the following guarantees:

We currently maintain these guarantees by first generating the collection_timestamp via the timestamp oracle, and then generating the logical write timestamp via the same oracle.
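The ordering dependency above can be illustrated with a toy monotonic oracle. This is a made-up stand-in (`ToyTimestampOracle` is not Materialize's actual oracle): the point is only that reading the collection timestamp before the write timestamp, from a strictly increasing source, guarantees `collection_timestamp <= write_timestamp`.

```python
import itertools
import threading

class ToyTimestampOracle:
    """Toy stand-in for a timestamp oracle: every read returns a
    strictly increasing timestamp."""

    def __init__(self):
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def read_ts(self) -> int:
        with self._lock:
            return next(self._counter)

oracle = ToyTimestampOracle()

# The collection timestamp is generated first...
collection_timestamp = oracle.read_ts()
# ...and the logical write timestamp afterwards, so monotonicity
# guarantees collection_timestamp <= write_timestamp.
write_timestamp = oracle.read_ts()
assert collection_timestamp <= write_timestamp
```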

In order to speed up startup times, we'd like to remove the storage usage collection from the durable catalog. Here are some proposals for that in the order of my preference for each proposal:

For all approaches we'll have a background task that occasionally removes entries older than 30 days.
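The background retention task described above could be sketched as follows. This is an illustration only; the entry shape and function name are assumptions, not Materialize's actual storage API.

```python
from datetime import datetime, timedelta, timezone

# 30-day retention window, per the proposal above.
RETENTION = timedelta(days=30)

def prune_storage_usage(entries, now=None):
    """Drop storage-usage entries whose collection timestamp falls
    outside the retention window; return the surviving entries."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    return [e for e in entries if e["collection_timestamp"] >= cutoff]
```

In the real system this would run periodically as a background task rather than being invoked by hand.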

benesch commented 3 weeks ago
> Keep `mz_storage_usage_by_shard` as a builtin table, but prevent anyone from ever migrating it and don't truncate it on startup. This seems like the simplest approach; however, we tried to do the same thing with the query activity log and ultimately decided to make it a builtin source.

This is my preference as well, especially since our forensic analysis has indicated that making the query history a builtin source was likely simply a result of not realizing that we could have builtin tables that didn't need to be truncated on startup.

aljoscha commented 3 weeks ago
> Keep `mz_storage_usage_by_shard` as a builtin table, but prevent anyone from ever migrating it and don't truncate it on startup. This seems like the simplest approach; however, we tried to do the same thing with the query activity log and ultimately decided to make it a builtin source.

This would also be my first preference, yes!

jkosh44 commented 1 week ago

I collected some parsing, planning, and optimization timings from my staging account. I created 1000 tables of the form `CREATE TABLE t{i} (a INT);`, 1000 views of the form `CREATE VIEW v{i} AS SELECT SUM(a) FROM t{i}`, and 1000 indexes of the form `CREATE DEFAULT INDEX ON v{i}`. The total startup time was 17 seconds. Of those 17 seconds, we spent the following amounts of time on the following activities:
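The benchmark schema described above can be generated with a short script. The CREATE statements are taken from the description; only the helper name is made up.

```python
def generate_test_ddl(n: int = 1000) -> list[str]:
    """Generate the benchmark DDL: n tables, n views summing each
    table, and n default indexes on those views."""
    stmts = []
    for i in range(n):
        stmts.append(f"CREATE TABLE t{i} (a INT);")
        stmts.append(f"CREATE VIEW v{i} AS SELECT SUM(a) FROM t{i};")
        stmts.append(f"CREATE DEFAULT INDEX ON v{i};")
    return stmts
```

Piping the resulting statements through `psql` (or any SQL client connected to the environment) reproduces the 1000-table catalog.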

def- commented 1 week ago

So what is most of the time spent on? Your activities only seem to sum up to ~5 s of the 17 s total.

jkosh44 commented 1 week ago

> So what is most of the time spent on? Your activities only seem to sum up to ~5 s of the 17 s total.

I'm having trouble with logs getting dropped in my staging account, so I may have missed a couple of events, but here are the other large activities: