Web browser is important but there is no course for how to build it.
Database is important but it's hard to work with it correctly.
02 - Advanced SQL
SQL-92 is the minimum that a DBMS has to support in order to claim they support SQL.
SQL is based on bags (duplicates) not sets (no duplicates).
03 - Database Storage I
Still, a Jean Dean list:
0.5 ns L1 Cache Ref
7 ns L2 Cache Ref
100 ns DRAM
150,000 ns SSD
10,000,000 ns HDD
~30,000,000 ns Network Storage
1,000,000,000 ns Tape Archives
Why not use mmap in DB?
OS know nothing about DB
You can give OS some hints by using following commands
madvise: Tell the OS how you expect to read certain pages
mlock: Tell the OS that memory ranges cannot be paged out.
msync: Tell the OS to flush memory ranges out to disk.
Problems:
What if we allow multiple threads to access the mmap files to hide page fault stalls
This works good enough for read-only access. It is complicated when there are multiple writers
Which DBMS use mmap?
ElasticSearch, ravendb, MonetDB, LevelDB (but no RocksDB)
Which DBMS partially use mmap?
MemSQL, InfluxDB, MongoDB, SQLite
DBMS is better than OS in:
Flushing dirty pages to disk in the correct order.
Specialized prefetching.
Buffer replacement policy.
Thread/process scheduling.
There are three different notions of "pages" in a DBMS:
Hardware Page (usually 4KB)
OS Page (usually 4KB)
Database Page (512B-16KB)
04 - Database Storage II
Blob
Some systems allow you to store a really large value in an external file. Treated as a BLOB type.
The DBMS cannot manipulate the contents of an external file.
No durability protections.
No transaction protections.
Someone who maintained SQLite said that many users store up to 1 MB blob to SQLite since it's faster. The database already has the file open, rather than fetch the file pointer to the file system and then fetch the blob.
Database storage is super expensive since to ensure the data is safe you may use high-end storage.
05 - Buffer Pools + Memory Management
Locks and Latches
Locks are held for transaction duration. Need to be able to rollback changes.
Latches protect the critical sections of the DBMS's internal data structure from other threads. Held for operation duration.
Optimization
Prefetch (based on what the query wants)
The DBMS can also prefetch pages based on a query plan
Sequential Scans (for example, we need page 1, page 2, page 5...)
https://www.youtube.com/watch?v=oeYBdghaIjc&list=PLSE8ODhjZXjbohkNBWQs_otTrBTrjyohi