FirebirdSQL / firebird

Firebird server, client and tools
https://firebirdsql.org

[FR]: Prepare to remove MaxUnflushed* settings #8264

Open basid-irk opened 1 month ago

basid-irk commented 1 month ago

The MaxUnflushedWrites and MaxUnflushedWriteTime settings exist only as a workaround for a filesystem cache bug on Windows 95/98:

// common/config/config.cpp (Firebird 3.0)
// common/config/config.h   (Firebird 4.0+) 
#ifdef WIN_NT
    {TYPE_INTEGER,      "MaxUnflushedWrites",       (ConfigValue) 100},
    {TYPE_INTEGER,      "MaxUnflushedWriteTime",    (ConfigValue) 5},
#else
    {TYPE_INTEGER,      "MaxUnflushedWrites",       (ConfigValue) -1},
    {TYPE_INTEGER,      "MaxUnflushedWriteTime",    (ConfigValue) -1},
#endif

The #ifdef should be removed, leaving only:

    {TYPE_INTEGER,      "MaxUnflushedWrites",       (ConfigValue) -1},
    {TYPE_INTEGER,      "MaxUnflushedWriteTime",    (ConfigValue) -1},

The comments in firebird.conf should be changed too. For example:

 # ----------------------------
 #
 # How often the pages are flushed on disk
 # (for databases with ForcedWrites=Off only)
 #
 # Number of unflushed writes which will accumulate before they are
 # flushed, at the next transaction commit.
 #
 # Per-database configurable.
 #
 # Type: integer
 #
 #MaxUnflushedWrites = -1  # engine controlled

 #
 # Number of seconds during which unflushed writes will accumulate
 # before they are flushed, at the next transaction commit.
 #
 # Per-database configurable.
 #
 # Type: integer
 #
 #MaxUnflushedWriteTime = -1  # engine controlled
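
The rule described in these comments (accumulate writes, then flush at the next commit once a count or time limit is reached) can be sketched as follows. This is a minimal illustration, not Firebird's implementation: `FlushPolicy` and its methods are hypothetical names, and treating a negative value as "criterion disabled" is an assumption about what -1 means here.

```cpp
#include <chrono>

// Hypothetical illustration; none of these names come from the Firebird source.
class FlushPolicy {
public:
    using Clock = std::chrono::steady_clock;

    // maxWrites / maxSeconds play the role of MaxUnflushedWrites and
    // MaxUnflushedWriteTime. A negative value disables that criterion (assumption).
    FlushPolicy(int maxWrites, int maxSeconds, Clock::time_point now)
        : maxWrites_(maxWrites), maxSeconds_(maxSeconds), lastFlush_(now) {}

    // Record one page write that went to the OS cache without a forced write.
    void noteWrite() { ++unflushed_; }

    // Checked at transaction commit: is an explicit flush due?
    bool shouldFlushAtCommit(Clock::time_point now) const {
        if (unflushed_ == 0)
            return false;  // nothing accumulated since the last flush
        if (maxWrites_ >= 0 && unflushed_ >= maxWrites_)
            return true;   // write-count limit reached
        if (maxSeconds_ >= 0 && now - lastFlush_ >= std::chrono::seconds(maxSeconds_))
            return true;   // time limit reached
        return false;
    }

    // Reset the counters after the caller has flushed the file.
    void noteFlush(Clock::time_point now) {
        unflushed_ = 0;
        lastFlush_ = now;
    }

private:
    int maxWrites_;
    int maxSeconds_;
    int unflushed_ = 0;
    Clock::time_point lastFlush_;
};
```

With the current Windows defaults this would correspond to `FlushPolicy(100, 5, now)`, and with the POSIX defaults to `FlushPolicy(-1, -1, now)`, which never requests a flush in this sketch.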

mrotteveel commented 1 month ago

Is that truly only a Windows 95/98 era issue? I seem to recall that certain configurations on Windows NT and higher (e.g. Windows Server used as a domain controller), also had this issue.

hvlad commented 1 month ago

Could you explain exactly which "bug of filesystem cache on Windows 95/98" you are referring to? What exactly is wrong with the current setting values?

mrotteveel commented 1 month ago

@hvlad If I recall correctly, if you disable forced writes, on some Windows versions and/or specific Windows configurations, physical disk writes may be delayed indefinitely (e.g. until file close), and so crashes or power failures could lead to severe data loss.

basid-irk commented 1 month ago

To Mark: a Windows Server domain controller disables the write-back cache, which leads to slow performance. To Vlad: AFAIK, it is Windows 95/98 where "physical disk writes may be delayed indefinitely". Nowadays there is no sense in MaxUnflushed* settings that differ between Windows and POSIX.

hvlad commented 1 month ago

How is it known that this "bug" was fixed in later Windows versions? Is there any documented proof? Anything other than AFAIK and/or IIRC?

Before making such a risky and far from obvious change, we need really strong arguments, not guesses or someone's wishes.

Filesystem internals are very poorly documented and can change without any warning or hint. And they really have changed many times already. I see no real reason to change the default value of a setting that has more-or-less safe values with a more-or-less good compromise between safety and performance. It can be changed by any "knowing" DBA when necessary.

basid-irk commented 1 month ago

Certain older versions of Microsoft Windows may incorrectly report storage device synchronous write completion if the Windows default Write Cache Enabled setting is used. Synchronous Write Policy.

This problem may occur if the Windows 2000 hard disk driver does not correctly process write caching (KB281672).

support.microsoft.com/kb/332023 Note: currently this document only mentions Windows 2000 but it is also applicable to Windows 2003 as well. This change was implemented by Microsoft in an effort to protect users from data corruption. Before the change it was too easy to use the internal cache of a physical disk to increase performance. This exposed customers to potential data loss in the event of a power outage because that cache was not backed up by batteries. The fix implemented by Microsoft treated all disks (logical and physical) the same. The net result is that Windows based servers will attempt to disable Write Cache on any disk it sees including LUNs from an array with battery backup for its controller cache.

hvlad commented 1 month ago

I see nothing common between "physical disk writes may be delayed indefinitely (e.g. until file close)" and "Windows may incorrectly report storage device synchronous write completion".

The "bug" we are trying to avoid with the MaxUnflushedXXX settings is related to the behaviour of the "OS lazy cache writer". It is claimed that it might never write dirty pages until the file is closed, so we try to force such writes. There is no proof, disproof, or documented description of how and when the "OS lazy cache writer" decides to write dirty pages.
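
Forcing such writes amounts to an explicit flush request after ordinary cached writes. A minimal sketch, assuming the standard OS primitives (`FlushFileBuffers` on Windows, `fsync` on POSIX); `writeAndFlush` is a hypothetical helper for illustration and is not taken from the Firebird source:

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <unistd.h>
#endif

// Hypothetical helper: write data through the OS file cache, then ask the OS
// to push this file's cached data to the storage device before returning.
bool writeAndFlush(const char* path, const std::string& data) {
#ifdef _WIN32
    HANDLE h = CreateFileA(path, GENERIC_WRITE, 0, nullptr, CREATE_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE)
        return false;
    DWORD written = 0;
    bool ok = WriteFile(h, data.data(), static_cast<DWORD>(data.size()),
                        &written, nullptr)
              && written == data.size()
              && FlushFileBuffers(h);  // do not wait for the lazy writer
    CloseHandle(h);
    return ok;
#else
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return false;
    ssize_t n = write(fd, data.data(), data.size());
    // A short write is treated as failure to keep the sketch small.
    bool ok = n == static_cast<ssize_t>(data.size())
              && fsync(fd) == 0;       // do not wait for background writeback
    close(fd);
    return ok;
#endif
}
```

Without the flush call, the point at which the data reaches the disk is left entirely to the OS, which is the behaviour under dispute in this thread.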

And there are still no strong arguments (actually, no arguments AT ALL) for why we should change the defaults of the MaxUnflushedXXX settings or remove them.

aafemt commented 1 month ago

According to this question, Linux can experience serious performance issues if a lot of pages are unflushed. Perhaps you should do exactly the opposite of what is suggested: turn these settings ON by default on all platforms.

basid-irk commented 1 month ago

Asynchronous writes (ForcedWrites=Off) are a dangerous capability in themselves. But Firebird uses different default settings on different platforms. Does POSIX document the behavior of the OS lazy cache writer?

P.S.

Limit Linux background flush (dirty pages) Asked 14 years, 6 months ago Modified 5 years, 7 months ago

How is it related to the present day? Have the Linux kernel developers fully ignored this problem, or is it not a problem at all?

basid-irk commented 1 month ago

Performance Tuning Cache and Memory Manager

The policy of delaying the writing of the data to the file and holding it in the cache until the cache is flushed is called lazy writing, and it is triggered by the Cache Manager at a determinate time interval.

mrotteveel commented 1 month ago

I don't think it is that simple, as that same page says:

When the Cache Manager has determined that the data will no longer be needed for a certain amount of time, it writes the altered data back to the file on the disk, as shown by the dotted arrow between the system cache and the disk.

In other words, it sounds like a data page might not get written back to disk if it falls out of the Firebird page cache, but is still hot enough to be retrieved from the file system cache frequently enough.

basid-irk commented 1 month ago

If frequently modified data were not written, then (in database reality) we would get a huge number of dirty pages and the problem of "a very long time to flush modified pages on file close". That is the "we are too optimistic" problem. Yes, not every write of frequently modified data will be backed up on physical storage, but such writes will happen "often enough".

aafemt commented 1 month ago

Firebird has its own background cache writer using low I/O priority, so "huge amount of dirty pages" situation has low probability, no?

basid-irk commented 1 month ago

I (at least I) speak about data in (file)system cache.

hvlad commented 1 month ago

The purpose of Firebird's CacheWriter is completely different, and it doesn't flush the OS cache. It just writes dirty pages from the LRU tail to keep it clean. That could be changed, of course, but this is another topic.

aafemt commented 1 month ago

I (at least I) speak about data in (file)system cache.

But the settings in the title are not "about data in (file)system cache"; they affect only the Firebird page cache.