pmem / pmdk

Persistent Memory Development Kit
https://pmem.io

eatmydata — official no-flush setting #4218

Closed kilobyte closed 1 year ago

kilobyte commented 5 years ago

FEAT: official no-flush setting

Rationale

For the traditional filesystem API, there's a popular tool, eatmydata, which LD_PRELOADs a library that ignores calls to fsync, sync_file_range, msync and so on. On real disks, there are some non-contrived real-life loads for which this can improve performance by as much as a factor of 10-20×, obviously at the cost of losing crash consistency. But for some use cases crash consistency doesn't matter: interrupted tasks will need to be restarted from scratch anyway, etc. Shallow flushes are not nearly as expensive on pmem, but hey, a cheap speed-up!
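For readers unfamiliar with the trick, here is a minimal sketch of the LD_PRELOAD interposition technique. It is not eatmydata's actual source: the file and library names are hypothetical, and only fsync and msync are covered.

```c
/*
 * noflush.c (hypothetical file name): a minimal LD_PRELOAD shim.
 *
 * Build: cc -shared -fPIC -o libnoflush.so noflush.c
 * Use:   LD_PRELOAD=./libnoflush.so ./your_program
 *
 * The dynamic linker resolves fsync/msync to these definitions before
 * the libc ones, so the program's flush requests become no-ops.
 */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Report success without asking the kernel to write anything back. */
int
fsync(int fd)
{
	(void)fd;
	return 0;
}

/* Same idea for memory-mapped files. */
int
msync(void *addr, size_t length, int flags)
{
	(void)addr;
	(void)length;
	(void)flags;
	return 0;
}
```

A shim like this only intercepts library calls. Flushes that a program issues directly as CPU instructions from user space would not pass through it, which is presumably why a setting inside the library itself is of interest here.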

Description

Thus, let's declare an official method of disabling flushes. There's PMEM_NO_FLUSH, which is already almost good enough (it doesn't disable deep flushes yet), but that's just one of many testing env vars. Should we declare it an official external interface?

If so, what should be done about deep flushes? Options include PMEM_NO_FLUSH=2 to disable everything, or even a bit field: 1 to disable shallow flushes only, 2 to disable deep flushes only, 3 for 1+2 (both kinds).
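To make the bit-field option concrete, here is a rough sketch of how such a value could be parsed. This is an illustration of the proposal only, not existing PMDK code: the enum names and both functions are hypothetical, and only the variable name PMEM_NO_FLUSH comes from the discussion above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical flag values, following the bit-field option above. */
enum no_flush_flags {
	NO_FLUSH_NONE    = 0,
	NO_FLUSH_SHALLOW = 1 << 0, /* 1: skip regular (shallow) flushes */
	NO_FLUSH_DEEP    = 1 << 1  /* 2: skip deep flushes */
};

/*
 * Parse the text of a PMEM_NO_FLUSH-style value into flags.
 * Unset, empty, malformed or out-of-range input yields NO_FLUSH_NONE,
 * so a bad value falls back to the safe behaviour of flushing everything.
 */
unsigned
parse_no_flush(const char *value)
{
	if (value == NULL || *value == '\0')
		return NO_FLUSH_NONE;

	char *end;
	errno = 0;
	long v = strtol(value, &end, 10);
	if (errno != 0 || *end != '\0' || v < 0 || v > 3)
		return NO_FLUSH_NONE;

	return (unsigned)v;
}

/* Read the setting from the environment. */
unsigned
no_flush_from_env(void)
{
	return parse_no_flush(getenv("PMEM_NO_FLUSH"));
}
```

With this layout, a caller would test `flags & NO_FLUSH_SHALLOW` and `flags & NO_FLUSH_DEEP` independently, and the value 1 keeps the meaning it has today.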

API Changes

Either make PMEM_NO_FLUSH official, with support for deep flushes as above — or design something new.

Implementation details

On our side, the implementation of either variant is obvious. For wrapping it in a tool, though, I'd propose sending a patch to the original eatmydata, to avoid having separate tools for disks and pmem.

kilobyte commented 5 years ago

As discussed with @marcinslusarz: the current PMEM_NO_FLUSH is supposed to be a debug-only variable for now, so I'm not talking to eatmydata's maintainer yet. Let's first think about what a good public interface would be.

pbalcer commented 3 years ago

@kilobyte can you either close this or address the problem?

kilobyte commented 3 years ago

The implementation is trivial; it's a design issue.

Either we bless PMEM_NO_FLUSH as an official non-debug variable, working with both libpmem1 and libpmem2, or we add a new one for this purpose.

janekmi commented 1 year ago

If this question is still important to you, please reopen the issue and provide more context for your request so we can reassess its priority.

kilobyte commented 1 year ago

I wonder: with the development pace slowed down, env variables like this are also less likely to change their meaning. Thus, something that was meant to be only a chicken bit for debugging might as well be used semi-officially. In the worst case, it will just degrade to a no-op, resulting in full flushes being done, which is slower but has no other detrimental effects.

I.e., if you can promise that PMEM_NO_FLUSH will either keep working or do nothing, but will never cause the machine to go up in flames or trigger an alien invasion, that's good enough.

In other words: should I