westerndigitalcorporation / zenfs

ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.
GNU General Public License v2.0
238 stars 87 forks source link

fs: add a Path class and remove c++17 dependency #192

Closed royguo closed 2 years ago

royguo commented 2 years ago

Signed-off-by: Kuankuan Guo guokuankuan@bytedance.com

metaspace commented 2 years ago

I strongly object to replicating library code inside ZenFS. #193 is a better solution since it picks the functionality from an existing well reviewed library. If using boost is not acceptable for TerakDB, perhaps TerakDB could maintain the replicated Path class in their own tree and we could include that instead of boost if building for TerakDB?

royguo commented 2 years ago

@metaspace Understand, I already put this patch into TerarkDB's repo

MatiasBjorling commented 2 years ago

Quick question.

ZenFS is a common library for user-space applications, can it really require that applications uses c++17?

Could it make sense that it is only required in case that the app already makes use of c++17 functionality?

On Tue, 17 May 2022, 17:25 Roy Guo, @.***> wrote:

@metaspace https://github.com/metaspace Understand, I already put this patch into TerarkDB's repo

— Reply to this email directly, view it on GitHub https://github.com/westerndigitalcorporation/zenfs/pull/192#issuecomment-1129009209, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEYEYZRWPRHXYGVK2EDFULVKO26RANCNFSM5WB6WJMQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

royguo commented 2 years ago

@MatiasBjorling @metaspace We will upgrade to C++17 if it helps us get much better performance (e.g, >=20%), or resolves some existing bugs.....But for ZenFS, seems the only reason is to use std::filesystem::path (which I think is not worth upgrading to C++17)

I agree with @MatiasBjorling because there are a bunch of existing open-source projects that are still using old versions of RocksDB(e.g. Ceph), we cannot assume them having C++17 or boost library (unless this std::filesystem::path is not replaceable....)

metaspace commented 2 years ago

Quick question. ZenFS is a common library for user-space applications, can it really require that applications uses c++17? Could it make sense that it is only required in case that the app already makes use of c++17 functionality?

@MatiasBjorling As I stated elsewhere c++17 was stable in LLVM and GCC 4 years ago. It's not really bleeding edge stuff here.

We moved to c++17 because RocsDB requires c++17. There is no way to even add compiler flags when building in RocksDB that forbids c++17 features.

The reason this is causing problems for us is that ZenFS is not a proper library. It is a plugin distributed as a source that you compile together with your plugin host by adding the sources to your plugin host source tree. We are relying on whatever build system the plugin host provides, and we must be compiled with the same compiler as the plugin host (because c++ has no stable ABI). If we were to distribute ZenFS as a proper library with a C interface, none of this would be an issue.

If we were to keep ZenFS at c++11, we could add a CI step that builds ZenFS in TerakDB, to catch use of features >= c++17. I don't like that solution, because c++17 is a quality of life improvement over c++11. I think TerakDB should just update to c++17 or reimplement whatever c++17 or even boost library features that they need inside TerakDB code base.

If we decide to keep ZenFS at ancient c++11 standard, and if we can have no dependency on boost, we are going to end up reimplementing library code inside ZenFS. It is more code to review and maintain, more places for bugs. It is work that we do not actually have to deal with if we just go to c++17.

MatiasBjorling commented 2 years ago

Quick question. ZenFS is a common library for user-space applications, can it really require that applications uses c++17? Could it make sense that it is only required in case that the app already makes use of c++17 functionality?

@MatiasBjorling As I stated elsewhere c++17 was stable in LLVM and GCC 4 years ago. It's not really bleeding edge stuff here.

You're right it is not bleeding edge, I am not disputing that c++17 as a standard is fine and ready to be used for production.

We moved to c++17 because RocsDB requires c++17. There is no way to even add compiler flags when building in RocksDB that forbids c++17 features.

The code could distinguish using the __cplusplus constants.

The reason this is causing problems for us is that ZenFS is not a proper library. It is a plugin distributed as a source that you compile together with your plugin host by adding the sources to your plugin host source tree. We are relying on whatever build system the plugin host provides, and we must be compiled with the same compiler as the plugin host (because c++ has no stable ABI). If we were to distribute ZenFS as a proper library with a C interface, none of this would be an issue.

Ok, that might be something to work towards.

If we were to keep ZenFS at c++11, we could add a CI step that builds ZenFS in TerakDB, to catch use of features >= c++17. I don't like that solution, because c++17 is a quality of life improvement over c++11. I think TerakDB should just update to c++17 or reimplement whatever c++17 or even boost library features that they need inside TerakDB code base.

While I always strive to improve quality of life, we also have to consider that other types of quality of life can come from staying with existing frameworks and inherited stability.

For example, new releases of RocksDB 6.29 are contiguously made after the release of 7.0, therefore, should the CI therefore continue to test 6.29.x?

I am not proposing to avoid c++17 features, I propose to use it when compiled with projects where a specific standard is supported (newer standard). For example, if RocksDB 7.x compiled ZenFS, and it requires c++17 features at the API level, then zenfs code would need to have that ifdef'ed appropriately. For RocksDB 6.9.x and TerarkDB, c++17 features wouldn't be used, and therefore would work anyway.

If we decide to keep ZenFS at ancient c++11 standard, and if we can have no dependency on boost, we are going to end up reimplementing library code inside ZenFS. It is more code to review and maintain, more places for bugs. It is work that we do not actually have to deal with if we just go to c++17.

I believe that right now, zenfs code would be required to only use features available in the lowest common denominator c++ standard - i.e., TerarkDB and RocksDB 6.29 adhere to the language features defined up to c++11, whereas RocksDB 7+ can make use of c++17 specific features.

@royguo

To avoid having to open code file system access in zenfs, would boost:filesystem be an acceptable dependency to TerarkDB? or will that cause other problems?

MatiasBjorling commented 2 years ago

@royguo I see that TerarkDB already has a dependency on boost::filesystem - So that would be okay to use?

metaspace commented 2 years ago

For example, new releases of RocksDB 6.29 are contiguously made after the release of 7.0, therefore, should the CI therefore continue to test 6.29.x?

@MatiasBjorling ZenFS 2.x is not compatible with RocksDB 6.x because there were breaking API changes between RocksDB 6 and 7. ZenFS 1.x releases continue to support RocksDB 6 but is not compatible with RocksDB 7 because of those API changes. CI for 1.x releases build with RocksDB 6.x.

Backporting ZenFS 2 changes to ZenFS 1 makes no sense, since these were the changes required a major version bump. I guess we could continue to support RocksDB 6 in ZenFS 2, but that is not something we are currently discussing. Perhaps we should?

metaspace commented 2 years ago

We moved to c++17 because RocsDB requires c++17. There is no way to even add compiler flags when building in RocksDB that forbids c++17 features.

The code could distinguish using the __cplusplus constants.

We would need a way to build in CI with both c++11 and c++17. We do not currently have a way to do that. But we could do a build of ZenFS in TerarkDB as long as TerarkDB is on c++11. We cannot just use older RocksDB, because APIs have changed in ways that we would just get build errors if we tried.

metaspace commented 2 years ago

Ok, that might be something to work towards.

This is the way. But it is a lot of work.

yhr commented 2 years ago

@royguo I see that TerarkDB already has a dependency on boost::filesystem - So that would be okay to use?

@royguo told me that they have dropped the boost dependency.

@metaspace As we ifdef the implementaion of the breaking API change for plugin registration, we are still compatible with rocksdb 6, no?

metaspace commented 2 years ago

I guess we could continue to support RocksDB 6 in ZenFS 2, but that is not something we are currently discussing. Perhaps we should?

It seems I am wrong. We actually do support both RocksDB 6 and 7 APIs in ZenFS 2. The only thing that prevents us from building in vanilla RocksDB 6 is the C++17 features we use.

yhr commented 2 years ago

Fixed in #193