tikv / raft-engine

A persistent storage engine for Multi-Raft log
Apache License 2.0
565 stars 88 forks source link

Unify File abstraction #165

Closed tabokie closed 2 years ago

tabokie commented 2 years ago

In order to support file system extension, e.g. encryption and rate limiting, we introduced the FileBuilder trait in #96. It is built on top of std::io::Read and Write, which doesn't provide any low-level functionality required for a high-performance log writer. So we kept the usage of old implementation LogFd along with the new abstraction.

Moving forward, it's better to unify the both into a new set of abstractions:

pub trait FileSystem: Send + Sync {
  type Handle: Clone + Send + Sync;
  type Reader: Seek + Read + Send;
  type Writer: Seek + Write + WriteExt + Send;

  fn create<P: AsRef<Path>>(&self, path: P) -> Handle;
  fn open<P: AsRef<Path>>(&self, path: P) -> Handle;
  fn new_reader(&self, handle: &Handle) -> Reader;
  fn new_writer(&self, handle: &Handle) -> Writer;
  fn file_size(&self, handle: &Handle) -> IoResult<usize>;
}

pub trait WriteExt {
  fn finish(&self) -> IoResult<()>;
}
ztelur commented 2 years ago

Do you mean unifying LogFd and FileBuilder together? It seems that LogFd has read and write functions too. Replace them with the function in Reader and Writer?

pub fn read(&self, mut offset: usize, buf: &mut [u8]) -> IoResult<usize>
pub fn write(&self, mut offset: usize, content: &[u8]) -> IoResult<usize> {
tabokie commented 2 years ago

@ztelur Yes, replace them with trait functions in Seek + Read and Seek + Write.

ztelur commented 2 years ago

Ok, let me try.

ztelur commented 2 years ago

It seems that a lot of changes are needed. Before started, I need to make sure that my ideas are right or same with yours .

the fully defined FileSystem trait is like below

pub trait FileSystem: Send + Sync {
    type Handle: Clone + Send + Sync;
    type Reader: Seek + Read + Send;
    type Writer: Seek + Write + WriteExt + Send;
    // low level, from  LogFd
    fn create<P: AsRef<Path>>(&self, path: P) -> Self::Handle;
    fn open<P: AsRef<Path>>(&self, path: P) -> Self::Handle;
    fn file_size(&self, handle: &Self::Handle) -> IoResult<usize>;
    fn close(&self, handle: &Self::Handle) -> IoResult<()>;
    fn sync(&self, handle: &Self::Handle) -> IoResult<()>;
    fn truncate(&self, handle: &Self::Handle) -> IoResult<usize>;
    fn allocate(&self, handle: &Self::Handle, offset: usize, size: usize) -> IoResult<()>;
    // high level, from FileBuilder 
    fn new_reader(&self, handle: &Self::Handle) -> Self::Reader;
    fn new_writer(&self, handle: &Self::Handle) -> Self::Writer;
}

And, LogFd should be an implement of Handle, meanwhile LogFile is an implement of Writer/Reader. And DefaultFileSystem will return them in factory function.

// log_file 
pub fn build_file_reader<B: FileBuilder>(
    builder: &B,
    path: &Path,
    fd: Arc<LogFd>,
) -> Result<LogFileReader<B>> {
    let raw_reader = LogFile::new(fd.clone());
    let reader = builder.build_reader(path, raw_reader)?;
    LogFileReader::open(fd, reader)
}

Would the code block above become like below after refactoring.

pub fn build_file_reader<B: FileSystem>(
    system: &B,
    path: &Path,
) -> Result<LogFileReader<B>> {
    let handle = system.open(path.clone());  // DefaultFileSystem returns a instance of `LogFd`
    let reader = builder.build_reader(path, handle)?; //  DefaultFileSystem return a instance of `LogFile`
    LogFileReader::open(handle, reader)
}
tabokie commented 2 years ago

@ztelur Thanks for checking with me up front. One idea I have is to hide more low-level details. In specific, we can remove truncate and allocate interface and move their logic inside write(). close(or finish in my version) is needed to do the "de-allocate" job, but it should be implemented for Writer (in WriteExt trait), because we need the allocated size and the actual written size to de-allocate unnecessary parts.

ztelur commented 2 years ago

Sorry for doing nothing about this work during this period. Is there any change? I think that many code in engine.rs,log_file.rs and file_pipe_log dir will change for replacing file_builder. @tabokie

tabokie commented 2 years ago

No changes. You can do a simple replacement first like in your comment (https://github.com/tikv/raft-engine/issues/165#issuecomment-992338105), and work on the interface details later.

ztelur commented 2 years ago

Ok, I will create a draft PR or send a link to my code branch, then we discuss further on the code directly

ztelur commented 2 years ago

Hi, the above commit is my design. Most changes are in two files: file_pipe_log/log_file_with_file_system.rs and file_system.rs. There are still some errors in it, but maybe we can discuss further about it. Personal opinions about file abstraction, welcome to discuss.

/// FileSystem
pub trait FileSystem: Send + Sync {
    type Handle: Clone + Send + Sync;
    type Reader: Seek + Read + Send + ReadExt;
    type Writer: Seek + Write + Send + WriteExt;

    fn create<P: AsRef<Path>>(&self, path: P) -> Result<Self::Handle>;
    fn open<P: AsRef<Path>>(&self, path: P) -> Result<Self::Handle>;
    fn new_reader(&self, handle: Arc<Self::Handle>) -> Result<Self::Reader>;
    fn new_writer(&self, handle: Arc<Self::Handle>) -> Result<Self::Writer>;
}

/// LowExt is common low level api
pub trait LowExt {
    fn file_size(&self) -> Result<usize>;
}

/// WriteExt is writer extension api
pub trait WriteExt: LowExt {
    fn finish(&self) -> Result<()>;
    fn truncate(&self, offset: usize) -> Result<()>;
    fn sync(&self) -> Result<()>;
    fn allocate(&self, offset: usize, size: usize) -> Result<()>;
}

/// ReadExt is reader extension api
pub trait ReadExt: LowExt {}
tabokie commented 2 years ago

Looks good generally, you can remove the LowExt though, file_size accessor can be put into FileSystem. I think you should go ahead and file a PR next, so that I can review a compile-able version of it.

tabokie commented 2 years ago

Ping @ztelur, are you still working on this?

ztelur commented 2 years ago

I'll submit a PR this weekends.

tabokie commented 2 years ago

@ztelur Do you have time to upgrade raft-engine in TiKV repo and adapt it for the new interfaces?

ztelur commented 2 years ago

Ok