LinearTapeFileSystem / ltfs

Reference implementation of the LTFS format spec for standalone tape drives
BSD 3-Clause "New" or "Revised" License

io_uring support #468

Open wangdbang opened 3 months ago

wangdbang commented 3 months ago

Is it possible for LTFS to support io_uring? That way LTFS could use Linux native async I/O and should get a performance benefit.

piste-jp commented 3 months ago

Sorry, I cannot follow you. Can you describe your idea more specifically?

I'm not sure about the "async I/O" you are talking about... From your description, "Linux native async IO", I guess you mean the I/O between LTFS and the tape drive. Is that correct?

If so, I don't think it is reasonable for a tape device at all. In the tape drive world, the sequence of commands and data is the most important thing, because we cannot specify the address to write to in a SCSI WRITE command. If we used async I/O, LTFS might misdetect block numbers on tape.
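To illustrate the point, compare the CDB layouts below (taken from the SCSI specs, not from the LTFS source; illustration only). A tape WRITE carries no address, so block numbers are defined purely by command order:

```c
/* Illustration only: CDB layouts per the SCSI specs (SSC for tape,
 * SBC for disk), not structures from the LTFS source. */
#include <stdint.h>

/* Sequential-access (tape) WRITE(6), opcode 0x0A: there is NO block
 * address field. The drive writes at its current position, so the
 * order in which commands arrive defines the block numbers on tape. */
struct ssc_write6_cdb {
    uint8_t opcode;      /* 0x0A */
    uint8_t fixed;       /* bit 0: fixed-block mode */
    uint8_t length[3];   /* transfer length only */
    uint8_t control;
};

/* Block-device (disk) WRITE(10), opcode 0x2A: carries an explicit LBA,
 * so commands may complete in any order without changing the layout. */
struct sbc_write10_cdb {
    uint8_t opcode;      /* 0x2A */
    uint8_t flags;
    uint8_t lba[4];      /* logical block address */
    uint8_t group;
    uint8_t length[2];   /* transfer length */
    uint8_t control;
};
```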

If you are talking about the data transfer between an application and LTFS: LTFS already uses async I/O in the unified I/O scheduler. Data written by the application is just copied into an internal buffer, and multiple writes are gathered into a 512KB block (FUSE calls the write() callback with 32KB max even if you write 1MB of data...). Once a 512KB block is constructed, a write (to tape) request is pushed onto the internal I/O request queue and processed one by one.
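Conceptually, the write path of the scheduler is something like the sketch below (simplified; scheduler_write(), enqueue_write(), and the other names are made up for illustration, not the actual LTFS identifiers):

```c
/* Simplified sketch of the gather-and-queue behavior described above.
 * Identifiers are illustrative, not the actual LTFS ones. */
#include <stddef.h>
#include <string.h>

#define TAPE_BLOCK_SIZE (512 * 1024)

static char   block[TAPE_BLOCK_SIZE];  /* internal accumulation buffer */
static size_t used;                    /* bytes accumulated so far */

/* Hypothetical hook that pushes a full block onto the internal I/O
 * request queue, where a worker issues tape writes one by one. */
void enqueue_write(const char *buf, size_t len);

/* Called from the FUSE write() callback; FUSE hands over at most
 * 32KB per call even when the application writes 1MB at once. */
void scheduler_write(const char *data, size_t len)
{
    while (len > 0) {
        size_t n = TAPE_BLOCK_SIZE - used;
        if (n > len)
            n = len;
        memcpy(block + used, data, n);  /* the copy into LTFS's buffer */
        used += n;
        data += n;
        len  -= n;
        if (used == TAPE_BLOCK_SIZE) {  /* full 512KB block is ready */
            enqueue_write(block, used);
            used = 0;
        }
    }
}
```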

wangdbang commented 3 months ago

OK, another request then: FUSE now supports io_uring (https://lwn.net/Articles/932079/). Is it possible for LTFS to support it? It could avoid the memcpy between user space and kernel space.

piste-jp commented 3 months ago

I'm not sure I understand the description correctly, but it looks like the work described there is completed entirely within FUSE, and no change is needed in LTFS. I think the only thing we need to do is enable this feature in FUSE (by an option? I don't know how to do it).

But I don't know which part would be improved by this feature.

Do you have any idea?

wangdbang commented 3 months ago

I need more information; I will look at the details in the fuse project. Maybe we need to change the interfaces called from libfuse.

piste-jp commented 3 months ago

Oh, that reminds me: they might be the read_buf() and write_buf() callbacks, I guess. Are those what you are talking about?

Actually, I don't know whether they can be used in LTFS.

First of all, technically, it might be possible to use them on the write side. But as I mentioned before, we need to construct a 512KB block for a tape write, so a fully zero-copy path from the application to the sg device is impossible. The only thing we can do is remove one copy within FUSE. In other words, the call tree is (a rough write_buf() sketch follows the list):

  1. Application
  2. -> syscall
  3. -> FUSE (no buffer copy here)
  4. -> LTFS's FUSE callback (copies data from FUSE into the LTFS internal buffer)
  5. -> construct the SCSI request (no copy, historically)
  6. -> issue ioctl() against sg (the internal copy might be suppressed if direct I/O is enabled; see the Wiki)
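If someone tried it, the write side might look roughly like this. A sketch only, under my assumptions: grab_internal_buffer() is a placeholder, not a real LTFS function, and I haven't verified this against the FUSE version we build with:

```c
/* Sketch: a write_buf() callback that removes the copy inside FUSE
 * (step 3) but keeps the copy into the internal 512KB buffer (step 4).
 * grab_internal_buffer() is a placeholder, not a real LTFS function. */
#define FUSE_USE_VERSION 30
#include <fuse.h>
#include <sys/types.h>

/* Placeholder: returns the spot in the internal accumulation buffer
 * where this write's payload should land. */
void *grab_internal_buffer(const char *path, off_t off, size_t len);

static int ltfs_fuse_write_buf(const char *path, struct fuse_bufvec *src,
                               off_t off, struct fuse_file_info *fi)
{
    size_t len = fuse_buf_size(src);
    struct fuse_bufvec dst = FUSE_BUFVEC_INIT(len);

    (void)fi;
    dst.buf[0].mem = grab_internal_buffer(path, off, len);

    /* fuse_buf_copy() drains the kernel-supplied buffer (possibly
     * fd-backed, via splice) straight into our buffer; a negative
     * return is an error, otherwise it is the byte count. */
    return (int)fuse_buf_copy(&dst, src, FUSE_BUF_SPLICE_MOVE);
}
```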

On the read side, it is really difficult to use because we have only a one-block (512KB) cache. All the examples I have seen connect the buffer to a file on an underlying hosted filesystem; I don't know whether we can hand over an internal memory block without any care (I don't believe we can...). We need to study what happens in the FUSE layer on the read side.
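For example, the obvious read_buf() implementation would still have to copy, if I read the libfuse documentation right: the caller free()s the returned memory regions, so we cannot just point it into the 512KB cache. Sketch only; cache_lookup() is a placeholder:

```c
/* Sketch of the read side. Per the libfuse docs for read_buf(), the
 * bufvec and any memory regions it references must be malloc()ed
 * because the caller frees them, so handing out a pointer into the
 * internal 512KB cache would get it freed out from under us.
 * cache_lookup() is a placeholder, not a real LTFS function. */
#define FUSE_USE_VERSION 30
#include <fuse.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder: returns a pointer into the one-block (512KB) cache. */
const char *cache_lookup(const char *path, off_t off, size_t *avail);

static int ltfs_fuse_read_buf(const char *path, struct fuse_bufvec **bufp,
                              size_t size, off_t off,
                              struct fuse_file_info *fi)
{
    size_t avail;
    const char *src = cache_lookup(path, off, &avail);
    size_t n = avail < size ? avail : size;

    struct fuse_bufvec *buf = malloc(sizeof(*buf));
    if (buf == NULL)
        return -ENOMEM;
    *buf = FUSE_BUFVEC_INIT(n);

    /* The copy stays: pointing buf->buf[0].mem into the cache would
     * let the caller free() cache memory. */
    buf->buf[0].mem = malloc(n);
    if (buf->buf[0].mem == NULL) {
        free(buf);
        return -ENOMEM;
    }
    memcpy(buf->buf[0].mem, src, n);

    (void)fi;
    *bufp = buf;
    return 0;
}
```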

Finally, I don't understand the benefit of these calls. At this time we can already drive the tape drive at max speed, and multiple file accesses (concurrent file access) in LTFS always degrade performance. I think the only benefits are reduced bus bandwidth usage and a little less temporary memory. On the other hand, these calls open a chance of data corruption. So I think the balance between effort and reward doesn't match.

Of course, I can review any code if someone provides it with careful testing. But the priority is not high at this time in my mind.

wangdbang commented 3 months ago

OK, I agree with you.