johannesvollmer / exrs

100% Safe Rust OpenEXR file library

Read the input file in parallel #189

Shnatsel opened this issue 1 year ago

Shnatsel commented 1 year ago

What can be improved or is missing?

The OpenEXR format allows a file to be read in parallel, which is considerably faster than a single-threaded read all by itself; see https://github.com/johannesvollmer/exrs/pull/183#issuecomment-1374250749

It will also let us initialize buffers in parallel (until #186 becomes feasible), and it allows performing pixel format conversions in worker threads more naturally (see #182). AFAIK, doing that with the current architecture would require an extra memcpy().

Implementation Approach

Memory-mapping is tempting, but it is going to result in lots of very spooky UB the moment someone modifies the file we're reading. So we'll probably have to open a file descriptor per thread and seek each one independently.

johannesvollmer commented 1 year ago

Would it be faster not to open multiple file handles, given that no memory mapping is used? What would that even look like? I thought a File could not be shared across threads.

johannesvollmer commented 1 year ago

Right now, memory mapping is supported by the library implicitly, by allowing the user to pass any Read implementation. Would memory mapping also benefit from parallel reading?

Shnatsel commented 1 year ago

> Would it be faster not to open multiple file handles?

I don't think so. If you have a single file descriptor for everything, you cannot issue parallel reads. If nothing else, there is only one offset value for a given file descriptor, so you cannot read from several places at once. Beyond that, a file descriptor is not safe to access in parallel anyway; you'd need a mutex, which defeats the point. See https://stackoverflow.com/a/823525

We'll just need to open several file descriptors (i.e. Files) pointing at the same underlying file in the filesystem, and then independently seek and read from each one. open() is cheap; it doesn't matter whether we do it once or 32 times.
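For illustration, a minimal sketch of that approach (not exrs API; the helper name and the chunk list are made up), using scoped threads so each worker opens and seeks its own File:

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};
use std::thread;

/// Read a list of (offset, length) chunks from one file in parallel,
/// giving each worker thread its own independent file descriptor.
fn read_chunks_in_parallel(
    path: &str,
    chunks: &[(u64, usize)],
) -> std::io::Result<Vec<Vec<u8>>> {
    thread::scope(|scope| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|&(offset, len)| {
                scope.spawn(move || -> std::io::Result<Vec<u8>> {
                    // one descriptor per worker: seek positions never interfere
                    let mut file = File::open(path)?;
                    file.seek(SeekFrom::Start(offset))?;
                    let mut buffer = vec![0u8; len];
                    file.read_exact(&mut buffer)?;
                    Ok(buffer)
                })
            })
            .collect();
        handles.into_iter().map(|handle| handle.join().unwrap()).collect()
    })
}
```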

johannesvollmer commented 1 year ago

I thought so too. It just sounded like there might be an alternative to opening multiple handles. Let's try that then :)

Shnatsel commented 1 year ago

> Right now, memory mapping is supported by the library implicitly, by allowing the user to pass any Read implementation. Would memory mapping also benefit from parallel reading?

Yes, but it would be horrifically unsafe. Memory-mapped data changes in memory whenever the underlying file is changed. That means that if you create a &str from a memory-mapped &[u8], the string is checked to be valid UTF-8 on creation, but the file may change later and become invalid UTF-8, violating the safety invariant. In fact, it's illegal to even have a &[u8] pointing to something that may change: the compiler assumes that a &[u8] is immutable and generates code based on that. If that's violated, the very fabric of the compiler's vision of reality breaks down and anything is possible.
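For concreteness, a sketch of the hazard using the third-party memmap2 crate (exrs itself doesn't do this; the file name is made up); its Mmap::map constructor is unsafe for exactly this reason:

```rust
use std::fs::File;
use memmap2::Mmap;

fn main() -> std::io::Result<()> {
    let file = File::open("image.exr")?;
    // `Mmap::map` is `unsafe` precisely because of the hazard above: the
    // caller must guarantee the file isn't modified while it's mapped.
    let map = unsafe { Mmap::map(&file)? };
    let bytes: &[u8] = &map; // the compiler assumes these bytes never change
    // If another process truncates or rewrites the file at this point,
    // touching `bytes` is undefined behavior.
    println!("magic number: {:02x?}", &bytes[..4]);
    Ok(())
}
```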

johannesvollmer commented 1 year ago

Okay, but if you use the mapped memory only through the interface of the Read trait, then this can't happen, right? Because in that case, data is always copied and never referenced? Or does it mean that the Rust compiler simply can't handle memory mapping?

Shnatsel commented 1 year ago

Rust just doesn't allow creating a &[u8] over a memory map, which is required for the Read trait.

You can take a raw pointer to the memory map and ptr::read_volatile it into a buffer. However, mmap is only useful when you want to avoid copying the data; in all other cases you can just read from a file descriptor into a buffer, and the performance will be exactly the same.

So to actually take advantage of mmap, you need to either adapt basically all your code to use read_volatile on raw pointers instead of creating slices, or find some platform-specific way to make sure nothing else modifies your file while it's memory-mapped.
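A hedged sketch of the first option, with a hypothetical helper that copies bytes out of a mapping without ever forming a slice over it:

```rust
use std::ptr;

/// Copy `buf.len()` bytes out of a memory-mapped region through volatile
/// reads, never creating a `&[u8]` over the mapping itself.
///
/// Safety: `src` must be valid for reads of at least `buf.len()` bytes.
unsafe fn copy_from_mapping(src: *const u8, buf: &mut [u8]) {
    for (i, slot) in buf.iter_mut().enumerate() {
        // `read_volatile` tells the compiler each byte may change between
        // reads, so it can't cache or reorder them as it would for a slice.
        *slot = ptr::read_volatile(src.add(i));
    }
}
```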

johannesvollmer commented 1 year ago

thanks for that explanation :) good to know

etemesi254 commented 1 year ago

I'm wondering how this would happen. From my perusing of the docs, it seems the crate wraps a BufReader from which it gets its bytes. BufReaders don't implement Clone, and the same one can't be sent to multiple threads; creating multiple BufReaders over the same underlying reader is explicitly mentioned in the docs as causing data loss.

Shnatsel commented 1 year ago

When reading from a file, you'll need to open multiple File handles (each backed by its own file descriptor) and wrap each one in its own BufReader.

For reading from an in-memory buffer, you can just send the &[u8] references to different threads - under the hood that's just pointers.

AFAIK there's no way to implement this for a generic buffered reader. There is, for example, no reasonable way to read a network stream in parallel, let alone seek it. So if that's a use case you'd like to support, reading needs to stay single-threaded there.
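For illustration, a minimal sketch of the in-memory case using the standard library's scoped threads, which let borrowed slices cross thread boundaries:

```rust
use std::thread;

/// Each worker gets a sub-slice of the same buffer; nothing is copied,
/// since the shared references are just pointers into one allocation.
fn sum_halves(data: &[u8]) -> (u64, u64) {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|scope| {
        let a = scope.spawn(|| left.iter().map(|&byte| byte as u64).sum::<u64>());
        let b = scope.spawn(|| right.iter().map(|&byte| byte as u64).sum::<u64>());
        (a.join().unwrap(), b.join().unwrap())
    })
}
```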

johannesvollmer commented 1 year ago

metadata should be read with buffered readers, since a lot of tiny read requests are made. The pixel data, which will be read in parallel, will not benefit from any buffering
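A small sketch of that split, with a hypothetical path parameter: a BufReader for the many tiny header reads, and a plain File for the large pixel-block reads:

```rust
use std::fs::File;
use std::io::BufReader;

fn open_readers(path: &str) -> std::io::Result<(BufReader<File>, File)> {
    // header/metadata: lots of tiny reads, so buffering pays off
    let meta = BufReader::new(File::open(path)?);
    // pixel blocks: each worker does one large read, buffering adds nothing
    let blocks = File::open(path)?;
    Ok((meta, blocks))
}
```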

johannesvollmer commented 1 year ago

the cloning of the byte source will have to be a custom abstraction. the File type has a try_clone function, but there is no trait with that function. We'll have to add our own simple trait for cloning a byte source, which we implement for Files and for other readers

or alternatively, we abstract over the ability to create a new reader from scratch, by remembering the path of the file and calling File::open repeatedly

those individual readers can then be buffered where it makes sense
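A minimal sketch of such an abstraction (the trait and type names are made up, not exrs API):

```rust
use std::fs::File;
use std::io::{self, Cursor, Read, Seek};
use std::path::PathBuf;

/// A byte source that can hand out independent readers,
/// each with its own seek position.
trait CloneableSource {
    type Reader: Read + Seek + Send;
    fn new_reader(&self) -> io::Result<Self::Reader>;
}

/// For files: `File::try_clone` shares the cursor position between
/// handles, so we reopen the path to get an independent one.
struct FileSource {
    path: PathBuf,
}

impl CloneableSource for FileSource {
    type Reader = File;
    fn new_reader(&self) -> io::Result<File> {
        File::open(&self.path)
    }
}

/// For in-memory data: a fresh `Cursor` over the same bytes is enough,
/// since shared slices are just pointers.
struct SliceSource<'a> {
    bytes: &'a [u8],
}

impl<'a> CloneableSource for SliceSource<'a> {
    type Reader = Cursor<&'a [u8]>;
    fn new_reader(&self) -> io::Result<Cursor<&'a [u8]>> {
        Ok(Cursor::new(self.bytes))
    }
}
```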

etemesi254 commented 1 year ago

And what about data passed as a slice? Won't that remove parallelism from it?

Ideally what you want is a trait, I'll call it ParallelReader, implemented for File, BufReader<File>, and slices (or anything that derefs to a slice), with a method that allows cloning the whole source.

For a file, create a new file handle (we can't use clone, because a seek on one handle would affect all the others); for data behind a cursor, create a new cursor over the same referenced data and return it.

Also, the trait should provide independent seeking, i.e. each independent clone should allow seeking without affecting the others.

I'm not sure if this is possible, I've never really done it; I'm just describing what I think may work.

johannesvollmer commented 1 year ago

yes! Note that File::try_clone won't give us that, though: per the std docs, both handles share the same cursor position. So manually creating new ones by reopening the file will be the way to go, and easier anyways I think
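As a closing illustration, a hedged sketch of how a worker could use the hypothetical CloneableSource trait from the sketch above, seeking its own reader without affecting the others:

```rust
use std::io::{Read, Seek, SeekFrom};

/// Read one block through a fresh, independently seekable reader.
/// (Reuses the made-up `CloneableSource` trait sketched earlier.)
fn read_block<S: CloneableSource>(
    source: &S,
    offset: u64,
    len: usize,
) -> std::io::Result<Vec<u8>> {
    let mut reader = source.new_reader()?;
    // seeking here has no effect on readers held by other threads
    reader.seek(SeekFrom::Start(offset))?;
    let mut buffer = vec![0u8; len];
    reader.read_exact(&mut buffer)?;
    Ok(buffer)
}
```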