pedrocr / rawloader

Rust library to extract the raw data and some metadata from digital camera images
GNU Lesser General Public License v2.1

Tile or Scanline loading support #35

Open ryanstout opened 3 years ago

ryanstout commented 3 years ago

Forgive a noob question; I'm fairly new to Rust. Looking at the source, I don't think it is, but figured I would ask: is it possible to access raw image data without fully loading the image into RAM (for example, loading the image in tiles or by scanline, at least for .nef, .cr2, and .arw)? I'm stacking raw files in different ways, and due to limited RAM I can only load part of them at once. Any help would be appreciated. If it's not possible, is it something that could be done with an easy modification? (I could probably fund some development if it's not a huge change.) Thanks!

pedrocr commented 3 years ago

Rawloader doesn't have any support for it, and it's not something that can be done generally. There are formats where the last pixel depends on the values of every other pixel in the image, so decoding always has to run through the image up to that point. Other formats are simpler, and it's possible to go directly to a given block, line, or even a specific group of 2 or 3 pixels. A simple way to spot this in the code is to check whether threaded decoding is done, and how. From your examples, ARW would be fully possible, CR2 hard, and NEF not possible at all.
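
To make the sequential-dependency point concrete, here is a toy sketch (mine, not rawloader code) of delta/predictive coding, which is roughly the kind of scheme that creates this constraint: each stored value is a difference from the previous pixel, so there is no way to seek to pixel k without decoding everything before it.

```rust
// Toy sketch of delta (predictive) coding; not rawloader code. Each stored
// value is the difference from the previous pixel, so decoding pixel k
// requires decoding pixels 0..k first; there is no way to seek.
fn decode_deltas(deltas: &[i32]) -> Vec<i32> {
    let mut pixels = Vec::with_capacity(deltas.len());
    let mut prev = 0;
    for &d in deltas {
        prev += d; // current pixel depends on the previous one
        pixels.push(prev);
    }
    pixels
}
```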

Depending on what you're doing, that may not be what you want though. If you need RGB values you need to demosaic first. At a guess it's probably a better idea to do a pre-processing step where you chunk the images into whatever blocks you want, without requiring special decoding, which will always be a corner case that's hard to support. Something like this (a sketch follows the list):

  1. Define N processing blocks, each a specific region of the image (line-based, perhaps)
  2. Decode every image in sequence, preprocess it freely with normal code, then split it and append its blocks to a series of N files
  3. Process the N files individually, stacking and doing whichever calculations you care about, producing an output file for each
  4. Merge the N output files into a single final output image
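
For illustration, a minimal sketch of steps 1 and 2 with rawloader's `decode_file`; the band count, the band file names, and the error handling are mine, and any preprocessing/demosaicing is elided.

```rust
// Sketch: decode one raw file, split it into N horizontal bands, and append
// each band to a per-band scratch file. N_BANDS and file names are
// illustrative; float raw data is not handled here.
use std::fs::OpenOptions;
use std::io::{BufWriter, Write};

const N_BANDS: usize = 8; // illustrative block count

fn append_bands(path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let image = rawloader::decode_file(path).map_err(|e| e.to_string())?;
    let data = match image.data {
        rawloader::RawImageData::Integer(d) => d,
        rawloader::RawImageData::Float(_) => {
            return Err("float raw data not handled in this sketch".into())
        }
    };
    let row = image.width * image.cpp; // values per scanline
    let band_rows = (image.height + N_BANDS - 1) / N_BANDS;
    for b in 0..N_BANDS {
        let start = (b * band_rows * row).min(data.len());
        let end = ((b + 1) * band_rows * row).min(data.len());
        let file = OpenOptions::new()
            .create(true)
            .append(true)
            .open(format!("band_{}.raw", b))?;
        let mut out = BufWriter::new(file);
        for &v in &data[start..end] {
            out.write_all(&v.to_le_bytes())?; // raw little-endian u16s
        }
    }
    Ok(())
}
```

Step 3 then reads each `band_{b}.raw` file on its own, so only one band per image is ever in memory.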

I'm curious about what kind of stacking you're doing. Astrophotography? imagepipe takes the output of rawloader and creates RGB images. That may be useful to you too:

https://github.com/pedrocr/imagepipe

Stop by #chimper on irc.freenode.org if you want to chat about this or other raw decoding/processing stuff.

ryanstout commented 3 years ago

Thanks so much for the info. Currently I load up each image, then write out tiles to disk, and then read those tiles back in during the stacking (similar to what you described). We're doing exposure stacking, focus stacking, and a few others. We don't need to load scanlines or tiles out of order; it's really just about not having the whole image in RAM at once, so we could, for example, take 3 images and process them top to bottom in sets of scanlines (at the DCT boundaries or something). Is that something that would be possible? I'm assuming you were referring to Huffman coding when you said "the last pixel depends on the values of every other pixel", or is there another limitation? Thanks!

pedrocr commented 3 years ago

It's not impossible, but it would be hacky, and for the formats you listed the size of the hack grows in the same order as before (ARW, then CR2, then NEF).

What you described that you're already doing seems like the best solution: it avoids having to hack the decoders and doesn't seem to have too big a downside, apart from some intermediate results saved to disk, which doesn't sound too bad.

ryanstout commented 3 years ago

@pedrocr Unfortunately, the disk read/write time adds up (since the tiles get paged out of RAM). I'm not sure if it's any easier, but what we really need is just something that yields scanlines or tiles; then we would basically open X photos at once and walk through them together in sets of scanlines or tiles. I'm guessing not, but let me know if you would be interested in some paid work to add support for that: ryan at witharsenal.com. Thanks!

ryanstout commented 3 years ago

@pedrocr Just wanted to follow up. Really we just need to be able to decode the image from start to finish in some sort of chunks, so we don't load the whole image into RAM at once. (Our code is doing a stacking operation across multiple photos, but only looking at the same small area of each photo at a time.) I would think the decoder could keep its state around and still do everything in a single pass; the actual tile shape doesn't matter for our use case, and just scanlines, for example, would work. Would it be possible to abstract things so that instead of writing directly to a location in memory once a photosite's value is decoded, the decoder writes to something that manages a tile buffer, passes the tile to our code once it's done, and then starts decoding into the top left of the next tile (for example)? (Sorry, I realize this might not be of use to anyone but me :-) Happy to pay for dev time at a good rate if it's something you would be interested in working on. Thanks!
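
For what it's worth, the shape of the API being asked for might look something like the sketch below; everything in it (the `StripDecode` trait, `next_strip`, the lockstep helper) is hypothetical and not part of rawloader's current API.

```rust
// Hypothetical strip-based decoding interface; none of these names exist in
// rawloader today. The decoder keeps its internal state (Huffman tables,
// predictors, ...) across calls and hands back one strip of scanlines at a
// time instead of the whole image.
trait StripDecode {
    /// Decode up to `rows` more scanlines; None once the image is exhausted.
    fn next_strip(&mut self, rows: usize) -> Option<Vec<u16>>;
}

/// Walk several open images in lockstep, so only one strip per image is
/// resident in memory at a time (images assumed to be the same size).
fn stack_in_lockstep(decoders: &mut [Box<dyn StripDecode>], rows: usize) {
    loop {
        let strips: Vec<Vec<u16>> = decoders
            .iter_mut()
            .filter_map(|d| d.next_strip(rows))
            .collect();
        if strips.is_empty() {
            break;
        }
        // ... combine `strips` here (exposure/focus stacking, etc.) ...
    }
}
```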

pedrocr commented 3 years ago

@ryanstout This isn't a very mainstream feature, but it's not unreasonable, and it can fit in the API without bothering any other users. It even seems easy to have an API with a generic implementation for every format that can be decoded threaded, which is actually most of them, so it doesn't seem like too much of a maintenance burden, although it requires some rework.

I'm not really set up to do consulting work, but the project is missing at least a CR3 decoder (see #23) as well as a decoder for the newest Fuji encodings. Assuming those are also important to you, having that kind of contribution back would be an incentive to work on things that are specific to your use case.