Update BIL read_bands to read from disk in consecutive chunks.

donm commented 10 years ago

Hi Thomas,

I noticed that the read_bands() loop for BIL files was looping over self.nrows one time for each band. This pull request has a version that loops over self.nrows just once, loading all bands per frame in one call to fromfile().

This obviously only matters if you bypass the memmap, which I'm doing by setting img._memmap = None.

The speed improvement for the file that I'm testing on is about proportional to the number of bands I'm loading.

tboggs commented 10 years ago

Hi Don,

Thanks for the patch. I may be misunderstanding the patch so please correct me if I'm wrong but it looks like the following line in your patch assumes that the band indices in the bands argument will always be monotonically increasing:

n_bands_to_read = bands[-1] - bands[0] + 1

That isn't always be the case though, since read_bands allows an arbitrary ordering of the band indices. For example, to display an RGB image of the sample data set, one can call

imshow(img.read_bands([30, 20, 10]))

I think that can be fixed in you patch by simply taking the min & max of the bands argument, then adjusting first_band_offset, n_bands_to_read, and shifted_band_indices accordingly.

Can you provide some more details on the performance test you ran (e.g. image dimensions, number of bands read, etc.). After reading your code, I'm not surprised that you are getting a significant performance improvement with your patch but I'm curious how the performance varies with the size of the image and number/indices of bands requested. As you've noticed, the default is to use the memmap interface, which is almost always faster. The code block in question is applicable when the memmap isn't used, which is typically either for a memory-limited system or a very large image.

If you haven't already, could you run a performance comparison for various numbers of bands on a very large image? A particularly useful case would be calling read_bands with a small number of bands but where both the first and last band are requested. If you like, I can give you a link to download the entire AVIRIS flight line that contains the SPy sample image chip. That flight line is 2678x614 pixels (~ 720 MB).

Regards, Thomas

donm commented 10 years ago

You're right, I wasn't thinking about bands in an arbitrary order. That made me realize that the original version probably also supports multiple instances of a band (read_bands([10,20,10])) but this patch won't. I'll fix those things and submit a new commit. (As a side note, maybe there should be a unit test for those cases.)

About running the comparison for a small number of bands but including the first and last bands--that's a good thought. I'll do that. Yes, please do send a link to the AVIRIS line so I can use the same data set as you for testing.

Are you getting good performance from the memmap interface for BIL and BIP files? For me memmap for BSQ was fast, but for BIL and BIP it was so slow as to be unusable, and that's why I was bypassing it and was looking into optimizing this routine. But if it's working well for you, maybe it's something with my numpy version or something.

tboggs commented 10 years ago

There is a unit test for the read_bands method (in spectral/tests/spyfile.py) but it doesn't currently check for non-contiguous, non-ascending, or duplicate bands so I will plan to add those.

I've emailed you a link to the flight line.

As for getting good performance with memmaps for BIL or BIP files....... It depends. If you're reading entire pixels at once (e.g., for classification), then memmaps for BIP files will usually work well but if you are reading entire bands from a BIP file, then not so good. If you need to access a lot of the data in an image file, then the best performance will usually be from loading the entire image into memory (if possible). Next will usually be using memmaps (if your usage matches well with the data file interleave). And lastly, explicitly reading the data from the image.

If your performance is really bad with memmaps, perhaps you are using up most of your RAM and hitting your swap disk, which would definitely hurt performance. If you are repeatedly using the same image file and call read_bands often, you might consider saving the image as BSQ and seeing if that doesn't give you better performance with the memmap interface.

donm commented 10 years ago

Thanks for the flight line. I ran some tests and the performance gain that I'm seeing is almost entirely due to iterating over nrows in the inner loop instead of len(bands). So the other change in this branch--reading from the lowest band to the highest band in one read operation--is not helpful, even when loading many consecutive bands. Plus, as you guessed, it's disadvantageous when loading only a few bands spread out over the range.

The improvement due to switching the order of the loops does seem to be significant, so I'll close this pull request and open a new one that just implements that change.

spectralpython / spectral

Update BIL read_bands to read from disk in consecutive chunks. #4