Question about misc/spectral/transforms.py

valpasq commented 9 years ago

So I am trying to generate a set of index images that I can use in QGIS, one BGW image and one with all other indices included in transforms.py.

I am using the following syntax:

l_file=`find . -name 'L*stack' -type f`
for l in $l_file; do
    echo "Running on file:"
    echo $l
    extract=${l:0:45}
    name=${extract}_BGW.tif
    echo "Name of BGW stack:"
    echo $name
    echo "Executing code..."
    ~/Documents/misc/spectral/transforms.py \
    -v $l $name brightness greenness wetness
done

l_file=`find . -name 'L*stack' -type f`
for l in $l_file; do
    echo "Running on file:"
    echo $l
      extract=${l:0:45}
    name=${extract}_index.tif
    echo "Name of index stack:"
    echo $name
    echo "Executing code..."
    ~/Documents/misc/spectral/transforms.py \
    -v $l $name evi ndvi nbr ndmi   
done

The script runs without error, but when I open the resulting files, Brightness, Greenness and Wetness, as well as EVI all come out as values scaled to 10000, but NDVI, NDMI and NBR come as 0 or 1.

I am guessing this more than likely has to do with scaling, and I do see there are data types defined in the transforms.py code, _np_dtypes = ['uint8', 'uint16', 'int16', 'uint32', 'int32', 'float32', 'float64']

Do I need to set a different datatype? If so, could you give me an example of the syntax? Or is something else at play here? Ideally, I'd like to get an output image with all index values scaled in the same manner as the input data (x 10000), just like I get for BGW/EVI, might be nice if that was an option if not the default.

ceholden commented 9 years ago

Sorry about that -- it looks like the scaling factor isn't applied to any of the normalized differenced equations. You're seeing it as 0 or 1 because it's coerced to an integer datatype. Meanwhile,

The tasseled cap indices are scaled by 10,000 because the input data are and we're just doing addition.
The scaling for EVI alters the L parameter. It's taken as 1.0 when reflectance are [0, 1], but when the data are scaled it's equal to the scale factor (e.g., 10,000). Using an appropriate change to L results in EVI between 0-1, but then I re-apply the scale factor to get back to the input data scaling.

I guess technically there should be some input and output scaling options for 100% flexibility. Right now you couldn't, for instance, use reflectance x 10,000 and get out EVI between 0-1. Not sure why one would want that, but it's easy enough to accomplish.

Here's what I'm proposing -- can you let me know if it makes sense?:

Rename --scaling to --input_scaling
Add --output_scaling
For tasseled cap, multiply calculation by ratio of --output_scaling / --input_scaling (e.g., BGW * 1 / 10,000 if you wanted to remove the scaling)
For EVI, I'll apply the --input_scaling as L and then multiply by --output_scaling

valpasq commented 9 years ago

I think I'm following.... In your example,--input_scaling=10000 and output_scaling=1 gives you BWG on a scale of 0-1, calculated from reflectance x 10000.

That makes sense.

So, if you wanted to keep everything scaled by 10000 so you don't have to deal with floating point, would you just set --input_scaling=1 and --output_scaling=1?

I get the flexibility.

What would you recommend as a default? I tend to like the x 10000 data values for BGW, would be nice to have the index values on a comparable scale, though I know most of the time index values are scaled from 0 to 1.

ceholden commented 9 years ago

For input scaling I mean, "what are you reflectance data in the input image scaled by?" So to get indices that are scaled by 10,000 from input reflectance that are also scaled by 10,000, both scaling parameters would be 10,000. I'd aim for 10,000 to be the default for both since it's so common and reduces data sizes through using int16 data types.

The input scaling number is data dependent while the output scaling decision is up to the user.

Does that make more sense? I'd love some suggestions for the option argument names and for the descriptions as well. I can add some prose like above to the CLI help message as well.

valpasq commented 9 years ago

Yes, that makes perfect sense now. Thanks.

I think your suggested argument names, --input_scaling and --output_scaling, work well. I also agree that 10000 is a good default--that's what I was expecting to see.

I definitely think you should add something like the description you gave me above. Here's a crack at how I would write it:

--input_scaling Scaling of input dataset (dataset-determined). Note: Landsat data downloaded from ESPA typically scaled by 10000.

--output_scaling Desired scaling for output dataset (user-determined), e.g. 10000 for values comparable to ESPA inputs; 1 for floating point index values scaled from 0 to 1.

Something like that?

ceholden commented 9 years ago

Hi @valpasq -- would you mind giving the latest from master a pull and a try? As far as I can tell I've implemented what we talked about and it should be working. The VIs and tasseled cap I calculated seem to be within the right range of data given the input/output scaling factors.

Two other changes have been added:

transforms.py --changelog will show you a history of what's changed in each version of this script (385a31068eb545a410cc81f0b58df016738b1fa7)
Previously the script would output rasters with the transforms / spectral indices in an order not consistent with the <transforms>... input argument order. I've corrected this (2b67bab5e3b10de3e5bb25d7f93a35f14edbbfd9)

valpasq commented 9 years ago

Pulled the latest and greatest, and it seems like everything is working properly...

Currently running the following to generate index stacks of EVI, NDVI, NBR, and NDMI for /projects/LCMAP/p035r032/images/:

l_file=`find . -name 'L*stack' -type f`
for l in $l_file; do
    echo "Running on file:"
    echo $l
      extract=${l:0:45}
    name=${extract}_index.tif
    echo "Name of index stack:"
    echo $name
    echo "Executing code..."
    ~/Documents/misc/spectral/transforms.py \
    -v $l $name evi ndvi nbr ndmi 

done

I didn't specify the scaling inputs since I wanted to use the defaults, --input_scaling=10000, ---output_scaling=10000, and the handful of output files I checked so far look to be scaled correctly.

Also, I had noticed the issue with the indices not being in the order specified. It actually took me a while to figure out what was what in the output because of that--ended up running each index as a separate file to troubleshoot. Sorry for not mentioning in my initial issue. Thanks for correcting! Seems like the indices are all in the specified order now.

Thanks for the fix!

ceholden / misc

Question about misc/spectral/transforms.py #2