meireles / spectrolab

Dealing with spectral data in R
16 stars 11 forks source link

Possible bug in match_sensors function #11

Closed serbinsh closed 4 years ago

serbinsh commented 4 years ago

I am attempting to replace my current SVC processing workflow to leverage this package to achieve all of the needed processing steps. Traditionally I would first do the overlap matching/correction in the SVC software first and export the *_moc.sig files. Then in R run the QA/QC and averaging steps based on specific criterion

However, I have been migrating more toward a full R-based workflow (preferred) and using this package to avoid writing my own matching code. However, I am running into an issue when attempting to correct the overlaps in the VNIR-SWIR regions. I can successfully correct the first splice (~986), though I noticed slightly different results usign 1 or 2 for "fixed_sensor" with the same splice wavelength and interpolation

However more troubling is the issue I am finding with the SWIR1/SWIR2 splice. If I only correct the first part of the spectrum and leave the second splice (~1910) alone I get a result that generally matches the SVC+R workflow

Averaged spectra from SVC+R workflow SVC_averaged_linear_interpolated_spectra.pdf

Full R based workflow just correcting first splice Overlap_removed_and_matched_SVC_spectra

** Note the magnitude of the SWIR 2 spectra.

Now if I run:

  fixed_spectra <- match_sensors(spec.files, splice_at = c(vis_splice_wavelength,
                                                           swir2_splice_wavelength), 
                                 fixed_sensor = 2,
                                 interpolate_wvl = 5)

where I correct both splices I get

Overlap_removed_and_matched_SVC_spectra

** Note the significant increase in reflectance values in SWIR2. That I think is erroneous.

Also if I just try to correct the first splice and the second splice separately, I get a very cryptic error when trying to correct SWIR1/SWIR2 transition.

Error in wavelengths(x)[idx_rm] <- bogus : invalid subscript type 'list'

Any ideas?

serbinsh commented 4 years ago

BTW

> c(vis_splice_wavelength,
+   swir2_splice_wavelength)
[1]  990 1906

Basically I am trying to sort out if I am incorrectly using the function or if I should tweak options. Or maybe there is a bug?

serbinsh commented 4 years ago

One other note,

  rm(subset_waves); subset_waves <- c(1100,1910)
  spec_sub_swir2 <- spec.files[ , wavelengths(spec.files, subset_waves[1], subset_waves[2]) ]
  wavelengths(spec_sub_swir2)
  plot(spec_sub_swir2)
  spec_sub_swir2_match <- match_sensors(spec_sub_swir2, splice_at = 1901, fixed_sensor = 2,
                                            interpolate_wvl = 5)

gives me

Error in wavelengths(x)[idx_rm] <- bogus : invalid subscript type 'list'

but if I do

  rm(subset_waves); subset_waves <- c(1100,1900)
  spec_sub_swir2 <- spec.files[ , wavelengths(spec.files, subset_waves[1], subset_waves[2]) ]
  wavelengths(spec_sub_swir2)
  plot(spec_sub_swir2)
  spec_sub_swir2_match <- match_sensors(spec_sub_swir2, splice_at = 1901, fixed_sensor = 2,
                                            interpolate_wvl = 5)
  plot(spec_sub_swir2_match)

  wavelengths(spec_sub_swir2_match)
  spec_sub_swir2_match_as_df <- as.data.frame(spec_sub_swir2_match, fix_names = "none", metadata = FALSE)
  temp <- as.data.frame(spec.files[ , wavelengths(spec.files,1912, 2512) ])
  combined_dataframe <- cbind(spec_sub_swir2_match_as_df,temp)

  plot(unlist(combined_dataframe[1,2:dim(combined_dataframe)[2]]), type="l")

I get this

Rplot

So basically the magnitude stays correct but I get an erroneous wavelength response at the transition?

serbinsh commented 4 years ago

OK looks like I can create a kludge to get the results to very closely match the overlap removal/splice correction completed in the SVC software to that using my script leveraging spectrolab.

Here is the SVC version SVC_averaged_linear_interpolated_spectra.pdf

Here is the Spectrolab version

SVC_overlap_corrected_linear_interpolated_averaged_spectra_summary.pdf

Here is a single spectra scatter plot

Screen Shot 2020-04-22 at 4 14 24 PM

here is the ugly code to achieve this:

  subset_waves <- c(350,1890)
  spec_sub_vnirswir1 <- spec.files[ , wavelengths(spec.files, subset_waves[1], subset_waves[2]) ]
  plot(spec_sub_vnirswir1)
  wavelengths(spec_sub_vnirswir1)
  spec_sub_vnirswir1_match <- match_sensors(spec_sub_vnirswir1, splice_at = c(vis_splice_wavelength), fixed_sensor = 2,
                                            interpolate_wvl = interpolate_wvl[1])
  plot(spec_sub_vnirswir1_match)
  #wavelengths(spec_sub_vnirswir1_match)
  spec_sub_vnirswir1_match_as_df <- as.data.frame(spec_sub_vnirswir1_match, fix_names = "none", metadata = FALSE)

  #### correct SWIR1/SWIR2 transition
  rm(subset_waves); subset_waves <- c(1891,2512)
  spec_sub_swir2 <- spec.files[ , wavelengths(spec.files, subset_waves[1], subset_waves[2]) ]
  plot(spec_sub_swir2)
  # remove overlapping bands in SWIR2
  swir1swir2_splice_wave <- wavelengths(spec_sub_swir2)[which(diff(wavelengths(spec_sub_swir2)) < 0)]
  swir1swir2_splice_index <- which(diff(wavelengths(spec_sub_swir2)) < 0)
  spec_sub_swir2_as_df <- as.data.frame(spec_sub_swir2, fix_names = "none", metadata = FALSE)
  dims <- dim(spec_sub_swir2_as_df)
  spec_sub_swir2_as_df <- spec_sub_swir2_as_df[,2:dims[2]]
  rm(dims); dims <- dim(spec_sub_swir2_as_df)
    spec_sub_swir2_as_df2 <- spec_sub_swir2_as_df[,c(1:(swir1swir2_splice_index-4),seq(swir1swir2_splice_index+1,dims[2]))]
  plot(as.numeric(names(spec_sub_swir2_as_df2)),spec_sub_swir2_as_df2[5,],type="l")
  waves <- as.numeric(names(spec_sub_swir2_as_df2))
  spec_sub_swir2_2 <- spectra(reflectance = spec_sub_swir2_as_df2, 
                              wavelengths = waves, names = names(spec.files))
  temp_swir_correction <- match_sensors(spec_sub_swir2_2, splice_at = c(1898), fixed_sensor = 2,
                                        interpolate_wvl = 1)
  plot(temp_swir_correction)
  spec_sub_swir2_2_df <- as.data.frame(temp_swir_correction, fix_names = "none", metadata = FALSE)

  combined_dataframe <- cbind(spec_sub_vnirswir1_match_as_df[,2:dim(spec_sub_vnirswir1_match_as_df)[2]],
                              spec_sub_swir2_2_df[,2:dim(spec_sub_swir2_2_df)[2]])
  dim(combined_dataframe)

  ## put back into a spectra object
  waves <- as.numeric(names(combined_dataframe))
  fixed_spectra <- spectra(reflectance = combined_dataframe, 
                           wavelengths = waves, names = names(spec.files))
  plot(fixed_spectra)

Basically correct the first splice then remove the overlapping bands in teh SWIR1-SWIR2 transition by hand then run the match_sensors code.

The remaining differences come down to the # interpolation wavelengths for the first splice where it requires more to match SVC exactly.

So narrowing down things a bit but I havent been able to figure out what my be driving the issues I am having when i look over the match_sensor function code

meireles commented 4 years ago

@serbinsh that doesn't look good. I'll investigate this further today. I'll try to reproduce the behavior with my own SVC data but it would be great f you could sent me your .sig file.

serbinsh commented 4 years ago

@meireles thanks! Sure, here is a link to some data:

https://www.dropbox.com/sh/e1o92i98932xb0v/AACerQQ4mvbeEba8hXDSjBcPa?dl=0

meireles commented 4 years ago

@serbinsh Thanks!

I see quite a few issues in the code (and a bunch of warnings to myself stating that the function wasn't production-ready!).

The function is shifting one sensor's reflectance to match the other sensor's refl value by multiplying it by a scalar. Since the reflectance values between the matching regions of the SWIR1 & 2 are small, we can end up with scalars that affect large reflectance values disproportionally (say around 2200nm).

Here is the question. Should the splicing function be additive instead of multiplicative? Here's an example.

Rplot01

serbinsh commented 4 years ago

@meireles Not sure its much help because the code is old and I have been meaning to spend time updating and improving for many years, however here is the splice correction I put together for ASD files many years ago: https://github.com/serbinsh/R-FieldSpectra/blob/8c06bd8b1bae0fb6a08a834a232c34fe5f88f558/R/overlap.corrections.R#L277

slightly easier because there werent overlapping bands, just a splice correction. In that code you will see the general idea was to select a couple of wavelengths around the splice, find the multiplier (c1), and then multiply the spectra by that multiplier. However looking at your examples above I would want some middle ground between multiplicative and additive. I think the multiplicative looks most correct except the SWIR 2 "hump" gets amplified. However additive is wrong because 1) vis <0 and 2) SWIR2 gets an average albedo shift up? So the values scale together but incorrectly.

I will try to experiment a little too and see if I can better understand why correction is causing these responses. Clearly the correction works well for the first splice in the NIR. Only thing I was unsure about there was how you selected which wavelengths to keep? I think Tom always told me to prefer the first wavelengths of the second detector over the last wavelengths of the first detector which should prefer better SNR.

serbinsh commented 4 years ago

Another reason I am skeptical of the full additive approach is that it does not match what I get using the SVC software and doesn't "look" correct to me based on looking at a lot of spectra in the past.

Again here is the result using the SVC software for overlap removal and splice correction, but then R to run the averaging

[Uploading SVC_averaged_linear_interpolated_spectra.pdf…]()

Would it be helpful to provide the SVC moc files for the same spectra files I shared with you? Could use those as a comparison. I will put then in the dropbox

I also placed the meta-data I extracted from the SVC files. The "Factor" column shows you what that software determined to be the factors to multiply, I think, the VIS, SWIR1, and SWIR2 detectors by to align them

meireles commented 4 years ago

@serbinsh Thanks for that feedback and for those extra files. There are a lot of memories coming back now about this issue...

I still have a hard time believing that the same scaling factor is (or should) be applied across all the bands in a sensor when scaling it. I think that, regardless of which sensors we splice first or whether the reference is the start of the 2nd and 3rd sensor instead of the tail of the 1st and 2nd, we will end up with wonky reflectance values.

I'm probably missing something but look at the mock example below:

########################################
# mock spectra with 3 sensors
########################################

S = data.frame(n = rep(c("A", "B", "C"),  c(5, 10, 9)),
               x = c(1:5, 5:14, 14:22),
               y = c(0.5, 0.55, 0.60, 0.58, 0.60,
                     0.62, 0.54, 0.4, 0.3, 0.2, 0.15, 0.1, 0.07, 0.06, 0.05,
                     0.03, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3))

########################################
# Scale sensors A and B; hold 2 fixed
########################################

U = S
# reflectance of overlap regions
# the absolute difference is 0.02 in both cases
overlap_1 = U[U[ , "x"] == 5, "y"]
overlap_2 = U[U[ , "x"] == 14, "y"]

# Compute scalars
scalar_SA = overlap_1[2] / overlap_1[1]
scalar_SB = overlap_2[1] / overlap_2[2]

# Scale
U[U$n == "A", "y"] = U[U$n == "A", "y"] * scalar_SA
U[U$n == "C", "y"] = U[U$n == "C", "y"] * scalar_SB

########################################
# Plot
########################################

cols = rep(c("purple", "green", "orange"), table(S$n))
plot(S[ , "x"], S[ , "y"], pch = 21, bg = cols, ylim = c(0, 1))
lines(U[ , "x"], U[ , "y"], col = "black", lty = 2)

text(x = c(5, 14), y = c(max(overlap_1), min(overlap_2)), pos = c(3, 1),
     labels = paste("scale:", format(c(scalar_SA, scalar_SB), digits = 3)))

legend("bottomleft", fill = c("purple", "green", "orange"),
       legend = c("A", "B", "C"), bty = "n", title = "Data")

Rplot

I think that the spectral evolution folks told me at some point that their scale factors are downweighed as we move away from the splicing site. How exactly that happens is a mystery.

Any thoughts on this?

meireles commented 4 years ago

@serbinsh Can you try to re-install spectrolab from the splice_sensor_bug branch?

From what I've seen, the SVC software:

  1. Matches (and scales) radiance instead of relative reflectance
  2. Does not scale the spectra between the SWIR1 and SWIR2. It just seems to get rid of the overlap and call it good.
  3. Interpolates the scaling factor for the VIS such that the effect of the factor decreases towards the UV.

I've hacked those behaviors in a version of match_sensors in a separate branch. I can work on a permanent fix once we probe the function some more. I will also have to make sure that the function works for ASD and PSR data.

Here is a plot showing the unspliced spectra in black, the SVC corrected spectra in orange (moc file), and the spectrolab corrected spectrum in blue. Rplot.pdf

serbinsh commented 4 years ago

@meireles I got pulled away from this for a little bit but back to trying to help sort this out. I agree that it seems the SVC software does not seem to adjust the SWIR1/SWIR2 region, but I am not sure why. Because of this you can still get discontinuities between the detectors. Whats strange is that if you "remove" when collecting the spectra it seems to do a better job than post-processing. Strange.

I will note this new branch does work better for my SVC files in that I can correct the whole spec at once, where before I had to split between VNIR and SWIR to run match_sensors, otherwise I was getting strange error messages.

I am going to keep exploring to see if I can find anything else.

BTW what options/settings did you use for match_sensors? The example below is from data earlier this year in Panama and you can see I cant seem to clean up the transition?

Screen Shot 2020-05-18 at 9 18 22 AM

Screen Shot 2020-05-18 at 9 20 36 AM

Screen Shot 2020-05-18 at 9 18 07 AM

serbinsh commented 4 years ago

OK, switching to the data you have been looking at I am playing with some of the options. Below is an example for spec "BNL13002_002.sig". Red is spectrolab correction (match_sensors) with options

spec <- match_sensors(spec.files[5], splice_at = c(990, 1901), fixed_sensor = 1, interpolate_wvl = c(5, 1))

Blue is the SVC software "moc" file. SWIR1/SWIR2 overlaps exactly, but a slight difference in the VNIR transition, which is like an interpolation and splice wavelength difference.

So It seems we can basically match the SVC software, but I dont know how I feel about how it doesnt do any correction in the SWIR2 particularly above where you can see the transition is larger leaving an artifact.....hmmm

Screen Shot 2020-05-18 at 9 27 32 AM

Screen Shot 2020-05-18 at 9 34 14 AM

serbinsh commented 4 years ago

One more example from the 2017 Alaska data on the Dropbox. This has black original spectra, blue SVC MOC file, and red spectrolab corrected spectra just to provide a full example based on my testing of the new branch

Screen Shot 2020-05-18 at 10 12 21 AM

So with some tweaking of the splice wavelength and interpolation options we can basically get an exact match. However there is still that step-change in the SWIR1/SWIR2

meireles commented 4 years ago

Thanks for looking into this. I will refactor the match_sensors function a little bit. I look forward to hearing what Spectra Vista has to say about the swir1-swir2 transition.

serbinsh commented 4 years ago

@meireles Thanks. I will let you know the outcome when I am able to find a time to talk with Larry

serbinsh commented 4 years ago

@meireles I still never had a chance to talk with larry, did the "splice_sensor_bug" branch ever get merged? i dont see that in the commits list and was wondering if I should switch over to master yet? Or keep using this?

meireles commented 4 years ago

@serbinsh I've been working on cleaning up a ton of other things on master before merging the new match_sensor function. This should be done in the next week or so. FYI, many of those changes are not backward compatible. BTW, I had a few other questions for you but I'll resend them in an email.

serbinsh commented 4 years ago

@meireles OK so some feedback. I started to switch my code over from using spectrolab source from

devtools::install_github("meireles/spectrolab", ref = "splice_sensor_bug", force = TRUE, dependencies = T)

to

devtools::install_github("meireles/spectrolab")

But then realized changes in "spectra", how you now get the reflectance matrix (was spectrolab::reflectance, now its spectrolab::value) are breaking changes in my workflows. As such I haven't yet converted my code over. I will work on updating my scripts to reflect all of the breaking changes in the package and then let you know how it works.

serbinsh commented 4 years ago

@meireles by chance is there a CHANGELOG or summary of what functions have changed in master?

serbinsh commented 4 years ago

Sorted out the modifications needed to make the latest release to work for me. Thanks! All good now.