srlabUsask / crhmcode


Errors in calculation of net solar radiation #20

Closed KevinShook closed 3 years ago

KevinShook commented 3 years ago

crhmcode is returning values of net (net solar radiation) that differ from the Borland version; the older and newer versions of crhmcode give the same results as each other. In this example, the crhmcode values are < the Borland values: all_net_rad_May_27_29_1995

A few days previously, the crhmcode values are > the Borland values: net_rad_May_21-27_1995

The values of net are determined in Classnetall. I can't find any difference in the code between the versions. Also, the forcings seem to be the same. The incoming radiation components are the same: Borland_crhmcode_interval_radiation

The albedo values are also the same: albedo_May_21_27_1995

as are the number of sun hours: sunmax_sunact_May_21_27_1995

I don't think that the issue is another if-statement threshold. Note that the if test is based on values of hru_SunAct and SunMax, which are much greater than zero.

 if(hru_SunAct[hh] > 0.0 && SunMax[hh] > 0.0)

Unfortunately we can't see the internal variables in the Borland code. I'll see if I can get Logan to debug the code to give us internal variables on some of the days in question.

jhs507 commented 3 years ago

Are you gathering data using the same project as for #10?

KevinShook commented 3 years ago

Pretty much, although I used a SWE threshold of 1 mm. This one calculated the radiation components: radiation_model.zip This one calculated the net radiation: Evap_model.zip

jhs507 commented 3 years ago

Which HRUs are you seeing this difference in?

KevinShook commented 3 years ago

I was looking at HRU 11

jhs507 commented 3 years ago

I am not seeing differences in net, netD, or RnD that are outside of our tolerance of 5% or 0.0001.

I checked HRUs 2, 3, and 11 for the first two years of the simulation.

What years are you examining?

KevinShook commented 3 years ago

Sorry, the examples I plotted above are from 1995. You can see that in many years, we get very good agreement between both models. However, in some years crhmcode is underestimating runoff, probably because the evaporation is too large: new_crhmcode_Borland_annual_discharge

jhs507 commented 3 years ago

And you believe that the evaporation is too large because our radiation values are too large as well?

KevinShook commented 3 years ago

Probably - haven't checked it yet, but it's the most likely scenario. I'll do a more complete water balance to be sure.

jhs507 commented 3 years ago

Alright, I will investigate the overflow, evap, and rad variables around 1960-1968 and then see if I can figure out what is causing it.

KevinShook commented 3 years ago

I plotted the daily evap for HRU 11 over the whole run period. Most of the values are very similar, but there are quite a few outliers: image

The crhmcode mean daily evap is slightly smaller than the Borland evap for HRU 11

      date               Borland          crhmcode     
 Min.   :1960-01-01   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:1971-08-06   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :1983-03-11   Median :0.7606   Median :0.7227  
 Mean   :1983-03-11   Mean   :2.1168   Mean   :2.1071  
 3rd Qu.:1994-10-14   3rd Qu.:4.4962   3rd Qu.:4.4766  
 Max.   :2006-05-20   Max.   :8.0753   Max.   :8.0753

jhs507 commented 3 years ago

I have found that the albedo is not matching for HRU 11. I assume that this will affect the evap?

KevinShook commented 3 years ago

Yes, it will - this is important.

KevinShook commented 3 years ago

I got Logan to output the values of some of the internal variables from the Borland netall for HRU 11 for May 21-28, 1995. netall_HRU11_May_21-28_1995_run2.txt

KevinShook commented 3 years ago

I output the daily evaporation, radiation and albedo for days where the Borland evaporation was non-zero and > 1.1 x the crhmcode evaporation, and vice versa. The biggest thing that stands out is the number of days where one of the models was setting the albedo to 0.11, i.e. to soil rather than snow. HRU11_daily_evap_netrad_albedo.zip

I wonder if this might be due to differing precision when evaluating

    if(SWE[hh] > 5.0) {
      Albedo[hh]    = Albedo_snow[hh];
      winter[hh]    = 1;
    }
    else{
      Albedo[hh]    = Albedo_bare[hh];
      winter[hh]    = 0;
    }

in classalbedo

jhs507 commented 3 years ago

That certainly is possible. Any time we compare floating-point values we are going to have issues, as floating-point comparisons are never exact.

Especially with moving from single to double precision.
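
As a standalone illustration (not code from either model, and with made-up values), the same SWE quantity carried in single vs. double precision can land on different sides of the 5.0 test:

    #include <cstdio>

    int main() {
        // Illustrative values only: an SWE just above 5.0 in double precision
        // rounds to exactly 5.0f when narrowed to single precision.
        double swe_d = 5.0000000001;
        float  swe_f = (float)swe_d;

        printf("double: %s\n", swe_d > 5.0 ? "takes the branch" : "skips the branch"); // takes the branch
        printf("float : %s\n", swe_f > 5.0 ? "takes the branch" : "skips the branch"); // skips the branch
        return 0;
    }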

jhs507 commented 3 years ago

I implemented the instrumentation that Logan did and analyzed the same time period. instrumented_results.zip

Results reveal that the instrumented variables are identical over that time period, apart from precision error, but the radiation values do not match.

jhs507 commented 3 years ago

I ran a test for the year 1995 to see if that SWE comparison was being evaluated differently. I used the winter variable as a proxy to see which branch was taken.

For that year I did not see any times where Borland and crhmcode differed. I will try a longer time period on Monday.

KevinShook commented 3 years ago

I think I might have it. I output meltflag for both models, and crhmcode is setting meltflag to 1 in the interval 1995-06-21 00:00 through 1995-06-26 23:00, while the Borland version doesn't. It's being set by this condition, which also sets the albedo:

   if(SWE[hh] <= 0.0) {
        Albedo[hh] = Albedo_bare[hh];
        winter[hh] = 0;
        meltflag[hh] = 0;
      }

Looking at the SWE values, the crhmcode values are, again, very slightly greater than zero, while the Borland versions are not.

We need to use a threshold for the SWE test. Why don't we try using the existing SWE threshold?
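
Something like this (SWE_threshold here is just a placeholder name for the shared parameter, e.g. 1 mm; the actual parameter name would come from the existing evaporation threshold):

    if(SWE[hh] <= SWE_threshold[hh]) { // SWE_threshold: placeholder for the shared SWE threshold parameter
        Albedo[hh]   = Albedo_bare[hh];
        winter[hh]   = 0;
        meltflag[hh] = 0;
    }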

KevinShook commented 3 years ago

I think we should also check out the ratio of rainfall to precipitation calculated in classobs:

  hru_rain[hh] = hru_p[hh]*ratio;
  hru_snow[hh] = hru_p[hh]*(1.0-ratio);

We are getting some very small values of SWE, which really should be taken care of: image

If we set the ratio to 1.0 for values >= 0.999 and to 0.0 for values <= 0.001 (or similar cutoffs), then we should avoid the roundoff-error SWE values.
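
Something along these lines in classobs (a sketch only; the 0.999 and 0.001 cutoffs are just the example values above):

    // sketch: clamp near-pure ratios before splitting precipitation
    if(ratio >= 0.999)
        ratio = 1.0;
    else if(ratio <= 0.001)
        ratio = 0.0;

    hru_rain[hh] = hru_p[hh]*ratio;
    hru_snow[hh] = hru_p[hh]*(1.0-ratio);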

jhs507 commented 3 years ago

We need to use a threshold for the SWE test. Why don't we try using the existing SWE threshold?

We can certainly do this. My only concern is that the parameter, as we specified it, was a threshold specifically for preventing evaporation. If you feel using the same number here makes sense in terms of the physical process, we can certainly apply it.

We will have to get Logan to change the newly added parameter to a shared parameter.

However, I think your second point is more important. If we can fix the root of the very-small-snow-amounts problem, that would probably be the best approach in general. I will see if I can come up with reasonable numbers that have the desired effect.

KevinShook commented 3 years ago

Those are very good points. I agree that it's best to fix the root of the problem and to keep the code as similar as possible between the two versions

jhs507 commented 3 years ago

I added some code so I could see what ratio was set to at various points and looked at the first two years of the model.

Perciptiation_Ratio.obs.txt

As you observed, there are many points where this leads to extremely small values for rain or snow. However, simply setting the ratio to 0 or 1 when it approaches those values will not solve our problem. As the data I posted show, there are points where the ratio value is reported as 1, but the nature of floating point and the fact that snow is calculated as ppt*(1.0 - ratio) mean that we still get extremely small amounts of snow.

I will see if I can come up with a more comprehensive fix.
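
As a made-up example of what I mean (the values are illustrative, not taken from the model output):

    #include <cstdio>

    int main() {
        double hru_p = 5.0;                 // mm of precipitation (made-up value)
        double ratio = 0.9999999999999999;  // prints as 1 at default precision, but is not exactly 1.0

        double hru_snow = hru_p * (1.0 - ratio);
        printf("ratio = %g, hru_snow = %g\n", ratio, hru_snow); // ratio = 1, hru_snow = 5.55112e-16
        return 0;
    }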

KevinShook commented 3 years ago

I was afraid of that. What would happen if we just set the snow amount directly to 0.0 (or to the full precipitation), instead of setting the ratio? Would the roundoff error still occur?

jhs507 commented 3 years ago

I think that is a more correct approach and what I was going to explore first.

      hru_snow[hh] = 0.0;
      hru_rain[hh] = 0.0;

      if(hru_p[hh] > 0.0) //rain or snow determined by ice bulb ratio
        hru_rain[hh] = hru_p[hh]*ratio;

      hru_snow[hh] = hru_p[hh]*(1.0-ratio);

On a related note, the code above is the calculation we are discussing, and I think it may have other unintended behavior, but I would like to run it by you.

As it is written now, when hru_p is less than zero we are guaranteed to have a value of 0 for rain; however, there can still be a non-zero value for snow. Is this intended?

KevinShook commented 3 years ago

No, neither rain nor snow should ever be less than zero. More bad coding (although IIRC the values are checked somewhere else).

jhs507 commented 3 years ago

But more specifically if hru_p is 0.0 then both hru_rain and hru_snow should be 0.0 as well?

Is it also safe to assume that hru_p = hru_rain + hru_snow?

KevinShook commented 3 years ago

Exactly. Conservation of mass requires that rain + snow = p. Neither can ever be negative.

jhs507 commented 3 years ago

Alright, if that is the desired outcome, that if statement should have both the rain and snow assignments within it. As it stands now, only rain is within the block of that if. I will fix that. It is unlikely that this causes issues, but you may want to have it corrected in the Borland code as well.
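
For reference, the fix I have in mind is along these lines (a sketch, not the final pushed code):

    hru_rain[hh] = 0.0;
    hru_snow[hh] = 0.0;

    if(hru_p[hh] > 0.0) { //rain or snow determined by ice bulb ratio
        hru_rain[hh] = hru_p[hh]*ratio;
        hru_snow[hh] = hru_p[hh]*(1.0-ratio);
    }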

KevinShook commented 3 years ago

Thanks. I'll see if we get different values between the versions and will let Logan know about the changes.

jhs507 commented 3 years ago

I added some logic to consider all precipitation as rain when the ratio is close to 1, and all of it as snow when the ratio is close to zero.

This did have the intended effect of zeroing out snow values that were previously very small, on the order of 10^-10 and so on.

Unfortunately, this did not bring evaporation or radiation into alignment. Albedo, meltflag, and winter are still not being set correctly on some steps. I don't know yet whether this is because our new calculation is not catching all near-zero snow cases, or because my new solution alters things too much from the Borland code. I am going to see if I can figure that out.

Here is the data if you want to look yourself.

evap_with_ratio_fix.zip

Here is the altered code for reference.

    double epsilon = 0.0001;

    hru_snow[hh] = 0.0;
    hru_rain[hh] = 0.0;

    if (hru_p[hh] > 0.0)  //rain or snow determined by ice bulb ratio
    {
        if (abs(1.0 - ratio) <= epsilon) {
            //ratio is close enough to 1 that it should be 100% rain
            hru_rain[hh] = hru_p[hh];
            hru_snow[hh] = 0.0;
        }
        else if (ratio <= epsilon) {
            //ratio is close enough to 0 that it should be 100% snow
            hru_rain[hh] = 0.0;
            hru_snow[hh] = hru_p[hh];
        }
        else {
            //calculate rain and snow normally
            hru_rain[hh] = hru_p[hh] * ratio;
            hru_snow[hh] = hru_p[hh] * (1.0 - ratio);
        }
    }

KevinShook commented 3 years ago

Thanks. I suspect that we'll still have to use a zero-SWE threshold as well. If you can push the code to your branch, I'll check it out.

jhs507 commented 3 years ago

I have pushed my code.

KevinShook commented 3 years ago

Thanks!

KevinShook commented 3 years ago

The albedo seems to be incorrect due to differences in the depth of SWE. This is occurring for depths >> 0. It could be an error in the albedo decay calculations, or it could be an error in the melt rate. It also seems that the program is not setting the Albedo to the soil value immediately when the snow depth is 0. I'll start taking a look at the albedo and melt calculations. OTOH, there has been some improvement in some of the previously incorrect values for net radiation, as you can see for May 1995: all_net_rad_May_27_29_1995_fixed_precip_ratio

When I looked at the annual discharges, some years are improved but, surprisingly, some are worse: new_crhmcode_Borland_annual_discharge_precip_ratio_fix

KevinShook commented 3 years ago

I just looked through Classalbedo. It's filled with floating-point tests. This is going to take a while.

jhs507 commented 3 years ago

Yes, usually in cases like this it is standard practice to implement some sort of tolerance when making floating-point comparisons.

We could do so relatively easily; however, the difference between single and double precision would still lead to slight differences in outcome even if we did backfill these comparisons into the Borland code, although I would guess they would be smaller.

However, the question becomes what the relevant tolerance is for each comparison. In some cases it might make sense to parameterize it, when the tolerance represents a physical process that could be varied, but if we are just trying to account for floating-point issues we can simply assume a small tolerance.
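
For the pure roundoff cases, the kind of helper I have in mind would look something like this (the name and default tolerance are placeholders, and the direction of the slack has to be chosen per comparison):

    // Placeholder sketch: a "greater than" that also passes values sitting
    // within tol below the threshold, so results that differ from the
    // threshold only by roundoff still take the same branch.
    inline bool gt_tol(double a, double b, double tol = 1.0e-6)
    {
        return a > b - tol;
    }

    // e.g. if(gt_tol(SWE[hh], 5.0)) { ... }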

jhs507 commented 3 years ago

I have created a separate branch to explore implementing tolerances for all of the floating-point comparisons. I will work on it there, as I am not sure it is the best way forward but would like to explore it.

KevinShook commented 3 years ago

Sounds like a good idea. Looking at the Classalbedo code, many of the comparisons are more empirically-based than physically-based.

jhs507 commented 3 years ago

I am noticing that this bit of code in Classalbedo.cpp line 117 is possibly causing problems:

    if(net_snowD[hh] > 0.25) { // SF = SWE*2 if density 0.005 (0.5cm)
        Albedo[hh] = Albedo[hh] + net_snowD[hh]*0.1*2.0; // daily value
        if(Albedo[hh] > Albedo_snow[hh])
            Albedo[hh] = Albedo_snow[hh];
        continue;
    }

There are many times where the value of net_snowD[hh] appears to be very close to 0.25, i.e. 0.2499999999999999. Could you explain a bit about what is being tested here?

KevinShook commented 3 years ago

This appears to be one of the empirical relationships/tests that I was referring to. As far as I can tell (the comment makes zero sense), it's checking the daily snowfall, and adding a very small value to the albedo.

jhs507 commented 3 years ago

In the case where the daily snow is 0.2499999999999999 would it make sense to enter this branch or skip it?

KevinShook commented 3 years ago

I think we should be entering it

jhs507 commented 3 years ago

Alright, that is currently what my "fix" code is doing. However, we will need to move this new style of comparison back to the Borland code if we want to continue to compare the two versions.
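
Roughly, the comparison on my branch behaves as if it were written like this (a sketch only; epsilon stands for whatever small tolerance we settle on):

    if(net_snowD[hh] > 0.25 - epsilon) { // SF = SWE*2 if density 0.005 (0.5cm)
        // ... existing albedo update as above ...
    }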

KevinShook commented 3 years ago

Agreed. However, we can test the results against the old version to see if it improves things.

jhs507 commented 3 years ago

I did do such a test. It seems that by adding a tolerance to all the comparisons in the albedo module we moved closer to the Borland results, but I have indications that there are steps where we are moving in the opposite direction. The net effect seems to be getting closer to the Borland results, but there are still some places where I am worried that we are "over-correcting".

albedo with comparison tolerance.zip

KevinShook commented 3 years ago

Oof. Ok, thanks. This module looks to be a real pain.

jhs507 commented 3 years ago

I don't think there is a way to get a perfect match to the Borland code without accounting for these floating-point differences in the Borland code as well. I am going to try a few different tolerances, though, to see if I can get closer without that.

KevinShook commented 3 years ago

Thanks. It looks like there are a lot of these tests in this module

KevinShook commented 3 years ago

Since you are creating a new branch, it would be great to use epsilon for all of the floating-point tests in Classalbedo. This code is critical because of the positive feedback it sets up. If the albedo is too low, then snowmelt is increased, which then further decreases the albedo. Could we try doing this? Thanks

jhs507 commented 3 years ago

That is what I did above; the data I posted show what happens when we place a tolerance on every floating-point comparison in the albedo module.