ncss-tech / ISSR-800

Intermediate Scale Soil Raster (data) prepared on an 800m grid.
https://casoilresource.lawr.ucdavis.edu/soil-properties/
8 stars 0 forks source link

negative values in om_kg_sq_m: why? #11

Closed dylanbeaudette closed 5 years ago

dylanbeaudette commented 6 years ago

Some clues:

select mukey, cokey, om_kg_sq_m from ssurgo_component_data where om_kg_sq_m < 0;
  mukey  |  cokey   | om_kg_sq_m
---------+----------+------------
 466806  | 14523237 |   -0.06392
 465105  | 14572774 |   -0.26001
 2586536 | 15317866 |  -0.669425
 2375687 | 15426364 |  -0.669425
 2375696 | 15426390 |    -3.4001

Further investigation:

SELECT component.areasymbol, compname, chkey, chorizon.cokey, 
-- summed values: convert OM and CaCO3 to fractions
hzdept_r AS top, hzdepb_r AS bottom, (hzdepb_r - hzdept_r) as thick, 
COALESCE(soil_fraction, 1) as soil_fraction,
om_r/100.0 as om,
-- weighted variables
dbovendry_r as db, CASE WHEN dbovendry_r is NULL THEN NULL ELSE (hzdepb_r - hzdept_r) END as db_wt
FROM ssurgo.chorizon 
JOIN ssurgo.component USING (cokey)
LEFT JOIN
    (
    -- total rock fragment percent
    SELECT chkey, (100.0 - sum(COALESCE(fragvol_r, 0))) / 100.0 as soil_fraction
    FROM ssurgo.chfrags
    GROUP BY chkey
    ) as frag_data
USING (chkey)
WHERE cokey IN (select cokey from ssurgo_component_data where om_kg_sq_m < 0)
ORDER BY cokey, top ASC;
 areasymbol |          compname          |  chkey   |  cokey   | top | bottom | thick |      soil_fraction      |   om   |  db  | db_wt
------------+----------------------------+----------+----------+-----+--------+-------+-------------------------+--------+------+-------
 ca699      | Pacific Mesa               | 41824400 | 14523237 |   0 |      8 |     8 | -0.10000000000000000000 | 0.0025 | 1.72 |     8
 ca699      | Pacific Mesa               | 41824401 | 14523237 |   8 |     38 |    30 | -0.06000000000000000000 |  0.001 | 1.64 |    30
 ca699      | Pacific Mesa               | 41824402 | 14523237 |  38 |     63 |    25 |                       1 |        |      |
 ca750      | Dystric Lithic Xerochrepts | 41989006 | 14572774 |   0 |     18 |    18 | -0.18000000000000000000 |  0.015 | 1.51 |    18
 ca750      | Dystric Lithic Xerochrepts | 41989005 | 14572774 |  18 |     28 |    10 |  0.39000000000000000000 | 0.0075 | 1.62 |    10
 ca750      | Dystric Lithic Xerochrepts | 41989004 | 14572774 |  28 |     53 |    25 |                       1 |        |      |
 sd607      | Udarents                   | 44336718 | 15317866 |   0 |      5 |     5 | -0.05000000000000000000 |  0.005 | 1.81 |     5
 sd607      | Udarents                   | 44336717 | 15317866 |   5 |    152 |   147 | -0.05000000000000000000 |  0.005 | 1.76 |   147
 sd081      | Udarents                   | 44699350 | 15426364 |   0 |      5 |     5 | -0.05000000000000000000 |  0.005 | 1.81 |     5
 sd081      | Udarents                   | 44699349 | 15426364 |   5 |    152 |   147 | -0.05000000000000000000 |  0.005 | 1.76 |   147
 sd081      | Udarents                   | 44699473 | 15426390 |   0 |     10 |    10 |  0.59000000000000000000 |   0.01 | 1.65 |    10
 sd081      | Udarents                   | 44699474 | 15426390 |  10 |    152 |   142 | -0.35000000000000000000 |  0.005 | 1.76 |   142

It looks like the calculation of soil_fraction is the culprit. Likely fragment volume > 100%.

-- total rock fragment percent by horizon
SELECT chkey, (100.0 - sum(COALESCE(fragvol_r, 0))) / 100.0 as soil_fraction
FROM ssurgo.chfrags
WHERE chkey IN (SELECT chkey from ssurgo.chorizon WHERE cokey IN (select cokey from ssurgo_component_data where om_kg_sq_m < 0))
GROUP BY chkey;
  chkey   |      soil_fraction
----------+-------------------------
 44336717 | -0.05000000000000000000
 41989005 |  0.39000000000000000000
 44336718 | -0.05000000000000000000
 41824401 | -0.06000000000000000000
 44699349 | -0.05000000000000000000
 41989006 | -0.18000000000000000000
 44699350 | -0.05000000000000000000
 44699474 | -0.35000000000000000000
 44699473 |  0.59000000000000000000
 41824400 | -0.10000000000000000000

Affected SSA/components / horizons

 areasymbol |  cokey   |  chkey
------------+----------+----------
 ca699      | 14523237 | 41824400
 ca699      | 14523237 | 41824401
 ca699      | 14523237 | 41824402
 ca750      | 14572774 | 41989004
 ca750      | 14572774 | 41989005
 ca750      | 14572774 | 41989006
 sd607      | 15317866 | 44336717
 sd607      | 15317866 | 44336718
 sd081      | 15426364 | 44699349
 sd081      | 15426364 | 44699350
 sd081      | 15426390 | 44699473
 sd081      | 15426390 | 44699474
dylanbeaudette commented 6 years ago

The solution: truncate soil_fraction at 10% (90% fragment volume).

How many horizon records are above this threshold? 2192 / 3260926 (0.067%)

Comparing new approach vs. old:

SELECT component.areasymbol, compname, chkey, chorizon.cokey, 
-- summed values: convert OM and CaCO3 to fractions
hzdept_r AS top, hzdepb_r AS bottom,
CASE WHEN COALESCE(soil_fraction, 1) < 0.1 THEN 0.1 ELSE COALESCE(soil_fraction, 1) END as soil_fraction,
COALESCE(soil_fraction, 1) as soil_fraction_old
FROM ssurgo.chorizon 
JOIN ssurgo.component USING (cokey)
LEFT JOIN
    (
    -- total rock fragment percent
    -- note: data population errors could result in negative values
    SELECT chkey, (100.0 - sum(COALESCE(fragvol_r, 0))) / 100.0 as soil_fraction
    FROM ssurgo.chfrags
    GROUP BY chkey
    ) as frag_data
USING (chkey)
WHERE cokey IN (select cokey from ssurgo_component_data where om_kg_sq_m < 0)
ORDER BY cokey, top ASC;
 areasymbol |          compname          |  chkey   |  cokey   | top | bottom |     soil_fraction      |    soil_fraction_old
------------+----------------------------+----------+----------+-----+--------+------------------------+-------------------------
 ca699      | Pacific Mesa               | 41824400 | 14523237 |   0 |      8 |                    0.1 | -0.10000000000000000000
 ca699      | Pacific Mesa               | 41824401 | 14523237 |   8 |     38 |                    0.1 | -0.06000000000000000000
 ca699      | Pacific Mesa               | 41824402 | 14523237 |  38 |     63 |                      1 |                       1
 ca750      | Dystric Lithic Xerochrepts | 41989006 | 14572774 |   0 |     18 |                    0.1 | -0.18000000000000000000
 ca750      | Dystric Lithic Xerochrepts | 41989005 | 14572774 |  18 |     28 | 0.39000000000000000000 |  0.39000000000000000000
 ca750      | Dystric Lithic Xerochrepts | 41989004 | 14572774 |  28 |     53 |                      1 |                       1
 sd607      | Udarents                   | 44336718 | 15317866 |   0 |      5 |                    0.1 | -0.05000000000000000000
 sd607      | Udarents                   | 44336717 | 15317866 |   5 |    152 |                    0.1 | -0.05000000000000000000
 sd081      | Udarents                   | 44699350 | 15426364 |   0 |      5 |                    0.1 | -0.05000000000000000000
 sd081      | Udarents                   | 44699349 | 15426364 |   5 |    152 |                    0.1 | -0.05000000000000000000
 sd081      | Udarents                   | 44699473 | 15426390 |   0 |     10 | 0.59000000000000000000 |  0.59000000000000000000
 sd081      | Udarents                   | 44699474 | 15426390 |  10 |    152 |                    0.1 | -0.35000000000000000000
dylanbeaudette commented 6 years ago

I have edited the related code in my processing scripts and the changes will be available in the next cut of the raster data.

The problems fixed in this issue are related to #10 and all other cases where organic matter is much lower than expected.