CDAT / cdms

ESMF conservative blending when you have missing values #231

Open doutriaux1 opened 6 years ago

doutriaux1 commented 6 years ago

I'm using the following cdms2 build on Linux: cdms2 2.12.2018.02.22.17.40.g09431cb.npy1.13 py27_0 uvcdat/label/nightly

@gleckler1 @durack1 feel free to chime in

code:

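from __future__ import print_function
import cdms2, vcs, MV2

# Read the "ta" variable and close the file once the data is in memory
f = cdms2.open("data.nc")
data = f("ta")
f.close()
print(data.shape)

# Read the land area fraction ("sftlf")
tmp = cdms2.open("sft.nc")
sft = tmp("sftlf")
tmp.close()

# Mask every cell whose sftlf value is below 50
data2 = MV2.masked_where(MV2.less(sft, 50.), data)

# Target grid: 72 latitudes from -88.875 and 144 longitudes from 0, both in 2.5 degree steps
tGrid = cdms2.createUniformGrid(-88.875, 72, 2.5, 0, 144, 2.5)
x = vcs.init()
# Regrid with both ESMF methods and save one image per method for comparison
for mthd in ["conservative", "linear"]:
    print("USING REGRID METHOD:", mthd)
    data3 = data2.regrid(tGrid, regridTool="esmf", regridMethod=mthd)
    print("plotting")
    x.plot(data3)
    print("pnging")
    x.png("masked_{}".format(mthd))
    print("clearing")
    x.clear()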

This produces two images: masked_conservative and masked_linear.

Data files:

data.zip

durack1 commented 6 years ago

@doutriaux1 I have also seen the same thing, which was a concern to me - @dnadeau4 are there toggles/keywords that we can load up the ESMPy call with to control such behaviour?

dnadeau4 commented 6 years ago

@doutriaux1 is this for py3 only?

gleckler1 commented 6 years ago

@dnadeau4 @doutriaux1 @durack1 The maps above would explain the ESMF conservative results we've been getting with PMP which so far is only py2.7. It is great that we are getting to the bottom of this!

doutriaux1 commented 6 years ago

@dnadeau4 as @gleckler1 mentioned, this is on py2; probably/hopefully py3 as well.

doutriaux1 commented 6 years ago

@dnadeau4 from your comments at the meeting, it looks like you're regridding the mask. I think we need to err on the safe side and mask every cell where there is any fraction of mask, and possibly add a threshold keyword to let the user control how much of a cell may be masked before the whole cell is considered masked.
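A minimal sketch of the threshold idea described above, using plain NumPy rather than cdms2 or ESMF. The function name `coarsen_with_mask_threshold`, its `threshold` parameter, and the 1-D block averaging used as a stand-in for conservative regridding are all assumptions made for illustration; nothing in this thread shows such a keyword existing in `regrid`.

```python
import numpy as np

def coarsen_with_mask_threshold(values, mask, factor, threshold=0.0):
    """Block-average a 1-D array (a stand-in for conservative regridding).

    values    : 1-D float array on the source grid
    mask      : 1-D bool array, True where the source cell is masked
    factor    : number of equal-size source cells per destination cell
    threshold : hypothetical keyword; a destination cell is masked when
                its masked fraction is strictly greater than this value
    """
    values = np.asarray(values, dtype=float).reshape(-1, factor)
    mask = np.asarray(mask, dtype=bool).reshape(-1, factor)

    valid = ~mask
    n_valid = valid.sum(axis=1)
    masked_fraction = 1.0 - n_valid / float(factor)

    # Average only the valid source cells, so masked values never leak in.
    sums = np.where(valid, values, 0.0).sum(axis=1)
    means = sums / np.maximum(n_valid, 1)

    # Mask on the fraction test, and always when no valid source cell exists.
    out_mask = (masked_fraction > threshold) | (n_valid == 0)
    return np.ma.masked_array(means, mask=out_mask)

src = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
src_mask = np.array([False, False, False, True, True, True, True, False])

# threshold=0.0 is the "safe side" default: any masked fraction masks the cell.
strict = coarsen_with_mask_threshold(src, src_mask, factor=2)
# threshold=0.5 keeps cells that are at most half masked.
lenient = coarsen_with_mask_threshold(src, src_mask, factor=2, threshold=0.5)
```

With the default, only the first destination cell survives; with `threshold=0.5`, the half-masked cells are kept and averaged from their valid source cells only.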

dnadeau4 commented 6 years ago

It seems that everything is working well when I look at the input data (data2).

doutriaux1 commented 6 years ago

@dnadeau4 yes input data are fine, it's the output that is an issue.

dnadeau4 commented 6 years ago

Not really. Look, there is data in Antarctica.

dnadeau4 commented 6 years ago

The values seem high, is that what you meant?

doutriaux1 commented 6 years ago

yes something is going on, the 2 pictures look really different

dnadeau4 commented 6 years ago

Ok, I see that the min is 0 in the conservative.

gleckler1 commented 6 years ago

@dnadeau4 @doutriaux1 @durack1 @taylor13 As a default case, it is better to be on the safe side, meaning there is no mixing of data types. As an example, consider surface wind stress, which can be nearly an order of magnitude larger over land than ocean. If we want to mask out land (we have obs of tauu/v over ocean only), we want to make sure that the interpolated data does not include any of the coastal land points. This may mean - to be safe - that some coastal ocean needs to get masked also... while losing some valid area, it is better than allowing any land influence.
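To put rough numbers on the wind stress example, here is a small NumPy sketch of one coastal destination cell. The magnitudes (0.1 over ocean, 1.0 over land) and the four equal-area source cells are illustrative assumptions, not values from the attached data.

```python
import numpy as np

# Illustrative magnitudes only: land stress about 10x the ocean stress.
ocean_stress = 0.1
land_stress = 1.0

# One destination cell overlapping four equal-area source cells:
# three ocean cells and one coastal land cell.
values = np.array([ocean_stress, ocean_stress, ocean_stress, land_stress])
is_land = np.array([False, False, False, True])

# Area-weighted mean that lets the land cell contribute.
mixed_mean = values.mean()

# Mean over the ocean cells only (land excluded from the average).
ocean_only_mean = values[~is_land].mean()

# The "safe side" default: mask the whole destination cell if any land is present.
safe_value = np.ma.masked if is_land.any() else ocean_only_mean
```

Here a single land cell out of four raises the cell mean from 0.1 to 0.325, more than three times the ocean value, which is the kind of land influence the safe default is meant to exclude.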

durack1 commented 6 years ago

@dnadeau4 what is the current default behavior? Can you point to the code snippet that is executed when regrid is called with ESMF specified?

dnadeau4 commented 6 years ago

@durack1 Thanks for your help.
Let me know if you find the error; this was written by Alex Pletzer and David Kindig.

https://github.com/UV-CDAT/cdms/blob/master/regrid2/Lib/mvESMFRegrid.py#L273-L332 https://github.com/UV-CDAT/cdms/blob/master/regrid2/Lib/mvGenericRegrid.py#L157-L287

dnadeau4 commented 6 years ago

@doutriaux1 I now have a test for this issue. Can you verify that this works for you and close the ticket?