twhiteaker / CFGeom

CF Convention for Representing Simple Geometry Types
MIT License
9 stars 4 forks source link

R-based reference implementation #9

Closed bekozi closed 7 years ago

bekozi commented 8 years ago

The reference implementation for CF simple geometries written in R.

dblodgett-usgs commented 8 years ago

Got this well under way here: https://github.com/dblodgett-usgs/NCDFSG

dblodgett-usgs commented 8 years ago

I've completed an initial ToNCDFSG function that takes standard R sp objects with or without attributes and writes them into basic files that follow the proposed data structure. It is composed of three sub functions, one for instance variable, one for geometry, and one for points. It is also capable of taking lat/lon vectors rather than an sp object of points. I'm going to call this done for the current requirements. As time allows I hope to implement a 'FromNCDFSG function to round trip things, but I need to turn my focus to other things... including the poster and presentation!

For the sake of doing it, I got it building with travis https://travis-ci.org/dblodgett-usgs/NCDFSG and there is 100% test coverage, for what that's worth. There are certainly some edge cases in the data-standard that I'm not testing, but I have done a fairly complete job of making sure features are written faithfully.

See tests like: https://github.com/dblodgett-usgs/NCDFSG/blob/master/tests/testthat/test_line.R#L28

and https://github.com/dblodgett-usgs/NCDFSG/blob/master/tests/testthat/test_poly.R#L53

bekozi commented 8 years ago

Congrats @dblodgett-usgs! Something to think about - does it make sense to have your R implementation as part of the main simple geometries repo?

dblodgett-usgs commented 8 years ago

Not sure how mixing Python and R in the same repo would work... Travis would be unhappy. Different branches?

I'm actually thinking about putting it into another R package... https://github.com/USGS-R/netcdf.dsg

It will be available on GRAN and maybe eventually CRAN.

dblodgett-usgs commented 8 years ago

I've finished up my read/write work. Going to declare victory for now. The week before AGU, I'll probably start incorporating this functionality into the main netcdf.dsg package and test it out on some real model data with some USGS colleagues.

bekozi commented 8 years ago

Not sure how mixing Python and R in the same repo would work... Travis would be unhappy. Different branches?

I'm pretty sure we could work it out. Why would Travis be unhappy? I haven't looked into it so really have no idea.

I had planned on having R source directory under src. Hence the src/python folder. I'm wary of fragmenting too early, but it is your code.

dblodgett-usgs commented 8 years ago

I smell what you're cooking. It would make sense to keep the reference implementation all in one place. Looks like it's possible to do it... https://docs.travis-ci.com/user/speeding-up-the-build/#Parallelizing-your-builds-across-virtual-machines

bekozi commented 8 years ago

Interesting. I'll open a ticket to investigate merging the Python and R Travis testing scripts.

bekozi commented 8 years ago

Our intention has been to use "continuous" as opposed to "contiguous": https://github.com/dblodgett-usgs/NCDFSG/blob/59019891449de049cbc78d730372a62c14135ebc/R/addGeomData.R#L113. Let me know if this assumption is mistaken!

bekozi commented 8 years ago

I'm wrong...it's contiguous. :cry:

dblodgett-usgs commented 8 years ago

I've made that type SOOOO many times!!!

dblodgett-usgs commented 7 years ago

@twhiteaker I just updated my R implementetion... removed the geom_dimension, added the featureType attribute, and added cf_role to the x/y coordinate data. Care to take a look over here: https://github.com/bekozi/netCDF-CF-simple-geometry/wiki/Examples:-Contiguous-Ragged-Arrays and see if you see anything amiss?

twhiteaker commented 7 years ago

Just started looking at Point. I hadn't thought about what we leave out for the simple point case. The example point CDL doesn't include the index variables, and there's no explicit indication that a point geometry is being described. Flagged as #52.

On to the Line example. Looks like coordinate index starts at 1. It should start at zero according to section 9.3.5 of the proposed CF spec. The coordinate stop should instead be a coordinate start (also from 9.3.5).

The multipoint example is really just a point example with four instances. Can you change that to a multipoint example with a single instance with four parts? You oughta have a coordinate index and a coordinate start variable in there.

The rest looks pretty good. I didn't check the actual coordinates for the last example. The rest of the coordinates look good assuming the WKT is correct.

What's the emoji for uncrossing your eyes after staring at CDL?

dblodgett-usgs commented 7 years ago

I'm going to close this and reopen a summary of changes to the R reference implementation based on our new direction.