Reading-eScience-Centre / ncwms

ncWMS - A Web Map Service for displaying environmental data over the web

netCDF in projected x,y reports bounds in getcaps as CRS:84 #68

Open scaddenp opened 3 years ago

scaddenp commented 3 years ago

I have a netCDF file stored in projected x,y coordinates, and I am pretty sure it is set up to the CF standard. However, the getCaps call returns layer details where latlngBoundingBox = BoundingBox, and BoundingBox is reported with an SRS of CRS:84.

WMS calls where the requested SRS is the same as the native SRS are rather slow, and I am wondering whether the data reader is converting from the native grid to CRS:84 and then WMS is converting again to the requested SRS?

guygriffiths commented 3 years ago

The GetCapabilities <EX_GeographicBoundingBox> needs to be in CRS:84, and I think that's the only bounding box reported by GetCapabilities?

I don't think there will be any native -> CRS:84 -> native conversion, it should just read the native grid. If you can send a file/request URL combination that's giving you trouble I can have a look at it.

scaddenp commented 3 years ago

getCaps, in both 1.1.1 and 1.3.0 report

as well as EX_GeographicBoundingBox (1.3.0) or LatLonBoundingBox (1.1.x)

I have a backend mapping system which supports multiple different web services (e.g. see https://data.gns.cri.nz/tez). Building it, I have worked with many different implementations of WMS. In all of these, I successfully used either a dedicated tag, the SRS attribute, or BoundingBox to determine the "native" SRS of a WMS layer. It isn't a big deal, because it is only required for CQL queries with spatial predicates in systems that require those in the native SRS. It did make me wonder, though, whether double projection was occurring. I want to look at the data extraction code anyway, so I will follow up on that.
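The client-side heuristic described above (taking the "native" SRS from a layer's BoundingBox) could look roughly like the Python sketch below. The XML fragment is a made-up WMS 1.3.0 layer with illustrative values, `guess_native_crs` is a hypothetical helper name, and treating both CRS:84 and EPSG:4326 as "plain lat/lon" is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical WMS 1.3.0 layer fragment; names and numbers are illustrative only.
SAMPLE_LAYER = """
<Layer xmlns="http://www.opengis.net/wms">
  <Name>example/head</Name>
  <EX_GeographicBoundingBox>
    <westBoundLongitude>172.0</westBoundLongitude>
    <eastBoundLongitude>173.0</eastBoundLongitude>
    <southBoundLatitude>-44.0</southBoundLatitude>
    <northBoundLatitude>-43.0</northBoundLatitude>
  </EX_GeographicBoundingBox>
  <BoundingBox CRS="CRS:84" minx="172.0" miny="-44.0" maxx="173.0" maxy="-43.0"/>
</Layer>
"""

WMS_NS = {"wms": "http://www.opengis.net/wms"}
# Assumption: these two identifiers are treated as plain lat/lon.
GEOGRAPHIC = {"CRS:84", "EPSG:4326"}


def guess_native_crs(layer_xml):
    """Return the first BoundingBox CRS that is not plain lat/lon, else None."""
    layer = ET.fromstring(layer_xml)
    for bbox in layer.findall("wms:BoundingBox", WMS_NS):
        crs = bbox.get("CRS")
        if crs and crs not in GEOGRAPHIC:
            return crs
    return None
```

With a layer like the sample, where the only BoundingBox is in CRS:84, the helper returns None, which is the situation this issue reports.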

guygriffiths commented 3 years ago

Ah, yes I see the issue, it's not differentiating between EX_GeographicBoundingBox and BoundingBox. Can you send me a data file which is giving you issues? I'm having trouble finding something in a different CRS (everything non-CRS:84 I have also includes 2d lat/lon, so it shows up as CRS:84).

scaddenp commented 3 years ago

I have sent you a link to the file via email. It doesn't cross the antimeridian. I was under the impression all CF-compliant netCDF needed lat/lon? The header for this one is:

netcdf hpm_outputs {
dimensions:
    time = 76 ;
    lay = 8 ;
    x = 501 ;
    y = 302 ;
    realn = 10 ;
variables:
    int time(time) ;
        time:standard_name = "time" ;
        time:axis = "T" ;
        time:long_name = "time" ;
        time:units = "days since 1940-07-01 00:00:00" ;
    char transverse_mercator ;
        transverse_mercator:grid_mapping_name = "transverse_mercator" ;
        transverse_mercator:longitude_of_central_meridian = 173. ;
        transverse_mercator:false_easting = 1600000. ;
        transverse_mercator:false_northing = 10000000. ;
        transverse_mercator:latitude_of_projection_origin = 0. ;
        transverse_mercator:scale_factor_at_central_meridian = 0.9996 ;
        transverse_mercator:long_name = "CRS definition" ;
        transverse_mercator:longitude_of_prime_meridian = 0. ;
        transverse_mercator:semi_major_axis = 6378137. ;
        transverse_mercator:semi_minor_axis = 6356752.31414036 ;
        transverse_mercator:reference_ellipsoid_name = "GRS 1980" ;
        transverse_mercator:prime_meridian_name = "Greenwich" ;
        transverse_mercator:geographic_crs_name = "NZGD2000" ;
        transverse_mercator:horizontal_datum_name = "New Zealand Geodetic Datum 2000" ;
        transverse_mercator:projected_crs_name = "NZGD2000 / New Zealand Transverse Mercator 2000" ;
        transverse_mercator:inverse_flattening = 298.257222101 ;
        transverse_mercator:spatial_ref = "PROJCRS[\"NZGD2000 / New Zealand Transverse Mercator 2000\",BASEGEOGCRS[\"NZGD2000\",DATUM[\"New Zealand Geodetic Datum 2000\",ELLIPSOID[\"GRS 1980\",6378137,298.257222101,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]], ID[\"EPSG\",4167]],CONVERSION[\"New Zealand Transverse Mercator 2000\",METHOD[\"Transverse Mercator\", ID[\"EPSG\",9807]],PARAMETER[\"Latitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433], ID[\"EPSG\",8801]],PARAMETER[\"Longitude of natural origin\",173,ANGLEUNIT[\"degree\",0.0174532925199433], ID[\"EPSG\",8802]],PARAMETER[\"Scale factor at natural origin\",0.9996,SCALEUNIT[\"unity\",1],ID[\"EPSG\",8805]],PARAMETER[\"False easting\",1600000,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",10000000,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]], CS[Cartesian,2], AXIS[\"northing (N)\",north,ORDER[1],LENGTHUNIT[\"metre\",1]], AXIS[\"easting (E)\",east,ORDER[2],LENGTHUNIT[\"metre\",1]],USAGE[SCOPE[\"unknown\"],AREA[\"New Zealand - onshore\"],BBOX[-47.33,166.37,-34.1,178.63]],ID[\"EPSG\",2193]]" ;
    double lon(y, x) ;
        lon:units = "degrees_east" ;
        lon:long_name = "longitude coordinate" ;
        lon:standard_name = "longitude" ;
    double lat(y, x) ;
        lat:units = "degrees_north" ;
        lat:long_name = "latitude coordinate" ;
        lat:standard_name = "latitude" ;
    int lay(lay) ;
        lay:standard_name = "model_level_number" ;
        lay:long_name = "Model layer" ;
        lay:positive = "down" ;
        lay:axis = "Z" ;
        lay:units = "1" ;
    double x(x) ;
        x:standard_name = "projection_x_coordinate" ;
        x:long_name = "x coordinate of projection" ;
        x:axis = "X" ;
        x:units = "m" ;
    double y(y) ;
        y:standard_name = "projection_y_coordinate" ;
        y:long_name = "y coordinate of projection" ;
        y:axis = "Y" ;
        y:units = "m" ;
    int realn(realn) ;
        realn:long_name = "Realisation" ;
        realn:standard_name = "realization" ;
        realn:units = "1" ;
    float head(realn, time, lay, y, x) ;
        head:_FillValue = -1.e+30f ;
        head:grid_mapping = "transverse_mercator" ;
        head:long_name = "Simulated GW Head" ;
        head:standard_name = "(no standard name)" ;
        head:units = "m" ;
        head:coordinates = "lat lon" ;
    float s_flow(realn, time, y, x) ;
        s_flow:_FillValue = -1.e+30f ;
        s_flow:grid_mapping = "transverse_mercator" ;
        s_flow:long_name = "Simulated Stream Flow" ;
        s_flow:standard_name = "(no standard name)" ;
        s_flow:units = "m3/d" ;
        s_flow:coordinates = "lat lon" ;
    float s_flux(realn, time, y, x) ;
        s_flux:_FillValue = -1.e+30f ;
        s_flux:grid_mapping = "transverse_mercator" ;
        s_flux:long_name = "Simulated Stream Flux to GW" ;
        s_flux:standard_name = "(no standard name)" ;
        s_flux:units = "m3/d" ;
        s_flux:coordinates = "lat lon" ;

// global attributes:
        :title = "hpm model outputs" ;
        :description = "ensemble simulated outputs" ;
        :file_creation_time = "2020-08-10 17:03:38.054010" ;
        :Conventions = "CF-1.6" ;
        :institution = "GNS Science" ;
        :source = "PEST++/pyEMU" ;
}
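For readers less familiar with CF grid mappings, the transverse_mercator attributes in this header carry the same information as a PROJ-style definition. The Python sketch below copies the attribute values from the header; the function name `cf_tmerc_to_proj` is hypothetical, and the attribute-to-parameter mapping is the conventional one for transverse Mercator only. It is an illustration, not code from ncWMS or netCDF-Java.

```python
# Attribute values copied from the transverse_mercator variable in the header.
GRID_MAPPING = {
    "grid_mapping_name": "transverse_mercator",
    "longitude_of_central_meridian": 173.0,
    "false_easting": 1600000.0,
    "false_northing": 10000000.0,
    "latitude_of_projection_origin": 0.0,
    "scale_factor_at_central_meridian": 0.9996,
    "semi_major_axis": 6378137.0,
    "inverse_flattening": 298.257222101,
}

# CF attribute name -> PROJ parameter name (transverse Mercator only).
CF_TO_PROJ = [
    ("latitude_of_projection_origin", "lat_0"),
    ("longitude_of_central_meridian", "lon_0"),
    ("scale_factor_at_central_meridian", "k_0"),
    ("false_easting", "x_0"),
    ("false_northing", "y_0"),
    ("semi_major_axis", "a"),
    ("inverse_flattening", "rf"),
]


def cf_tmerc_to_proj(attrs):
    """Build a PROJ-style string from CF transverse_mercator attributes."""
    if attrs.get("grid_mapping_name") != "transverse_mercator":
        raise ValueError("only transverse_mercator is handled in this sketch")
    parts = ["+proj=tmerc"]
    for cf_name, proj_name in CF_TO_PROJ:
        if cf_name in attrs:
            parts.append("+%s=%r" % (proj_name, float(attrs[cf_name])))
    return " ".join(parts)
```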

guygriffiths commented 3 years ago

> I am wondering whether the data reader is converting from native grid to CRS:84 and then WMS is converting again to the requested SRS?

I've looked into it and yes, this is indeed what's happening. Transformed grids are also reporting their native CRS as CRS:84, which is what is appearing in the capabilities document.

The underlying issue here is that the NetCDF-Java library we are using gives us a co-ordinate system with a Projection object which we can use to translate between CRS:84 and the native projection of the data (and vice versa). But what it doesn't give us is a nicely encapsulated description of the native projection, nor an EPSG identifier. Without those things, it's not possible to properly report the native projection, nor determine whether co-ordinates are in the same reference system.

So for your example data, what ncWMS is essentially doing is: Position in EPSG:2193 (from request URL) -> Position in CRS:84 -> Position in unidentified CRS (which happens to be EPSG:2193)

So it's not currently possible to fix this. I'm going to leave the issue open though, since the NetCDF-Java library is currently undergoing a fairly big revision, and there's a chance that this sort of thing may be supported in the not too distant future.

scaddenp commented 3 years ago

Ok. Good to know. Perhaps I will raise issue on that library. I notice the dependency is 4.5.1 whereas netcdf-java is currently at 5.3.3. Maybe I should verify the problem on latest library first?

ghansham commented 3 years ago

Dear All,

I want to add my experience with netCDF-Java. CF-compliant datasets define projections using grid-mapping variables, which are general enough to describe a projection through its parameters. EPSG codes are mostly used in GeoTIFF files. The CF community has started supporting CRS WKT strings, but priority is still given to the standard CF projection variables. There is a lot of discussion in the CF community; Open Data Cube (Geoscience Australia) in particular, which stores data in netCDF, is looking for stronger support for CRS WKT strings. If we could expose the WKT string, when available through the ncj library, as part of the layerDetails request, it might partially address the issue.

Regards Ghansham


scaddenp commented 3 years ago

Ghansham, thanks for that perspective. I don't think it matters whether it is WKT or EPSG (which is rather heavily used for many things), as Apache SIS (and other systems) can handle either. The core issue for me is whether data stored in a projected coordinate system can be extracted directly in that same coordinate system, without the code doing unnecessary projections/unprojections. I haven't yet had time to delve into the reader code to see what goes on.

ghansham commented 3 years ago

My point was only about getting it reflected in the metadata, not about reprojection. Sorry if I miscommunicated.


guygriffiths commented 3 years ago

> Ok. Good to know. Perhaps I will raise issue on that library. I notice the dependency is 4.5.1 whereas netcdf-java is currently at 5.3.3. Maybe I should verify the problem on latest library first?

I'm going to do a release this week that depends on 5.3.3, but we've been using the 5.x branch for years - where are you seeing the 4.5.1 dependency?

scaddenp commented 3 years ago

I just looked at the jars in the ncwms2 project. I didn't look at the pom.



guygriffiths commented 3 years ago

Ah, you've probably seen that we're using version 5.1.0 of the netcdf4 library. The file is netcdf4-5.1.0.jar which scans as 4.5.1 at first (and often second and third!) glance.
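To avoid misreading versions straight off the file names, the artifact name and version can be split mechanically. This is a minimal Python sketch assuming the usual `<artifact>-<version>.jar` naming with a purely numeric dotted version; `split_jar_name` is a hypothetical helper, not part of any ncWMS tooling.

```python
import re

# Assumes "<artifact>-<numeric.dotted.version>.jar"; qualifiers such as
# "-SNAPSHOT" are deliberately not handled in this sketch.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d+(?:\.\d+)*)\.jar$")


def split_jar_name(filename):
    """Return (artifact, version) for a jar file name, or None if it does not match."""
    match = JAR_NAME.match(filename)
    if match is None:
        return None
    return match.group("artifact"), match.group("version")
```

For the file mentioned above, `split_jar_name("netcdf4-5.1.0.jar")` separates the artifact `netcdf4` from the version `5.1.0`.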

ghansham commented 3 years ago

AFAIK, ncwms2 hardly ever used ncj 4.x; it was ncwms1 which used ncj 4.x. I may be wrong, because for a while I lost track of the latest updates.


scaddenp commented 3 years ago

Ouch. You are right of course. It would have been more obvious if I had checked the pom.

Let me try to get a better understanding of the problem. To my mind there are two parts:
1/ reporting the SRS in WMS
2/ extracting the data from the netCDF without reprojecting

1/ is a "nice to have". The Prj2EPSG API or the code from opengeo could do that. A rewrite would be desirable however to work with the Apache SIS resources.

2/ is more important. Do I understand correctly that if we KNOW the native coordinates in the netCDF are the same as the requested SRS, then it would be possible to extract directly? If so, then this should be achievable. A projection object can be created from the requested SRS, and its components compared with those of the native projection object on a "fast fail" basis.
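The "fast fail" comparison suggested in 2/ could be sketched as follows in Python. The native parameters are copied from the header posted earlier in this thread; `EPSG_PARAMS` is a hypothetical stand-in for a real registry lookup (for example via Apache SIS), and `same_projection` is an illustrative name, not an existing edal-java or netCDF-Java method.

```python
import math

# Native projection, from the CF grid-mapping variable in the header.
NATIVE = {
    "grid_mapping_name": "transverse_mercator",
    "longitude_of_central_meridian": 173.0,
    "latitude_of_projection_origin": 0.0,
    "scale_factor_at_central_meridian": 0.9996,
    "false_easting": 1600000.0,
    "false_northing": 10000000.0,
}

# Hypothetical registry stand-in; the EPSG:2193 values are copied from the
# WKT (spatial_ref) in the same header.
EPSG_PARAMS = {
    "EPSG:2193": {
        "grid_mapping_name": "transverse_mercator",
        "longitude_of_central_meridian": 173.0,
        "latitude_of_projection_origin": 0.0,
        "scale_factor_at_central_meridian": 0.9996,
        "false_easting": 1600000.0,
        "false_northing": 10000000.0,
    }
}


def same_projection(native, requested, rel_tol=1e-9):
    """Return False at the first parameter that differs, True if all agree."""
    # Cheapest test first: a different projection type fails immediately.
    if native.get("grid_mapping_name") != requested["grid_mapping_name"]:
        return False
    for name, want in requested.items():
        if name == "grid_mapping_name":
            continue
        have = native.get(name)
        # abs_tol keeps the comparison meaningful for parameters equal to zero.
        if have is None or not math.isclose(have, want, rel_tol=rel_tol, abs_tol=1e-9):
            return False
    return True
```

A fuller comparison would also need to check the ellipsoid and datum; they are left out here to keep the sketch short.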

ghansham commented 3 years ago

@Guy Sir

Do we resample the data twice, or are only position calculations being done in the process you describe: native grid -> CRS:84 -> WMS SRS (which in this case happens to be the native grid)?

If only position calculations are being done, it's only a minuscule performance hit. And we can always show numerically that the two-step transformation makes effectively no change to the x,y coordinate value that is read from the actual file. So there is no error in the calculated position from which data is read.

We need to raise this with the ncj community: if there is a database and associated jar that can map a grid mapping object to a well-defined CRS, we can skip the transformed grid when native grid == WMS SRS.

Just an idea..

Regards
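The numerical check mentioned above can be illustrated with a small Python sketch. To stay short it uses the spherical transverse Mercator formulas with a mean Earth radius, so it is a simplification and not the ellipsoidal NZTM2000 projection; the central meridian, scale factor and false origin are taken from the header, while the radius and function names are assumptions of this sketch.

```python
import math

R = 6371000.0                     # mean Earth radius in metres (sphere, an assumption)
K0 = 0.9996                       # scale factor at central meridian (from the header)
LON0 = math.radians(173.0)        # central meridian (from the header)
LAT0 = 0.0                        # latitude of origin (from the header)
FE, FN = 1600000.0, 10000000.0    # false easting / northing (from the header)


def native_to_lonlat(x, y):
    """Inverse spherical transverse Mercator: projected metres -> degrees."""
    xn = (x - FE) / (R * K0)
    d = (y - FN) / (R * K0) + LAT0
    lat = math.asin(math.sin(d) / math.cosh(xn))
    lon = LON0 + math.atan2(math.sinh(xn), math.cos(d))
    return math.degrees(lon), math.degrees(lat)


def lonlat_to_native(lon_deg, lat_deg):
    """Forward spherical transverse Mercator: degrees -> projected metres."""
    lat = math.radians(lat_deg)
    dlon = math.radians(lon_deg) - LON0
    b = math.cos(lat) * math.sin(dlon)
    x = FE + R * K0 * math.atanh(b)
    y = FN + R * K0 * (math.atan2(math.tan(lat), math.cos(dlon)) - LAT0)
    return x, y


def round_trip_error(x, y):
    """Distance in metres between a point and its native -> lon/lat -> native image."""
    lon, lat = native_to_lonlat(x, y)
    x2, y2 = lonlat_to_native(lon, lat)
    return math.hypot(x2 - x, y2 - y)
```

The positional error of the round trip is at floating-point level, so the cost of the double transformation is in the repeated trigonometric work per pixel rather than in accuracy.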


scaddenp commented 3 years ago

I am pretty sure data is read once, but projecting an entire grid twice is a computational cost. To me, ncWMS is pretty slow compared to other WMS sources (we make heavy use of GeoServer and ArcGIS Server). Part of that could be improved with tile caching, but eliminating unnecessary calculations should also help.

ghansham commented 3 years ago

I have been using ncwms1 for years and only recently switched to ncwms2. By default ncwms2 uses the scanline strategy when reading data from local netCDF files. For remote and compressed files, it uses the bounding box strategy. I found that the moment I switched to bounding box for my files, tile generation was much faster. I inserted System.currentTimeMillis() calls and came down from 100 ms to 6 ms. Interestingly, it is not possible to configure this from outside, but it worked for me. If you have a high-end machine, I would suggest you use bounding box. Bounding box may be more memory intensive, but it makes fewer I/O calls: rather than a 256*datatypesize buffer, you are using 256*256*datatypesize.

Maybe the topic has diverged from projected coordinates, but if you are concerned about performance, it will surely help you.

Regards Ghansham
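The trade-off between the two strategies can be made concrete with a small Python sketch that only counts reads and buffer sizes; it performs no real I/O and is not the edal-java implementation. It assumes scanline means one read per grid row spanning the needed columns, and bounding box means a single read of the enclosing rectangle. Both function names are hypothetical.

```python
def scanline_reads(pixels):
    """Return (read calls, largest buffer, total elements) for row-by-row reads."""
    rows = {}
    for x, y in pixels:
        lo, hi = rows.get(y, (x, x))
        rows[y] = (min(lo, x), max(hi, x))
    spans = [hi - lo + 1 for lo, hi in rows.values()]
    return len(spans), max(spans), sum(spans)


def bounding_box_reads(pixels):
    """Return (read calls, largest buffer, total elements) for one enclosing read."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    size = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
    return 1, size, size


# A 256 x 256 tile that maps one-to-one onto grid cells.
tile = [(x, y) for y in range(256) for x in range(256)]
```

For this tile, scanline needs 256 reads with a 256-element buffer each, while bounding box needs one read with a 256*256-element buffer, which matches the buffer sizes quoted above once multiplied by the data type size.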


scaddenp commented 3 years ago

Got a hint on how to do that switch? Is that fiddling with CdmUtils?

ghansham commented 3 years ago

Yes switch this to bb https://github.com/Reading-eScience-Centre/edal-java/blob/d3ec32b5beca4e8f43009d69f761e2382ae4adc6/cdm/src/main/java/uk/ac/rdg/resc/edal/util/cdm/CdmUtils.java#L144


scaddenp commented 3 years ago

Thanks. I will set up a timing test.

guygriffiths commented 3 years ago

> Do I understand that if we KNOW the native coordinates in the netCDF are the same as the requested SRS, then it would be possible to extract directly? If so, then this should be achievable. A projection object can be created from requested SRS and components of projection object compared on a "fast fail" basis.

Yes, essentially this behaviour is encapsulated in uk.ac.rdg.resc.edal.grid.cdm.CdmTransformedGrid#findIndexOf. It would need some modification, but if we had a method to compare SRS code with the Projection object and say whether they're equivalent, it'd be a straightforward change.
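The kind of modification described here might look like the Python sketch below: when the requested SRS is known to match the native projection, both transforms are skipped and the position is looked up directly on the regular x/y axes. This is not the code of CdmTransformedGrid#findIndexOf; `RegularAxis`, `find_index_of`, the `same_crs` flag and the axis origin and spacing in the usage note are all hypothetical (only the axis sizes 501 and 302 come from the header).

```python
from dataclasses import dataclass


@dataclass
class RegularAxis:
    first: float   # coordinate of the first cell centre
    step: float    # spacing between cell centres
    size: int      # number of cells

    def index_of(self, value):
        """Nearest cell index for a coordinate, or None if off the axis."""
        i = round((value - self.first) / self.step)
        return i if 0 <= i < self.size else None


def find_index_of(x, y, x_axis, y_axis, same_crs, to_lonlat=None, to_native=None):
    """Grid indices for a requested position, skipping transforms when possible."""
    if not same_crs:
        # Current behaviour as described in this thread: request SRS -> CRS:84 -> native.
        lon, lat = to_lonlat(x, y)
        x, y = to_native(lon, lat)
    i, j = x_axis.index_of(x), y_axis.index_of(y)
    if i is None or j is None:
        return None
    return i, j
```

With hypothetical axes `RegularAxis(1500000.0, 200.0, 501)` and `RegularAxis(5100000.0, 200.0, 302)`, a call with `same_crs=True` goes straight to the index lookup without touching either transform.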

scaddenp commented 3 years ago

Just getting back to this. I tried the change from scanline to bounding box. It is faster, but not dramatically so. On the first run (netCDF not in cache) it was 2.1 s per tile versus 2.4 s, for a total load time of about 7.5 s versus 8.4 s. On the second run (netCDF in cache) the total tile load time was 1.8 s versus 2.45 s. I suspect changing the client to a single-tile load instead of tiled WMS would be faster, but the antimeridian is likely a show stopper. I think a tile cache might be a better solution.

ghansham commented 3 years ago

Are you using compressed netcdf files? Can you share a sample file?


scaddenp commented 3 years ago

Possibly - I admit that I haven't looked closely at the file - I just used it for testing as I knew it was slow and that it crossed the antimeridian. It is 600 MB and can be found here https://share.gns.cri.nz/SPCKRK82KYT0/NZ_regional_BA_onshore_FAA_offshore_15as.nc.html

I could repeat tests with a smaller file, but I was looking for something that would highlight the difference.

ghansham commented 3 years ago

Yeah that's ok. I will have a look and let you know.
