ERDDAP / erddap

ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).
Creative Commons Zero v1.0 Universal

include web Mercator projection (EPSG:3857) in ERDDAP WMS for using easy leaflet maps with nice basemaps #24

Open bbest opened 5 years ago

bbest commented 5 years ago

CC: @ehazen, @7yl4r

Hi @BobSimons,

Thank you, thank you for ERDDAP, a great and useful product that offers so many ways to slice, dice and represent data for a plethora of analyses and visualizations.

One of the more useful and easy representations of data is a Web Map Service (WMS), which drops easily into a "slippy" online map using JavaScript libraries like Leaflet and all the pretty basemaps like Esri.OceanBasemap and others from the Leaflet Provider Demo:

image

However, the ERDDAP WMS currently only serves the geographic projection EPSG:4326 and not the web Mercator projection EPSG:3857 needed to use these great basemaps. Here's some futzing I did trying various combinations to make this work:

WMS Testing for Fixing Explorer Environmental Map

image

Could this be a matter of simply adding or updating instances in these Java files where 4326 shows up?

Thanks very much, Ben

BobSimons commented 5 years ago

Yes. "Add support for Web Mercator projection" is already on the To Do list.

Because of the nature of the graphics library that ERDDAP uses to generate maps (which was selected long ago, before most of the current libraries existed), it is not possible to simply add a new projection. So it is a more difficult project. Similarly, it would/will not be easy to switch to another graphing library. The To Do list is long. I just haven't done this item yet. I'm sorry.
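To illustrate why Web Mercator support is more than a change of EPSG label, here is a minimal Python sketch of the standard spherical Mercator formulas used by EPSG:3857. It is not ERDDAP code; the function names are made up for this example. Equal steps in latitude map to increasingly large steps in projected y, so an image rendered on a regular lat/lon grid cannot simply be relabelled as EPSG:3857.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project lon/lat in degrees to EPSG:3857 x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Inverse projection: EPSG:3857 meters back to lon/lat in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Equal 20-degree latitude steps are not equally spaced in Web Mercator.
for lat in (0, 20, 40, 60, 80):
    _, y = lonlat_to_web_mercator(0, lat)
    print(f"lat {lat:>2} -> y = {y / 1000:.0f} km")
```

A server therefore has to resample the data onto the projected grid for every request, which is the part that depends on the graphics library.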

bbest commented 5 years ago

Ok, glad to hear Web Mercator projection for slippy maps is on your TODO. Too bad it's not an easy fix. I'm not a Java programmer, but please let me know if I can help.

It seems like a common approach for serving WMS images and other OGC services (WCS raster, WFS vector) is to set up a GeoServer alongside ERDDAP for the same data sources. For instance, I see this with these projects:

This might not be the best place to ask, but do you have any pointers on how best to link ERDDAP and GeoServer?

Ideally, you could discover one through the other (eg metadata mention of alternate providers through catalog search), but not sure where such information would live other than highest level of a website.

CC'ing Axiom folks who set up docker-erddap and docker-geoserver in case they might have any tips: @kwilcox, @shane-axiom, @rsignell-usgs

kwilcox commented 5 years ago

@bbest I'm going to guess the dataset you are trying to access through ERDDAP's WMS is netCDF based, and the short answer is that netCDF and GeoServer don't play nicely together. There was some movement towards a GeoServer netCDF backend at some point, maybe 8 years ago, but I believe progress stalled and it was never completed. This could use a little bit of research (let me know if you find anything recent).

If you are looking for WMS from a netCDF data source in a projection other than 4326, you'll have to look towards:

Can you link me to the ERDDAP dataset you are trying to plot? I'll poke at it to see if it moves.

ghost commented 5 years ago

IIRC when I looked at the GeoServer netCDF reader a few years ago we decided against using it because it didn't handle a lot of real world complexities (e.g. anti-meridian wrapping) and didn't include support for community conventions like CF because it didn't make use of Unidata's netcdf-java library. That being said, if your use case isn't complex you might take a look and see if it works for you.

bbest commented 5 years ago

Thank you very much @kwilcox and @srstclair for weighing in on this given all your experience. I'm generally interested in quickly visualizing remotely sensed variables like SST and chlorophyll globally (ie WMS), as well as summarizing these data over time by arbitrary polygons (ie ERDDAP slicing/fetching). For example, in the WMS Testing for Fixing Explorer Environmental Map page I shared previously, I used the Multi-scale Ultra-high Resolution (MUR) SST Analysis with R's leaflet package:

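library(leaflet)

# Basemap (assumed from the earlier comment) plus the ERDDAP WMS layer
leaflet() %>%
  addProviderTiles(providers$Esri.OceanBasemap) %>%
  addWMSTiles(
    baseUrl = 'https://coastwatch.pfeg.noaa.gov/erddap/wms/jplMURSST41mday/request?',
    layers = "jplMURSST41mday:sst",
    options = WMSTileOptions(
      version = "1.3.0", format = "image/png", transparent = TRUE, opacity = 0.7,
      time = "2018-07-16T00:00:00Z"))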
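For readers who want to see what a WMS client sends on the wire, the following Python sketch builds a WMS 1.3.0 GetMap URL for the same layer. The helper name `getmap_url` and the default tile size are assumptions made for this example. Note that in WMS 1.3.0 the BBOX for EPSG:4326 is ordered minLat,minLon,maxLat,maxLon.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, crs="EPSG:4326", time=None,
               width=256, height=256, fmt="image/png"):
    """Build a WMS 1.3.0 GetMap URL; bbox must follow the axis order of crs."""
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",
        "crs": crs,
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": fmt,
        "transparent": "true",
    }
    if time is not None:
        params["time"] = time
    # base_url is expected to end with "?" as in the R snippet above
    return base_url + urlencode(params)


example = getmap_url(
    "https://coastwatch.pfeg.noaa.gov/erddap/wms/jplMURSST41mday/request?",
    "jplMURSST41mday:sst",
    bbox=(-90, -180, 90, 180),  # whole globe in lat,lon order
    time="2018-07-16T00:00:00Z",
)
print(example)
```

A Leaflet map in its default CRS would instead send `crs=EPSG:3857` with a bounding box in meters, which is the request the ERDDAP WMS cannot currently answer.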

The ncWMS and sci-wms look great, particularly for slicing by time and variable. The NetCDF — GeoServer 2.15.x User Manual suggests good support for COARDS compliance (custom, Time, Elevation, Lat, Lon).

I'm also working with MBON and interested in being able to programmatically search the catalog of datasets in the portal at https://mbon.ioos.us developed by Axiom, but not sure how to do that, even with the excellent documentation. Do you know the base ERDDAP URL? I'm able to search through the GeoServer OGC GetCapabilities endpoints, but that's just a short name and not the other metadata, hence interest in how datasets could be cross-tagged across ERDDAP and GeoServer.
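As a rough illustration of programmatic catalog search against an ERDDAP server, the sketch below builds a full-text search URL and flattens the JSON table that ERDDAP returns. The coastwatch base URL is taken from this thread and is only a stand-in (the MBON portal's ERDDAP base URL is not known here); the sample payload and its column names are hypothetical and trimmed for brevity.

```python
from urllib.parse import urlencode


def erddap_search_url(erddap_base, search_for, page=1, items_per_page=1000):
    """Build an ERDDAP full-text search URL that returns a JSON table."""
    query = urlencode({"page": page, "itemsPerPage": items_per_page,
                       "searchFor": search_for})
    return f"{erddap_base.rstrip('/')}/search/index.json?{query}"


def rows_to_dicts(payload):
    """Turn ERDDAP's {"table": {"columnNames": [...], "rows": [...]}} into dicts."""
    table = payload["table"]
    return [dict(zip(table["columnNames"], row)) for row in table["rows"]]


url = erddap_search_url("https://coastwatch.pfeg.noaa.gov/erddap", "MUR SST")
print(url)

# Hypothetical, trimmed response used only to show the parsing step.
sample = {"table": {"columnNames": ["Title", "Dataset ID"],
                    "rows": [["Example SST title", "jplMURSST41mday"]]}}
print(rows_to_dicts(sample))
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the decoded JSON to `rows_to_dicts` would give one dictionary per matching dataset.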

mwengren commented 5 years ago

@kwilcox @srstclair @bbest I dug up some old links on GeoServer and netCDF support that I had been forwarded previously. NOAA/NWS recently (< 8 yrs ago) funded some work to allow GeoServer to support a particular model dataset (Rapid Refresh: RAP) in GRIB2. Info here:

https://github.com/bencaradocdavies/geoserver/wiki/RAP-Native-Grid and https://github.com/bencaradocdavies/geoserver/wiki/RAP-Native-Grid-ImageMosaic

Including use of netCDF-Java to read the source data to feed to GeoTools and in turn GeoServer (see the first link).

However, I'm not sure whether they were merged into the primary GeoServer product or if it's only a custom build as outlined in the wiki link. It looks like part of the project involved adding support for the specific RAP grid to netCDF-Java, so clearly netCDF-Java is part of the equation for at least this fork.

There's a lot of configuration involved in order to make GeoServer read netCDF (or GRIB2 in this case) via GeoTools ImageMosaic, so that's still a limitation/inconvenience, but maybe even so there's a path forward to make ERDDAP and GeoServer play well together. I would be interested to see that. It could go a ways towards adding some of the functionalities you're looking for to ERDDAP (or maybe alongside ERDDAP).

Maybe the magic of GitHub mentions will help, if @bencaradocdavies can share if this has gone anywhere since he worked on it.

bencaradocdavies commented 5 years ago

@mwengren @kwilcox @srstclair @bbest the current GeoServer NetCDF ("netcdf") plugin is built on the Unidata NetCDF-Java library and is well supported. I added a NetCDF example in the GeoServer Quickstart in the OSGeoLive image.

The NetCDF Output ("netcdf-out") plugin is used for WCS delivery of NetCDF 3 and 4. I do not think this is what you want.

The RAP Native work I did in 2016-2017 added projection support and netcdf-out improvements. It was all merged into GeoServer 2.9. The use-case in the links above is the most complicated, and involves a mosaic that concatenates multiple GRIB2 files for delivery as NetCDF via WCS. GeoServer configuration to deliver WMS from NetCDF is much, much simpler.

BobSimons commented 5 years ago

Connecting ERDDAP and GeoServer has been on my list of things to do (or at least explore) for years. If any of you want to pursue it, please do. If, as it sounds, there is a netcdf-java to GeoServer connection, then it seems likely that there would be a way to take all the information that an ERDDAP has about its datasets and automatically make corresponding datasets in GeoServer (at least for the gridded datasets). I suspect that the current netcdf-java to GeoServer connection assumes that the data is gridded geographic data. It would also be interesting to see if the tabular datasets in ERDDAP which have a CF DSG/cdm_data_type/featureType could also be made into appropriate datasets in GeoServer and hence available via GeoServer's WFS. If anyone wants to work on this and wants to bounce ideas off me, please email me.

bencaradocdavies commented 5 years ago

@BobSimons I suspect that you are right: as far as I know, GeoServer supports gridded NetCDF data sources ("coverages") but not NetCDF vector data sources (equivalent to ERDDAP tabular?). The GeoServer devel list would be a great place to start. I am no longer active in the GeoServer project but there are several active developers supporting NetCDF data sources. The NetCDF libraries are there; support would require a GeoTools DataStore implementation and a GeoServer plugin to package it and possibly add a web UI for configuration. A prototype DataStore could even be a GSoC project.

mwengren commented 5 years ago

@bencaradocdavies thanks for the update!

That would be great if an ERDDAP GeoTools DataStore implementation could be built, at least for gridded datasets. I like the idea of a GSoC project as well.

@BobSimons if you wanted to suggest that, you could do so through the ESIP GSoC repository: https://github.com/ESIPFed/gsoc/issues. I don't know if they are still accepting new project ideas, but they may be!

Someone would have to provide guidance from the GeoTools standpoint too, since it would obviously be developed in GeoTools. OSGeo also participates in GSoC. Maybe someone in the GeoTools project could be looped in later. What do you think?

On the projection question, I am not sure whether gridded geographic data is a requirement. It may be that as long as the projection can be specified within GeoTools/GeoServer, any projection will work (maybe someone knows more about this than I do). GeoServer can reproject dynamically, though at a performance cost.

bencaradocdavies commented 5 years ago

@mwengren the geotools-devel mailing list should be the first stop for anyone contemplating a new DataStore implementation because you can find other interested developers and mentors. The GeoTools Developer Guide has more. It is straightforward to get permission to create a new unsupported module and the commit access to maintain it. Contributors need to submit signed code contribution agreements; this can be the slowest step for employees of bureaucracies!

bbest commented 5 years ago

Thank you all for weighing in on this issue around connecting GeoServer and ERDDAP. Perhaps I don't understand the suggestions for more direct integration between GeoServer and ERDDAP, but having a separate data catalog that describes the various representations and web services for a given dataset seems the way to go to me.

The logical choice for data catalog software is CKAN (https://ckan.org/), which is in keeping with, and replicable from, data.ioos.us (see github.com/ioos/catalog) and data.gov (see github.com/GSA/data.gov).

This way the underlying dataset storage could be much more flexible than any one suite of software and be transformed for optimization of services. For instance, netcdf could be served through ERDDAP, but also transformed into a stack of TIFFs, which may be more efficient for serving through GeoServer (as suggested by the GeoServer 2.16.x User Manual tutorial "Using the ImageMosaic plugin for raster time-series data", https://docs.geoserver.org/latest/en/user/tutorials/imagemosaic_timeseries/imagemosaic_timeseries.html).

Decoupling the data catalog from the underlying data representations and web services enables many more possibilities, while ensuring queryable access through the CKAN API (https://docs.ckan.org/en/2.8/api/) with Python (https://github.com/ckan/ckanapi) and R (https://github.com/ropensci/ckanr) clients.
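As a sketch of what "queryable access through the CKAN API" could look like without any client library, the Python below builds a `package_search` request for CKAN's Action API and groups each dataset's resource URLs by format. The site URL comes from this thread; the helper names and the sample response are hypothetical, and the response is trimmed to the fields used.

```python
from urllib.parse import urlencode


def package_search_url(site, query, rows=10):
    """Build a CKAN Action API package_search URL."""
    return f"{site.rstrip('/')}/api/3/action/package_search?{urlencode({'q': query, 'rows': rows})}"


def resources_by_format(response):
    """Map dataset name -> {format: [urls]} from a package_search response."""
    grouped = {}
    for dataset in response["result"]["results"]:
        formats = grouped.setdefault(dataset["name"], {})
        for resource in dataset.get("resources", []):
            formats.setdefault(resource.get("format", ""), []).append(resource["url"])
    return grouped


print(package_search_url("https://data.ioos.us", "sea surface temperature", rows=5))

# Hypothetical response: one dataset exposed through two different services.
sample = {"success": True, "result": {"count": 1, "results": [
    {"name": "example-sst", "resources": [
        {"format": "ERDDAP", "url": "https://example.org/erddap/griddap/example_sst"},
        {"format": "WMS", "url": "https://example.org/geoserver/wms"},
    ]},
]}}
print(resources_by_format(sample))
```

The grouping step is where a catalog would let a client discover that the same dataset is available from both an ERDDAP endpoint and a GeoServer endpoint.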

I'm curious what you all think about this approach which seems to already be in place?

Overview

netcdf → ERDDAP → various formats (csv, png, ...)   ↘
  ↓                                                   → CKAN
tiff → GeoServer → OGC services (WMS, WFS, WCS,...) ↗

datasets | data.ioos.us

Note how ERDDAP and other web services using GeoServer (ie WMS, WCS) are both visible and queryable for dataset search and browse:

image

dataset detail | data.ioos.us

Note how all the links are available through the catalog to a variety of representations served by different endpoints, ie ERDDAP, GeoServer or other...

image ... image

BobSimons commented 5 years ago

The system you propose may make sense for various reasons, but not for the immediate need here and not for the other uses I had hoped for, which is to allow ERDDAP to use the WMS/mapping/reprojection features of GeoServer to make Web Mercator maps (and do lots of other things). For these needs and for ease of setup, I think it would be great if an administrator could install ERDDAP and GeoServer, and then have all of the datasets in ERDDAP automatically also show up as datasets in GeoServer. ERDDAP has aggregated datasets (not just a bunch of .nc files), with metadata about each dataset (enhanced over what is in the .nc files), and it would be nice if that information could be used to automate the connection to GeoServer. Then, ERDDAP could know that there are corresponding datasets in GeoServer and thus use/rely on those services. With your architecture, those things are not true, and it would be more work for the administrator: setting up datasets separately in ERDDAP, GeoServer, and CKAN, and working to keep the metadata the same in all three.

But that's just my view for my/ERDDAP's purposes. Anyone can set up any architecture they want.

bbest commented 5 years ago

Ah thank you @BobSimons for this extra insight. I will be developing a data management portal for IEA ecosystem status indicators, so will become more familiarized with ERDDAP configuration. In the process I can chip away at sync'ing datasets between ERDDAP and GeoServer initially with some simple scripts and keep these considerations in mind for trying to generalize more broadly for a tighter integration. Everybody's input here has been much appreciated 🙏
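One possible first step for such sync scripts is to ask ERDDAP for its own list of gridded datasets. The Python sketch below queries the special `allDatasets` tabledap dataset and parses the CSV, skipping the units row that ERDDAP places under the header. The column names used (`datasetID`, `title`, `griddap`, `dataStructure`) are believed to match `allDatasets` but should be checked against the target server; the sample CSV is hypothetical, and registering the results in GeoServer is not shown.

```python
import csv
import io
from urllib.parse import quote


def all_grid_datasets_url(erddap_base):
    """Build a tabledap URL listing every gridded dataset on an ERDDAP server."""
    query = 'datasetID,title,griddap&dataStructure="grid"'
    return f"{erddap_base.rstrip('/')}/tabledap/allDatasets.csv?{quote(query, safe=',&=')}"


def parse_erddap_csv(text):
    """Parse ERDDAP .csv output: header row, then a units row, then data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    next(reader, None)  # skip the units row
    return [dict(zip(header, row)) for row in reader if row]


print(all_grid_datasets_url("https://coastwatch.pfeg.noaa.gov/erddap"))

# Hypothetical sample of what the server might return.
sample_csv = (
    "datasetID,title,griddap\n"
    ",,\n"
    'jplMURSST41mday,"Example title, monthly",'
    "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41mday\n"
)
for row in parse_erddap_csv(sample_csv):
    print(row["datasetID"], "->", row["griddap"])
```

Each `griddap` URL is an OPeNDAP endpoint, so it is a candidate source for any second server that can read remote OPeNDAP data.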

7yl4r commented 5 years ago

Apologies for hijacking this thread with a tangent, but: are ERDDAP, CKAN, THREDDS, (others?) all providing a similar "middleman" functionality?

I have minimal exposure to CKAN and THREDDS and want to better understand the niche of each of these relative to ERDDAP.

mwengren commented 5 years ago

@BobSimons @bbest I think both pursuits would be worthwhile. I think @BobSimons is considering general purpose ERDDAP use, and I totally agree about the value of a GeoServer 'extension' point for ERDDAP (if that's an appropriate phrase). I would love to see that happen. @bbest it sounds like your goal is to develop a purpose-built application on both of those tools, as well as CKAN for the data catalog - also a good idea, in my book! Maybe they should be separated into distinct issues though to avoid confusion.

@bbest on the architecture you showed, I would add that there is some - possibly manual - metadata configuration that happens in the middle in order for the CKAN representation to look the way it does. PacIOOS does a phenomenal job maintaining their metadata WAFs, but I don't know how much of it is a fully automated process. Either way, there would be some additional software to stitch together the TIFF-based services emanating from GeoServer and those from ERDDAP. Nonetheless, no reason it wouldn't work!

BobSimons commented 5 years ago

I view them as significantly different. Each has its own niche that makes each better for certain purposes.

THREDDS is a DAP+WMS+WCS data server (just for gridded data) that just provides 1 catalog view: a hierarchy of terms, eventually terminating in either data files or aggregated datasets at the leaves. THREDDS has its own metadata system which is separate from the dataset's metadata. So if you specify, e.g., a new summary, it appears in the THREDDS catalog for the dataset, not in the dataset's .das metadata (although there may be a way to do that separately). The THREDDS catalog has no system for full-text searches, which is why, e.g., NOAA groups and others use/used CKAN, Lucene/Solr, or ESRI GeoPortal for catalog services. For brokering, THREDDS can only point to the URLs of datasets in other THREDDS (and simplistically to other URLs), but since there are no catalog services that is no help for searching.

CKAN is a more general purpose catalog system, not a data server (in any significant way). As such, it is useful in that it can point to all kinds of data services in all kinds of remote data servers. But since the data sources are diverse and because CKAN makes no effort to broker those data services, there is no consistent way to access data from the different datasets.

ERDDAP is a data server (for gridded datasets, slightly more limited than THREDDS, and for tabular datasets) with catalog services (full text search, faceted search, view all, and advanced search) for its datasets. ERDDAP emphasizes brokering: ERDDAP seeks to be able to connect to data from lots of remote data services (e.g., THREDDS, GRADS, DAPPER, relational databases, Cassandras, AWS S3 buckets, WFS servers, other ERDDAPs, ...) and present those datasets in exactly the same way that it presents local datasets (e.g., .nc files, .csv files, and lots of specialized file types).
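To make "exactly the same way" concrete for gridded datasets, here is a small Python sketch that builds a griddap subset URL. The same pattern applies whatever the underlying source is. The helper name is made up, and the example assumes a variable with dimensions ordered time, latitude, longitude with ascending latitude, which should be confirmed on the dataset's page before use.

```python
from urllib.parse import quote


def griddap_url(erddap_base, dataset_id, variable, time, lat_range, lon_range,
                file_type="csv"):
    """Build a griddap request for one time step over a lat/lon box."""
    constraint = (
        f"{variable}[({time})]"
        f"[({lat_range[0]}):({lat_range[1]})]"
        f"[({lon_range[0]}):({lon_range[1]})]"
    )
    # Square brackets are percent-encoded; parentheses, colons and commas are kept.
    encoded = quote(constraint, safe="():,")
    return f"{erddap_base.rstrip('/')}/griddap/{dataset_id}.{file_type}?{encoded}"


print(griddap_url(
    "https://coastwatch.pfeg.noaa.gov/erddap",
    "jplMURSST41mday", "sst",
    time="2018-07-16T00:00:00Z",
    lat_range=(30, 40),
    lon_range=(-130, -120),
))
```

Changing `file_type` (for example to `nc` or `png`) requests the same subset in a different output format.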

I hope that is reasonably fair. Some of that is debatable. I hope that helps.

Let me separately say: one of the goals of ERDDAP is to make aggregated datasets (not treat a bunch of files as just a bunch of files). Another goal of ERDDAP is to encourage better metadata for each of those datasets. (I don't think THREDDS or CKAN view it the same way.) Part of the problem is that the community's idea of "good metadata" changes over time (e.g., new versions of CF, ACDD, and ISO 19115), so ERDDAP helps by making it easy to modify each dataset's metadata via the ERDDAP datasets.xml setup for each dataset, without editing/changing each data file (which, e.g., NCEI doesn't want to do because they want to preserve the original data submission) or trying to get a remote data source to change the metadata. Then, if each dataset is aggregated and has extensive, good metadata:

1) Search in ERDDAP works better because there is more and better metadata available to search.
2) You can automatically create ISO 19115 metadata from the dataset's metadata (as ERDDAP does). That's an important point, because then you don't have to separately maintain the dataset's local metadata and the dataset's ISO 19115 metadata.
3) You can automatically populate other catalog systems directly or via the ISO 19115 documents.
4) You should be able to make a system to automatically connect to other data servers, e.g., GeoServer.

7yl4r commented 5 years ago

Many thanks for the detailed response. Following this discussion has been a great help for me to better understand the relationships between these software packages.

MarcoAlbaETT commented 4 years ago

Hi! If you want to use EPSG:3857 and other projections, I've successfully tested ncWMS (https://reading-escience-centre.github.io/ncwms/) as a WMS server to serve gridded data from an ERDDAP server. As an example, using the dataset "jplAmsreSstMon_LonPM180" with the URL "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplAmsreSstMon_LonPM180" to configure the ncWMS dataset, the results are (just some samples): for EPSG:4326 (from the ncWMS GODIVA client): image

and for the North Polar Stereographic - EPSG:5041 (from the ncWMS GODIVA client): image

and for Antarctic Polar Stereographic - EPSG:3031 (from a web page using LeafletJS): image

Hope this can be useful.

Ciao

Marco
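For anyone trying a setup like the one described in this comment, the Python sketch below shows the bounding box a slippy-map client computes for a single EPSG:3857 tile and the GetMap request it would send to a WMS server that supports that projection. The endpoint `https://example.org/ncWMS2/wms` and the layer name are placeholders, not a real deployment; only the tile arithmetic and the WMS 1.3.0 parameter names are standard.

```python
import math
from urllib.parse import urlencode

HALF_WORLD_M = math.pi * 6378137.0  # half the EPSG:3857 world width in meters


def tile_bbox_3857(z, x, y):
    """Return (min_x, min_y, max_x, max_y) in meters for slippy-map tile z/x/y."""
    size = 2 * HALF_WORLD_M / (2 ** z)
    min_x = -HALF_WORLD_M + x * size
    max_y = HALF_WORLD_M - y * size  # tile rows count down from the top
    return (min_x, max_y - size, min_x + size, max_y)


def tile_getmap_url(wms_endpoint, layer, z, x, y):
    """Build a WMS 1.3.0 GetMap URL for one 256x256 Web Mercator tile."""
    params = {
        "service": "WMS", "version": "1.3.0", "request": "GetMap",
        "layers": layer, "styles": "", "crs": "EPSG:3857",
        "bbox": ",".join(str(v) for v in tile_bbox_3857(z, x, y)),
        "width": 256, "height": 256,
        "format": "image/png", "transparent": "true",
    }
    return f"{wms_endpoint}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
print(tile_getmap_url("https://example.org/ncWMS2/wms",
                      "jplAmsreSstMon_LonPM180/sst", z=1, x=1, y=0))
```

For EPSG:3857 the BBOX is in x,y order (easting first), unlike the lat,lon order that WMS 1.3.0 uses for EPSG:4326.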

bbest commented 4 years ago

Awesome, thanks so much for sharing @MarcoAlbaETT! Look forward to giving ncWMS a spin