ioos / ckanext-ioos-theme

IOOS Catalog as a CKAN extension
GNU Affero General Public License v3.0

Dead 'OPeNDAP' links on some records #89

Closed mwengren closed 7 years ago

mwengren commented 7 years ago

I've noticed several records, in particular from MARACOOS, that have links marked 'OPeNDAP' format but lead to 404 errors.

Can we see why the source metadata for records such as this:

https://dev-catalog.ioos.us/dataset/3-day-aggregation

are invalid?

lukecampbell commented 7 years ago

That dataset has a valid OPeNDAP link, do you have one that isn't?

mwengren commented 7 years ago

Click on the link that's labeled 'OPeNDAP'/THREDDS OPeNDAP and then 'Go to Resource'. It takes you here: http://tds.maracoos.org/thredds/dodsC/MODIS-Three-Agg.nc and gives an 'unrecognized resource' error.

lukecampbell commented 7 years ago

That's correct.

Error {
    code = 400;
    message = "Unrecognized request";
};

is the right response when a browser goes to a URL meant for DAP clients.

lukecampbell commented 7 years ago

If you plug that URL into a proper DAP client, it works.

rsignell-usgs commented 7 years ago

And of course you can slap a ".html" on the end of any OPeNDAP Data URL and look at it in your browser:

http://tds.maracoos.org/thredds/dodsC/MODIS-Three-Agg.nc

http://tds.maracoos.org/thredds/dodsC/MODIS-Three-Agg.nc.html
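
A minimal sketch of that suffix convention in Python, assuming the data URL is a plain THREDDS `dodsC` URL with no query string. The helper name `dap_variants` is made up for illustration; it only edits the URL and never contacts the server.

```python
from urllib.parse import urlsplit, urlunsplit

# Common DAP2 response suffixes plus the browser-friendly HTML form.
DAP_SUFFIXES = (".das", ".dds", ".html")


def dap_variants(data_url):
    """Map each suffix to the data URL with that suffix appended to its path.

    Hypothetical helper: no network access, just string handling.
    """
    parts = urlsplit(data_url)
    return {
        suffix: urlunsplit(parts._replace(path=parts.path + suffix))
        for suffix in DAP_SUFFIXES
    }


if __name__ == "__main__":
    base = "http://tds.maracoos.org/thredds/dodsC/MODIS-Three-Agg.nc"
    for suffix, url in dap_variants(base).items():
        print(suffix, url)
```

The `.html` entry is the form page a browser can display; `.das` and `.dds` are what a DAP client would typically request first.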

kknee commented 7 years ago

@lukecampbell and @rsignell-usgs are quite correct that the link works directly in a DAP client, and in a web browser if you add the .html. But not all users are going to understand that (this was an issue in previous iterations of the catalog, as I'm sure @dpsnowden remembers). Would it be possible to append the .html when a user clicks the link?

mwengren commented 7 years ago

Ok, I see. We may want to come up with an alternative from a UX perspective. This will of course be familiar to all heavyweight DAP users, but to everyone else (like myself) it may look like an error.

Ideally, we could create an 'Open with' context item that would open the URL rather than allowing users to view it in a browser (weather/climate toolkit?), or if they did open the link, append the .html automatically.

lukecampbell commented 7 years ago

I would recommend against automatically appending .html. There's not a consistent programmatic way to ensure that the URL is pointing to a DAP endpoint. And depending on the approach we take it could break the interface to legitimate clients like CSW.

Also, this only applies to THREDDS, it does not apply to ERDDAP or Hyrax.
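
To illustrate why this is fragile, here is a sketch of the kind of path-based guess a catalog would have to make. The `/thredds/dodsC/` marker is an assumption about conventional THREDDS deployments, not something the catalog metadata guarantees, and both function names are hypothetical.

```python
from urllib.parse import urlsplit


def looks_like_thredds_dap(url):
    """Guess from the path alone whether a URL is a THREDDS OPeNDAP data URL.

    Assumption: THREDDS is mounted at the conventional /thredds/dodsC/ path.
    A server mounted elsewhere, or sitting behind a proxy, defeats this check.
    """
    return "/thredds/dodsC/" in urlsplit(url).path


def browser_url(url):
    """Append .html only when the guess says THREDDS; otherwise return the URL unchanged."""
    if looks_like_thredds_dap(url) and not url.endswith(".html"):
        return url + ".html"
    return url
```

With this guess, the MARACOOS URL from this thread would get the `.html` suffix, while an ERDDAP URL such as `http://coastwatch.pfeg.noaa.gov/erddap/tabledap/nosCoopsMRF` would be left alone. Any THREDDS server that does not use the conventional path would be missed, which is the consistency problem described above.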

mwengren commented 7 years ago

Isn't there a way for the Weather & Climate Toolkit to be downloaded and run via Java web start with a DAP URL to load a remote dataset? I think we have some of those links in some of our metadata (PacIOOS I think).

We could copy something like that and provide an 'Open With' button similar to: https://catalog.data.gov/dataset/noaa-national-hurricane-center-tropical-cyclone-forecasts-wms-wfs, with a link to the WCT combined with the DAP URL. It wouldn't modify the base DAP URL in CKAN, but would provide an alternative to just browsing directly to the URL as it is now.

lukecampbell commented 7 years ago

That's something we could look at supporting.

lukecampbell commented 7 years ago

I would like to comment that, from a security standpoint, it's generally inadvisable to support embedded Java applications or Java applets that launch in the browser, especially applets that download JARs from a cross-domain site.

But, we can support it.

lukecampbell commented 7 years ago

Actually, I take that back, looking at the package, it doesn't look like it's built to run through the web like that. It's a standalone desktop application.

lukecampbell commented 7 years ago

I take that back, I found the jnlp... I'm completely unsure how to make it work though.

lukecampbell commented 7 years ago

Yeah, this still won't work. Browsers will only let you download JNLP packages; they won't let you automatically download and LAUNCH binaries, which would be security suicide. Even if the user has the Java application, there's no way to tell the browser "hey, use this application to open this link". That would also be security suicide, because I could write a virus and tell your computer to open it with /bin/bash or something.

There is a concept of content-type and content-disposition, which helps the operating system and browser decide what to do with the content. This is how your browser knows not to open the PDF file but to download it.

But, these apply to the HTTP server, which CKAN has no interface with whatsoever. There's no way, that I can see, to support this.
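
For reference, a minimal sketch of where those headers are set, using Python's standard `http.server`. This is a toy server standing in for the data provider's HTTP server, not anything in CKAN; the file name `example.nc`, the payload, and the port are all made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def download_headers(filename, content_type="application/octet-stream"):
    """Headers asking the browser to save the body as a file instead of rendering it."""
    return [
        ("Content-Type", content_type),
        ("Content-Disposition", 'attachment; filename="{}"'.format(filename)),
    ]


class DownloadHandler(BaseHTTPRequestHandler):
    """Toy handler: every GET is answered with a small file download."""

    def do_GET(self):
        body = b"example payload\n"
        self.send_response(200)
        for name, value in download_headers("example.nc"):
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the toy server locally; the port number is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), DownloadHandler).serve_forever()
```

Calling `serve()` and visiting `http://127.0.0.1:8000/` should prompt a download rather than display the body. The headers come from whichever server answers the request, so a catalog that only links to that server has no say in them.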

mwengren commented 7 years ago

I'm fairly sure I saw some links in some PacIOOS metadata that implied something along these lines, but I don't remember it actually working when I tried. Can't dig up any examples at the moment either. Maybe it's just not possible.

rsignell-usgs commented 7 years ago

Shall we follow what the CKAN/pycsw of data.gov does?

https://catalog.data.gov/dataset/nos-co-ops-meteorological-data-rain-fall-6-minute

lukecampbell commented 7 years ago

@rsignell-usgs that's for ERDDAP, which recognizes the DAP request from a browser and redirects to the TableDAP form. Do you have an example of an OPeNDAP source that's not from ERDDAP?

lukecampbell commented 7 years ago

For example:

curl -i http://coastwatch.pfeg.noaa.gov/erddap/tabledap/nosCoopsMRF
HTTP/1.1 302 Found
Date: Wed, 09 Nov 2016 18:14:20 GMT
Server: Apache-Coyote/1.1
Last-Modified: Wed, 09 Nov 2016 18:14:20 GMT
xdods-server: dods/3.7
erddap-server: 1.74
Location: http://coastwatch.pfeg.noaa.gov/erddap/tabledap/nosCoopsMRF.html
Content-Length: 0
Connection: close
Content-Type: text/plain; charset=UTF-8

But if you make a request in DAP:

curl -i http://coastwatch.pfeg.noaa.gov/erddap/tabledap/nosCoopsMRF.das
HTTP/1.1 200 OK
Date: Wed, 09 Nov 2016 18:14:59 GMT
Server: Apache-Coyote/1.1
Last-Modified: Wed, 09 Nov 2016 18:14:59 GMT
xdods-server: dods/3.7
erddap-server: 1.74
content-description: dods_das
Content-Encoding:
Content-Type: text/plain;charset=ISO-8859-1
Vary: Accept-Encoding
Connection: close
Transfer-Encoding: chunked

Attributes {
 s {
  stationID {
    String cf_role "timeseries_id";
    String comment "Queries for data MUST include \"stationID=\".";
    String ioos_category "Identifier";
    String long_name "Station ID";
  }
...
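
The same comparison can be sketched in Python with the standard library. The labels and the names `classify` and `probe` are hypothetical, and treating a `content-description` value starting with `dods` as the sign of a DAP response is an assumption based on the headers shown above, not a guarantee for every server.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following 3xx responses so the status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def classify(status, headers):
    """Label a response as 'html-redirect', 'dap', or 'other' (labels are made up)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if status in (301, 302, 303, 307, 308) and lowered.get("location", "").endswith(".html"):
        return "html-redirect"
    # Assumption: DAP responses carry a content-description such as dods_das.
    if status == 200 and lowered.get("content-description", "").startswith("dods"):
        return "dap"
    return "other"


def probe(url):
    """Fetch url without following redirects and classify the answer (needs network)."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=10) as resp:
            return classify(resp.status, dict(resp.headers.items()))
    except urllib.error.HTTPError as err:
        # With redirects disabled, 3xx responses surface here as HTTPError.
        return classify(err.code, dict(err.headers.items()))
```

Fed the two responses above, `classify` would label the first one `html-redirect` and the second one `dap`.
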

lukecampbell commented 7 years ago

We have that too, btw.

https://dev-catalog.ioos.us/dataset/clark-20130821t21301

If you click go to resource, it produces the same behavior.

rsignell-usgs commented 7 years ago

How about this one? https://catalog.data.gov/dataset/hybrid-coordinate-ocean-model-hycom-global

lukecampbell commented 7 years ago

Yeah, they're doing the same thing we are:

http://oos.soest.hawaii.edu/thredds/dodsC/pacioos/hycom/global

rsignell-usgs commented 7 years ago

So we are at least consistent, which seems good, right?

lukecampbell commented 7 years ago

> So we are at least consistent, which seems good, right?

Oh yeah absolutely!

I have some Python code that can probably do something like a data preview, if that's desirable.

lukecampbell commented 7 years ago

If it's ok, I'd like to remove this from the milestone since this is growing into a discussion more than a bug or feature.