erddapy questions for Glider DAC notebook

emiliom commented 6 years ago

@ocefpaf, some questions for you about erddapy and its use in the Glider DAC notebook:

It looks like there are two types of spatial and temporal constraints. Can you confirm and elaborate?
1. Searching across datasets. See cells 6-7. It uses parameters min_lat, max_lat, min_time, max_time, etc. It returns dataset records, and these records don't appear to include the spatial and temporal bounding boxes for each dataset; or is there a way to alter the query in cell 7 to get back these bounding boxes?
  - Is this catalog-style search analogous to a CSW "within" or "contains" search?
2. Searching within a specified dataset. See cell 23, where temporal criteria are applied using the parameter syntax time<= and time>=.
distinct() parameter: Does erddapy have a "native" way of specifying this request parameter, other than simply adding "&distinct()" to the request url?
In the request in cells 18-19, response = 'mat' is specified. But then why is skiprows=(1,) used in the e.to_pandas() statement in cell 19? Isn't skiprows only used for csv reads? I don't see how it fits with reading a mat file. Yet, when I remove it, I see that the e.to_pandas() statement produces a bad DataFrame, where the first data row is the units attributes. I don't understand how this can happen with a mat file, unless ERDDAP creates a bad mat file with units attributes mixed in with the data values.

ocefpaf commented 6 years ago

> It looks like there are two types of spatial and temporal constraints. Can you confirm and elaborate?

erddapy does not actually implement any constraints; it only builds and validates the URL. All of the constraint logic is ERDDAP's. The search parameters are standardized, but the others depend on the variables available in the dataset.
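
To make the two query styles concrete, here is a minimal sketch that assembles both kinds of URL by hand with the standard library. It is not erddapy's code: the server address, dataset ID, and helper names are hypothetical, and the parameter names (minLat, maxLat, minTime, and so on) are the ones ERDDAP's advanced search is assumed to accept.

```python
from urllib.parse import urlencode

SERVER = "https://example.org/erddap"  # hypothetical server, for illustration only


def search_url(server, response="csv", **bounds):
    """Catalog-style search across datasets: fixed, standardized parameter names."""
    params = {"page": 1, "itemsPerPage": 1000}
    params.update(bounds)  # e.g. minLat, maxLat, minTime, maxTime
    return f"{server}/search/advanced.{response}?{urlencode(params)}"


def tabledap_url(server, dataset_id, variables, constraints, response="csv"):
    """Query within one dataset: constraints are written against its own variables."""
    query = ",".join(variables)
    for key, value in constraints.items():
        # Operators are left unencoded for readability; a strict client
        # may need to percent-encode "<" and ">".
        query += f"&{key}{value}"
    return f"{server}/tabledap/{dataset_id}.{response}?{query}"


across = search_url(SERVER, minLat=38.0, maxLat=41.0, minTime="2016-07-10T00:00:00Z")
within_one = tabledap_url(
    SERVER,
    "some_glider_dataset",  # hypothetical dataset ID
    ["time", "latitude", "temperature"],
    {"time>=": "2016-07-10T00:00:00Z", "time<=": "2017-02-10T00:00:00Z"},
)
print(across)
print(within_one)
```

The first URL never mentions a variable name, which is why the search parameters can be the same for every dataset; the second only works if the dataset actually has a variable called time.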

> distinct() parameter: Does erddapy have a "native" way of specifying this request parameter, other than simply adding "&distinct()" to the request url?

I'm not sure what that is, but if it is useful we can add a keyword for it.
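
For context, distinct() is a server-side tabledap filter that asks ERDDAP to return only the unique rows of the requested variables. A minimal sketch of the manual workaround mentioned in the question, appending it to an already-built download URL, could look like this. The helper name and the example URL are hypothetical.

```python
def with_distinct(url):
    """Append ERDDAP's distinct() filter to a tabledap URL, at most once."""
    if url.endswith("&distinct()"):
        return url
    return url + "&distinct()"


# Hypothetical tabledap request for a single variable.
base = "https://example.org/erddap/tabledap/some_glider_dataset.csv?profile_id"
print(with_distinct(base))
```

With a single requested variable such as profile_id, the response would then list each value only once instead of once per observation.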

> In the request in cells 18-19, response = 'mat' is specified. But then why is skiprows=(1,) used in the e.to_pandas() statement in cell 19?

The .mat response is part of the original demo, there to show that all the response types are available; it has nothing to do with the to_pandas method. to_pandas always requests a .csv response to build the DataFrame. BTW, I'm changing that to .csvp, a trick I just learned about ERDDAP where the variable names and units are in the same row.
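
The difference between the two response layouts can be sketched with the standard library alone. The sample text below is made up for illustration (it is not real ERDDAP output), and read_rows is a hypothetical helper whose skiprows argument only mimics what pandas does with skiprows=(1,).

```python
import csv

# Made-up samples of the two layouts (not real server output).
# .csv: variable names on line 0, units on line 1, data from line 2.
CSV_TEXT = "time,temperature\nUTC,degree_C\n2016-07-10T00:00:00Z,12.5\n"
# .csvp: names and units share the single header line.
CSVP_TEXT = "time (UTC),temperature (degree_C)\n2016-07-10T00:00:00Z,12.5\n"


def read_rows(text, skiprows=()):
    """Parse CSV text into dicts, dropping the given 0-based line numbers first."""
    lines = [line for i, line in enumerate(text.splitlines()) if i not in skiprows]
    return list(csv.DictReader(lines))


bad = read_rows(CSV_TEXT)                    # units line is parsed as a data row
good = read_rows(CSV_TEXT, skiprows=(1,))    # units line dropped
csvp = read_rows(CSVP_TEXT)                  # nothing to skip
print(bad[0])
print(good[0])
print(csvp[0])
```

Without the skip, the first "data" row holds the units strings, which matches the bad DataFrame described in the question. With .csvp the units move into the column names, so no row needs to be skipped.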

emiliom commented 6 years ago

> erddapy does not actually implement any constraints; it only builds and validates the URL. All of the constraint logic is ERDDAP's. The search parameters are standardized, but the others depend on the variables available in the dataset.

Right! It's unfortunate that ERDDAP uses two types of query specifications depending on context. But I suppose the catalog-style type allows query parameters to be independent of the variable names for lat, lon and time.

Regarding these two points:

> It returns dataset records, and these records don't appear to include the spatial and temporal bounding boxes for each dataset; or is there a way to alter the query in cell 7 to get back these bounding boxes?

It doesn't look like ERDDAP returns the lat-lon bounding box per dataset. At most, it returns fields like minLatitude and maxLatitude, but in an example I tested, they were all empty.

> Is this catalog-style search analogous to a CSW "within" or "contains" search?

I would guess it's a "contains", if we interpret min_lat, max_lat, etc. as being combined with a logical AND. But I think I'm seeing behavior that actually produces a "within" result.
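
To pin down the terms, here is a small sketch of the bounding-box predicates under one common reading of the CSW operators: a dataset is "within" the query box if it lies entirely inside it, and "contains" the query box if it entirely encloses it. The BBox type, the function names, and the coordinates are hypothetical, and the sketch says nothing about which predicate ERDDAP actually applies.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


def contains(outer, inner):
    """True if `outer` entirely encloses `inner`."""
    return (
        outer.min_lon <= inner.min_lon
        and outer.min_lat <= inner.min_lat
        and outer.max_lon >= inner.max_lon
        and outer.max_lat >= inner.max_lat
    )


def within(a, b):
    """True if `a` lies entirely inside `b`."""
    return contains(b, a)


def intersects(a, b):
    """True if the two boxes overlap at all."""
    return (
        a.min_lon <= b.max_lon
        and b.min_lon <= a.max_lon
        and a.min_lat <= b.max_lat
        and b.min_lat <= a.max_lat
    )


query = BBox(-80.0, 35.0, -65.0, 45.0)    # search box (made-up values)
dataset = BBox(-75.0, 38.0, -70.0, 41.0)  # dataset extent (made-up values)
print(within(dataset, query), contains(dataset, query), intersects(dataset, query))
```

Running a search with a box larger than a known dataset and then with one smaller than it would be one way to tell these cases apart empirically.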

> The .mat response is part of the original demo, there to show that all the response types are available; it has nothing to do with the to_pandas method. to_pandas always requests a .csv response to build the DataFrame. BTW, I'm changing that to .csvp, a trick I just learned about ERDDAP where the variable names and units are in the same row.

Sorry, my bad. I understand now.

Thanks.
