emiliom / ohw2018_tutorials

Tutorials for Oceanhackweek 2018

erddapy questions for Glider DAC notebook #3

Open emiliom opened 6 years ago

emiliom commented 6 years ago

@ocefpaf, some questions for you about erddapy and its use in the Glider DAC notebook:

ocefpaf commented 6 years ago

It looks like there are two types of spatial and temporal constraints. Can you confirm and elaborate?

erddapy does not actually implement any constraints; it only builds and validates the URL. All of the constraint handling is ERDDAP's logic. The search is standardized, but the rest are based on the variables available in the dataset.
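To make the two query styles concrete, here is a minimal sketch that builds both kinds of URL by hand with the standard library. It is not erddapy's implementation. The server address, dataset ID, and the helper names `search_url` and `tabledap_url` are hypothetical, only a subset of the advanced-search parameters is shown, and percent-encoding of the tabledap constraints is left out for readability.

```python
from urllib.parse import urlencode

# Hypothetical server used only for illustration.
SERVER = "https://example.org/erddap"


def search_url(server, min_lat, max_lat, min_lon, max_lon):
    """Catalog-style search: fixed parameter names, independent of any dataset."""
    params = {
        "page": 1,
        "itemsPerPage": 1000,
        "minLat": min_lat,
        "maxLat": max_lat,
        "minLon": min_lon,
        "maxLon": max_lon,
    }
    return f"{server}/search/advanced.csv?{urlencode(params)}"


def tabledap_url(server, dataset_id, variables, constraints, response="csv"):
    """Data request: constraints are written against the dataset's own variables."""
    query = ",".join(variables)
    # Each key carries the variable name plus the operator, e.g. "latitude>=".
    for key, value in constraints.items():
        query += f"&{key}{value}"
    return f"{server}/tabledap/{dataset_id}.{response}?{query}"


print(search_url(SERVER, 38.0, 41.0, -72.0, -69.0))
print(
    tabledap_url(
        SERVER,
        "glider_1",  # hypothetical dataset ID
        ["time", "latitude"],
        {"time>=": "2016-07-10T00:00:00Z", "latitude>=": 38.0},
    )
)
```

The first URL works the same way for every dataset on the server, while the second only works if the dataset really has variables named `time` and `latitude`.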

distinct() parameter: Does erddapy have a "native" way of specifying this request parameter, other than simply adding "&distinct()" to the request URL?

I'm not sure what that is, but if it is useful we can add a keyword for it.
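For context, `&distinct()` is an ERDDAP tabledap option that asks the server to return only the unique rows of the result. Until there is a keyword for it, the manual workaround mentioned in the question can be sketched as below. The helper name `add_distinct` and the example URL are hypothetical, and the helper assumes the URL already contains a tabledap query.

```python
def add_distinct(url):
    """Append ERDDAP's distinct() filter to a tabledap URL, at most once."""
    suffix = "&distinct()"
    if url.endswith(suffix):
        return url
    return url + suffix


# Hypothetical request URL, e.g. one produced by a URL builder.
url = "https://example.org/erddap/tabledap/glider_1.csv?profile_id,time"
print(add_distinct(url))
```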

In the request in cells 18-19, response = 'mat' is specified. But then why is skiprows=(1,) used in the e.to_pandas() statement in cell 19?

The .mat is part of the original demo to show that all the responses are available; it has nothing to do with the to_pandas method. to_pandas always requests a .csv response to build the DataFrame. BTW, I'm changing that to .csvp, a trick I just learned about ERDDAP where the variable names and units are in the same row.
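The reason for `skiprows=(1,)` is easier to see with a small sample. ERDDAP's .csv response puts the units on a second header row, while .csvp folds them into the first row as `name (units)`. The sketch below uses made-up sample data and the standard `csv` module rather than pandas, and the helper `read_rows` is hypothetical. Its `skiprows` argument only mimics what `skiprows=(1,)` does in `pandas.read_csv`.

```python
import csv

# Made-up sample of a .csv response: names on row 0, units on row 1.
CSV_TEXT = (
    "time,latitude,temperature\n"
    "UTC,degrees_north,Celsius\n"
    "2016-07-10T00:00:00Z,38.1,12.5\n"
    "2016-07-10T00:10:00Z,38.2,12.4\n"
)

# The same made-up data as a .csvp response: names and units share row 0.
CSVP_TEXT = (
    "time (UTC),latitude (degrees_north),temperature (Celsius)\n"
    "2016-07-10T00:00:00Z,38.1,12.5\n"
    "2016-07-10T00:10:00Z,38.2,12.4\n"
)


def read_rows(text, skiprows=()):
    """Parse CSV text into dicts, dropping the 0-indexed lines in skiprows."""
    lines = [line for i, line in enumerate(text.splitlines()) if i not in skiprows]
    return list(csv.DictReader(lines))


# .csv needs the units row skipped, otherwise it is read as a data record.
rows_csv = read_rows(CSV_TEXT, skiprows=(1,))

# .csvp needs no skipping; the units end up in the column names instead.
rows_csvp = read_rows(CSVP_TEXT)

print(list(rows_csv[0].keys()))
print(list(rows_csvp[0].keys()))
```

With .csvp the units are not lost, at the cost of column names such as `latitude (degrees_north)`.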

emiliom commented 6 years ago

erddapy does not actually implement any constraints; it only builds and validates the URL. All of the constraint handling is ERDDAP's logic. The search is standardized, but the rest are based on the variables available in the dataset.

Right! It's unfortunate that ERDDAP uses two types of query specifications depending on context. But I suppose the catalog-style type allows query parameters to be independent of the variable names for lat, lon and time.

Regarding these two points:

It returns dataset records, and these records don't appear to include the spatial and temporal bounding boxes for each dataset; or is there a way to alter the query in cell 7 to get back these bounding boxes?

It doesn't look like ERDDAP returns the lat-lon bounding box per dataset. At most, it returns fields like minLatitude and maxLatitude, but in an example I tested, they were all empty.

Is this catalog-style search analogous to a CSW "within" or "contains" search?

I would guess it's a "contains", if we interpret min_lat, max_lat, etc. as being combined with a logical AND. But I think I'm seeing behavior that actually produces a "within" result.
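To pin down the terms, here is a small sketch of the bounding-box predicates as I'm using them. It only illustrates the definitions; it says nothing about what ERDDAP actually implements. The `BBox` type, the function names, and the coordinates are all made up for the example.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


def within(dataset, query):
    """True if the dataset's box lies entirely inside the query box."""
    return (
        dataset.min_lon >= query.min_lon
        and dataset.max_lon <= query.max_lon
        and dataset.min_lat >= query.min_lat
        and dataset.max_lat <= query.max_lat
    )


def contains(dataset, query):
    """True if the dataset's box fully encloses the query box."""
    return within(query, dataset)


def intersects(dataset, query):
    """True if the two boxes overlap at all."""
    return not (
        dataset.max_lon < query.min_lon
        or dataset.min_lon > query.max_lon
        or dataset.max_lat < query.min_lat
        or dataset.min_lat > query.max_lat
    )


query = BBox(-72.0, 38.0, -69.0, 41.0)
small_dataset = BBox(-71.0, 39.0, -70.0, 40.0)   # sits inside the query box
large_dataset = BBox(-80.0, 30.0, -60.0, 45.0)   # encloses the query box

print(within(small_dataset, query), contains(small_dataset, query))
print(within(large_dataset, query), contains(large_dataset, query))
```

A "within" search would return only the small dataset, a "contains" search only the large one, and an "intersects" search both.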

The .mat is part of the original demo to show that all the responses are available; it has nothing to do with the to_pandas method. to_pandas always requests a .csv response to build the DataFrame. BTW, I'm changing that to .csvp, a trick I just learned about ERDDAP where the variable names and units are in the same row.

Sorry, my bad. I understand now.

Thanks.