Open rsignell-usgs opened 8 years ago
Yikes, this is even stranger. There are two different dimensions, both called t:
gives
Dataset {
Float64 Longitude[lon = 39];
Float64 Latitude[lat = 36];
Int32 t[t = 48];
Float64 datetime[t = 4416];
Float64 East_vel[t = 4416][lat = 36][lon = 39];
Float64 North_vel[t = 4416][lat = 36][lon = 39];
Float64 East_err[t = 4416][lat = 36][lon = 39];
Float64 North_err[t = 4416][lat = 36][lon = 39];
Float64 err_cov[t = 4416][lat = 36][lon = 39];
Float64 total_err[t = 4416][lat = 36][lon = 39];
} usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/00_dir_HFR_agg.ncml;
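A quick way to confirm that a DDS really declares one dimension name with two different sizes is to parse the listing and group sizes by name. The following is only a minimal sketch: the DDS text is a trimmed copy of the listing above, and the helper names `dimension_sizes` and `conflicting_dimensions` are made up for illustration, not part of any THREDDS or OPeNDAP tool.

```python
import re
from collections import defaultdict

# DDS text copied from the listing above (trimmed to a few variables).
DDS = """\
Dataset {
Float64 Longitude[lon = 39];
Float64 Latitude[lat = 36];
Int32 t[t = 48];
Float64 datetime[t = 4416];
Float64 East_vel[t = 4416][lat = 36][lon = 39];
}
"""

# Matches one "[name = size]" dimension declaration.
DIM_RE = re.compile(r"\[(\w+) = (\d+)\]")


def dimension_sizes(dds_text):
    """Map each dimension name to the set of sizes it appears with."""
    sizes = defaultdict(set)
    for name, size in DIM_RE.findall(dds_text):
        sizes[name].add(int(size))
    return dict(sizes)


def conflicting_dimensions(dds_text):
    """Return only the dimensions that are declared with more than one size."""
    return {
        name: sorted(found)
        for name, found in dimension_sizes(dds_text).items()
        if len(found) > 1
    }


print(conflicting_dimensions(DDS))  # {'t': [48, 4416]}
```

Running the same check on the DDS of the renamed aggregation should report no conflicts, since there `t` only appears with size 48 and `datetime` only with size 4416.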
If I try renaming the dimension
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" >
<dimension name="datetime" orgName="t"/>
<aggregation dimName="t" type="joinExisting">
<scan location="." regExp="^WHOI_ISLE_HFR_[0-9]{4}_[0-9]{2}_[0-9]{2}_800mgrid_1000mrad_20-Feb-2016\.nc$"/>
</aggregation>
</netcdf>
then the DDS looks better, but I still have that strange t variable with its own t dimension:
http://geoport-dev.whoi.edu/thredds/dodsC/usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/00_dir_HFR_agg2.ncml.dds
Dataset {
Float64 Longitude[lon = 39];
Float64 Latitude[lat = 36];
Int32 t[t = 48];
Float64 datetime[datetime = 4416];
Float64 East_vel[datetime = 4416][lat = 36][lon = 39];
Float64 North_vel[datetime = 4416][lat = 36][lon = 39];
Float64 East_err[datetime = 4416][lat = 36][lon = 39];
Float64 North_err[datetime = 4416][lat = 36][lon = 39];
Float64 err_cov[datetime = 4416][lat = 36][lon = 39];
Float64 total_err[datetime = 4416][lat = 36][lon = 39];
} usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/00_dir_HFR_agg2.ncml;
And it looks like renaming the dimension causes a failure. Godiva2 at
http://geoport-dev.whoi.edu/thredds/godiva2/godiva2.html?server=http://geoport-dev.whoi.edu/thredds/wms/usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/00_dir_HFR_agg2.ncml
throws an "error getting data from server" message,
while if I don't rename, the aggregation is okay:
http://geoport-dev.whoi.edu/thredds/godiva2/godiva2.html?server=http://geoport-dev.whoi.edu/thredds/wms/usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/00_dir_HFR_agg.ncml
@cwardgar , should I send e-mail to thredds support referencing this ticket? Not sure of the protocol anymore...
No. Check your files. I just dumped them all via opendap and one actually HAS a variable called 't'. (I'll get filename in a second.)
Might have spoken too soon... (stupid ncml files also get opened by opendap...)
@rsignell-usgs - I think github works best for potential bugs like this. Can you try renaming the dimension inside the aggregation? That worked for me using a few of the files from the server.
According to the ncml agg docs:
https://www.unidata.ucar.edu/software/thredds/v4.6/netcdf-java/ncml/Aggregation.html
"Variables of the same name (in different files) are connected along their existing, outer dimension, called the aggregation dimension. A coordinate variable must exist for the dimension."
So, in the example you have above that renames the dimension, the coordinate variable t is being created for each file, and then you rename the dimension overall. If you rename the dimension inside the aggregation, the variable datetime is recognized as the coordinate variable and no new variable t is created.
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" >
<aggregation dimName="datetime" type="joinExisting">
<scan location="." regExp="^WHOI_ISLE_HFR_[0-9]{4}_[0-9]{2}_[0-9]{2}_800mgrid_1000mrad_20-Feb-2016\.nc$"/>
<dimension name="datetime" orgName="t" />
</aggregation>
</netcdf>
Now here is a fun one... if I tell the joinExisting to use dimName="datetime" instead of dimName="t" and change nothing else, like so:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" >
<aggregation dimName="datetime" type="joinExisting">
<scan location="." regExp="^WHOI_ISLE_HFR_[0-9]{4}_[0-9]{2}_[0-9]{2}_800mgrid_1000mrad_20-Feb-2016\.nc$"/>
</aggregation>
</netcdf>
then things work as well. Since the dimension datetime does not exist but the variable does, the NcML agg creates the new dimension... I don't think it should be doing that!
In short, I think this is a bug.
Here is what I think might be going on: even though the variable datetime is a coordinate variable, the NcML aggregation code does not pick up datetime as the coordinate variable corresponding to the dimension t, and as such creates a new variable t to match the name of the dimension t.
@JohnLCaron - any of this ringing a bell, or bringing back memories of NcML aggregation nightmares?
get rid of
doesnt need to have same name,
:coordinates = "Longitude Latitude datetime";
works fine
not sure what this "variable t that didn't exist before" is yet. so i may be wrong, we may be assuming existence of a coordinate variable.
if so, try
<variable name="t" orgName="datetime" />
not
<dimension name="datetime" orgName="t" />
@lesserwhirls , awesome! I didn't know I could rename the dimension inside the aggregation tag! And I agree that creating a time coordinate variable with the same name as the dimension is a bug, since one already exists (it just isn't named the same as the dimension).
Here's the resulting very nice aggregation, using @lesserwhirls https://github.com/Unidata/thredds/issues/451#issuecomment-189032093 solution above:
@lesserwhirls should we leave this open until the bug is fixed or do you want to introduce another issue that actually more closely addresses the issue?
I think we should just leave this open, and I will try to summarize things. However, it looks like @JohnLCaron had something slightly different in mind (rather than renaming the dimension), but I'm not sure if there is a difference between renaming the dimension or renaming the variable.
So @JohnLCaron, here is what I understand the situation is:
Each netCDF file has a dimension t and an associated coordinate variable datetime, which is correctly picked up by the CoordSys tab in ToolsUI as a coordinate variable. When you do a joinExisting NcML agg, the aggregation creates a new variable t, with what appears to be a default value set for all values in the array. I assume this is done to match the dimension t, even though the (dimension <---> coordinate variable) pair is t and datetime. Note that the docs for the joinExisting NcML agg state that we assume a coordinate variable for the joinExisting dimension exists.
I'm thinking that the NcML agg does not pick up on the fact that the (dimension <---> coordinate variable) pair is t and datetime, and that it therefore does not need to create a new variable t. To me, this indicates a bug in that the joinExisting agg is actually requiring that a variable with the same name as the join dimension exists, rather than that a corresponding coordinate variable exists for the join dimension (as stated in the docs). If we rename the dimension t to datetime, or rename the variable datetime to t, things work as expected.
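To make the suspected difference concrete, here is a toy Python model of the two lookup strategies. It is not netCDF-Java code and makes no claim about how the aggregation is actually implemented: the function names are hypothetical, and the "sole 1-D variable on the dimension" rule is an assumed stand-in for whatever mechanism (for example the coordinates attribute) lets ToolsUI recognize datetime as the coordinate variable.

```python
# Toy model of one granule: each variable mapped to the dimensions it uses.
variables = {
    "datetime": ("t",),
    "Latitude": ("lat",),
    "Longitude": ("lon",),
    "East_vel": ("t", "lat", "lon"),
}


def coord_by_name(dim, variables):
    """Strict lookup: only a 1-D variable named exactly like the dimension."""
    return dim if variables.get(dim) == (dim,) else None


def coord_by_shape(dim, variables):
    """Looser lookup: fall back to the sole 1-D variable defined on the dimension."""
    strict = coord_by_name(dim, variables)
    if strict is not None:
        return strict
    candidates = [name for name, vdims in variables.items() if vdims == (dim,)]
    return candidates[0] if len(candidates) == 1 else None


# Name-based lookup finds nothing for "t", which is where a synthetic
# variable t would have to be invented.
print(coord_by_name("t", variables))   # None
# The looser lookup finds the existing coordinate variable.
print(coord_by_shape("t", variables))  # datetime

# Renaming the dimension t to datetime makes the strict lookup succeed too.
renamed = {
    name: tuple("datetime" if d == "t" else d for d in vdims)
    for name, vdims in variables.items()
}
print(coord_by_name("datetime", renamed))  # datetime
```

In this model the two workarounds from the thread (renaming the dimension or renaming the variable) both amount to making the strict, name-based lookup succeed.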
@lesserwhirls this is exactly how I understand the situation as well. :smile_cat:
agree
We have a bunch of netCDF granules here: http://geoport-dev.whoi.edu/thredds/catalog/usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/catalog.html
that we are aggregating with a very simple NcML that joins along the time dimension t:
The resulting aggregation dataset here: http://geoport-dev.whoi.edu/thredds/dodsC/usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/00_dir_HFR_agg.ncml.html seems to work fine, but we noticed that the aggregation has acquired an odd new variable t that didn't exist before.
This new variable t has some rather strange values: http://geoport-dev.whoi.edu/thredds/dodsC/usgs/data2/rsignell/gdrive/nsf-alpha/Data/WHOI-HFRadar-Data-Sets/00_dir_HFR_agg.ncml.ascii?t[0:1:47]
Is this because the time coordinate variable datetime has a different name than the time dimension t?
Is this expected behavior?