HakaiInstitute / cde

https://explore.cioos.ca

Datasets wall of shame #166

Closed n-a-t-e closed 2 years ago

n-a-t-e commented 3 years ago

I shared this with the other ERDDAP admins in CIOOS (in a friendly way; I didn't call it "wall of shame").

CIOOS Datasets and how they could be improved for CIOOS Data Explorer

For information on adding preciseLat/Lon variables, see https://coastwatch.pfeg.noaa.gov/erddap/download/setupDatasetsXml.html#cdmTimeSeries

hakai

https://catalogue.hakai.org/erddap/tabledap/HakaiColumbiaFerryResearch.html Trajectory (not supported)

https://catalogue.hakai.org/erddap/tabledap/HakaiKetchikanBoL5min.html no data

https://catalogue.hakai.org/erddap/tabledap/HakaiSewardBoL5min.html no data

https://catalogue.hakai.org/erddap/tabledap/HakaiSitkaBoL5min.html no data

cioosatlantic

https://cioosatlantic.ca/erddap/tabledap/9qw2-yb2f.html Point (could be TimeSeries and use timeseries_id=waterbody_station)

https://cioosatlantic.ca/erddap/tabledap/a9za-3t63.html Point (could be TimeSeries and use timeseries_id=waterbody_station)

https://cioosatlantic.ca/erddap/tabledap/adpu-nyt8.html Point (could be TimeSeries and use timeseries_id=waterbody_station)

https://cioosatlantic.ca/erddap/tabledap/cmar_8f10_9c65_13cb.html Point (could be TimeSeries, timeseries_id=buoy_name or buoy_id)

https://cioosatlantic.ca/erddap/tabledap/cmar_c5a5_c41c_2090.html Point (could be TimeSeries, timeseries_id=buoy_name or buoy_id)

https://cioosatlantic.ca/erddap/tabledap/cmar_fca0_698a_0716.html Point (could be TimeSeries, timeseries_id=buoy_name or buoy_id)

https://cioosatlantic.ca/erddap/tabledap/coast-of-bays-hydrographic-2009-2013.html Point (could be TimeSeries, timeseries_id=station, or TimeSeriesProfile with event as profile_id?)

https://cioosatlantic.ca/erddap/tabledap/coastal_action_7471_9460_c025.html Point (could be TimeSeries, timeseries_id=location)

https://cioosatlantic.ca/erddap/tabledap/eb3n-uxcb.html Point (could be TimeSeries and use timeseries_id=waterbody_station, add preciseLat/Lon)

https://cioosatlantic.ca/erddap/tabledap/eda5-aubu.html Point (could be TimeSeries and use timeseries_id=waterbody_station, add preciseLat/Lon)

https://cioosatlantic.ca/erddap/tabledap/knwz-4bap.html Point (could be TimeSeries and use timeseries_id=waterbody_station, add preciseLat/Lon)

https://cioosatlantic.ca/erddap/tabledap/mq2k-54s4.html Point (could be TimeSeries and use timeseries_id=waterbody_station, add preciseLat/Lon)

https://cioosatlantic.ca/erddap/tabledap/v6sa-tiit.html Point (could be TimeSeries and use timeseries_id=waterbody_station, add preciseLat/Lon)

https://cioosatlantic.ca/erddap/tabledap/wpsu-7fer.html Point (could be TimeSeries and use timeseries_id=waterbody_station, add preciseLat/Lon)

https://cioosatlantic.ca/erddap/tabledap/x9dy-aai9.html Point (could be TimeSeries and use timeseries_id=waterbody_station, add preciseLat/Lon)

https://cioosatlantic.ca/erddap/tabledap/NL_Climate_Index_all_fields.html Other (not supported)

https://cioosatlantic.ca/erddap/tabledap/NL_Climate_Index_natrual_signs.html Other (not supported)

https://cioosatlantic.ca/erddap/tabledap/NL_Climate_Index.html Other (not supported)

cioospacific

https://data.cioospacific.ca/erddap/tabledap/DFO_MEDS_BUOYS.html Point (could be TimeSeries, add preciseLat/Lon)

https://data.cioospacific.ca/erddap/tabledap/DFO_OPP_BUOYS.html Point (could be TimeSeries, add preciseLat/Lon)

https://data.cioospacific.ca/erddap/tabledap/IOS_P26_Annualized.html Other (not supported)

https://data.cioospacific.ca/erddap/tabledap/IYS_2019_nutrients_O2.html Point (could be TimeSeries and use timeseries_id=Station)

https://data.cioospacific.ca/erddap/tabledap/IYS_2019_POM.html Point (could be TimeSeries and use timeseries_id=station)

https://data.cioospacific.ca/erddap/tabledap/IYS_NISKIN_chl_phaeo.html Point (could be TimeSeries and use timeseries_id=station)

ogsl

https://erddap.ogsl.ca/erddap/tabledap/binned_8f48_5f3e_04f3.html Point (could be TimeSeries, timeseries_id=station_id)

https://erddap.ogsl.ca/erddap/tabledap/in-situ_8f48_5f3e_04f3.html Point (could be TimeSeries, timeseries_id=station_id)

https://erddap.ogsl.ca/erddap/tabledap/tabular_nc_0a89_74aa_5dea.html Point (could be TimeSeries, timeseries_id=station_id)

https://erddap.ogsl.ca/erddap/tabledap/data_a451_b757_99f1.html Other (not supported)

smartatlantic

https://www.smartatlantic.ca/erddap/tabledap/cmar_cioos_299_jordan_bay.html Point (could be TimeSeries, timeseries_id=buoy_name or buoy_id)

https://www.smartatlantic.ca/erddap/tabledap/cmar_cioos_333_chedabucto_bay.html Point (could be TimeSeries, timeseries_id=buoy_name or buoy_id)

https://www.smartatlantic.ca/erddap/tabledap/cmar_cioos_334_st_marys_bay.html Point (could be TimeSeries, timeseries_id=buoy_name or buoy_id)

https://www.smartatlantic.ca/erddap/tabledap/DFO_Sutron_POOLC.html station_name should be constant
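The categories used in the list above (Point, Trajectory, Other) come from each dataset's `cdm_data_type` global attribute. As a rough sketch of how such a list could be produced, the snippet below reads that attribute from the JSON that ERDDAP serves at `/erddap/info/<datasetID>/index.json`. The function names and the wording of the returned notes are made up for illustration, and this is not the Data Explorer's actual harvesting code.

```python
from typing import Optional


def global_attribute(info: dict, name: str) -> Optional[str]:
    """Return a global (NC_GLOBAL) attribute from parsed ERDDAP info JSON, or None."""
    table = info["table"]
    cols = table["columnNames"]
    i_type = cols.index("Row Type")
    i_var = cols.index("Variable Name")
    i_attr = cols.index("Attribute Name")
    i_val = cols.index("Value")
    for row in table["rows"]:
        if (row[i_type] == "attribute"
                and row[i_var] == "NC_GLOBAL"
                and row[i_attr] == name):
            return row[i_val]
    return None


def triage(info: dict) -> Optional[str]:
    """Return a short note for dataset types flagged in this thread, else None.

    None only means "not flagged by this sketch"; it does not mean the
    dataset is known to be supported.
    """
    cdm = global_attribute(info, "cdm_data_type")
    if cdm == "Point":
        return "Point (could be TimeSeries)"
    if cdm in ("Trajectory", "Other"):
        return f"{cdm} (not supported)"
    return None


# Hand-written sample in the shape of ERDDAP's info JSON (not real server output)
sample = {
    "table": {
        "columnNames": ["Row Type", "Variable Name", "Attribute Name",
                        "Data Type", "Value"],
        "rows": [
            ["attribute", "NC_GLOBAL", "cdm_data_type", "String", "Point"],
        ],
    }
}
print(triage(sample))
```

In practice `info` would be the parsed response from one of the ERDDAP servers listed above; the sample dict here just avoids a network call.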

The following smartatlantic datasets either have no ocean data (land stations) or have missing or incorrect standard_names, so CEDA can't find any supported variables:

eccc_opp_44488_east_chedabucto_bay, eccc_opp_44489_west_chedabucto_bay, eccc_opp_44490_west_bay_of_fundy, SMA_Fortune_Bay_Buoy, DFO_Sutron_DOGIS, SMA_halifax, SMA_halifax_fairview, SMA_halifax_anemometer1, SMA_head_of_placentia_bay, SMA_Holyrood_Buoy2, SMA_holyrood_wharf, SMA_manolis_buoy, eccc_opp_atlantic, mun_glider_nunkaysa_pacific_2012, mun_glider_pearldiver_north_gulf_of_st_lawrence_2019, mun_glider_data_pearldiver_labrador_sea_2019, mun_glider_pearldiver_labrador_shelf_2014, mun_glider_scidaana_pacific_2012, mun_glider_unit_048_fortune_bay_2012, mun_glider_unit_049_newfoundland_shelf_2006, mun_glider_unit_334_labrador_shelf_2014, mun_glider_unit_472_trinity_bay_2014, mun_glider_unit_472_trinity_bay_2015, mun_glider_unit_473_labrador_sea_2016, mun_glider_unit_473_labrador_shelf_2014, mun_glider_unit_473_trinity_bay_2014, mun_glider_unit_473_trinity_bay_2016, mun_glider_unit_473_trinity_bay_2018, sma_negl_black_tickle_nlqu0003, sma_negl_cartwright_junction_nlqu0004, sma_negl_north_west_river_nlqu0007, sma_negl_postville_nlqu0001, sma_negl_red_bay_nlqu0005, sma_negl_rigolet_nlqu0002, SMA_red_island_shoal, DFO_Sutron_KLUMI, SMA_saint_john, SMA_st_johns, SmartAtlantic_XEOS_hk4_buoy, SmartAtlantic_XEOS_hkb_buoy
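A check like the one described above could be sketched as follows: collect each variable's `standard_name` from the dataset's ERDDAP info JSON and keep only the variables whose name is in a supported set. The `SUPPORTED` set below is a two-item placeholder chosen for illustration (both are real CF standard names, but the thread does not say which names the explorer actually accepts), and the function names are hypothetical.

```python
# Placeholder only: the real list of supported standard names is not given here
SUPPORTED = {"sea_water_temperature", "sea_water_practical_salinity"}


def standard_names(info: dict) -> dict:
    """Map variable name -> standard_name for variables that declare one."""
    table = info["table"]
    cols = table["columnNames"]
    i_type = cols.index("Row Type")
    i_var = cols.index("Variable Name")
    i_attr = cols.index("Attribute Name")
    i_val = cols.index("Value")
    return {
        row[i_var]: row[i_val]
        for row in table["rows"]
        if row[i_type] == "attribute" and row[i_attr] == "standard_name"
    }


def supported_variables(info: dict, supported: set = SUPPORTED) -> list:
    """Return the sorted variable names whose standard_name is in `supported`."""
    names = standard_names(info)
    return sorted(var for var, sn in names.items() if sn in supported)


# Hand-written sample resembling a land station: nothing ocean-related
land_station = {
    "table": {
        "columnNames": ["Row Type", "Variable Name", "Attribute Name",
                        "Data Type", "Value"],
        "rows": [
            ["variable", "air_temp", "", "float", ""],
            ["attribute", "air_temp", "standard_name", "String",
             "air_temperature"],
            ["variable", "wind_spd", "", "float", ""],
        ],
    }
}
print(supported_variables(land_station))
```

An empty result corresponds to the situation in the list above: the dataset has either no ocean variables or no usable standard_names.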

n-a-t-e commented 3 years ago

I'm not sure whether to recommend adding preciseLat/Lon variables. It is best practice, but I don't think it affects CEDA.

raytula commented 3 years ago

Re: the real-time/provisional BoL datasets: these should be excluded from the Data Explorer. I believe there are 5 such datasets, including the three you listed (Quadra and Baynes Sound, for example, are missing from your list). The three you listed are from sites that have been offline for quite a while and could not be serviced due to COVID and/or other issues. However, even when they are back online, the 60 days of data shared via these datasets is really only intended to serve dashboards like the Baynes Sound monitor and the AOOS/NANOOS web pages (the two US IOOS RAs north and south of BC). Wiley has been pretty clear that this data should not be downloadable, as it's relatively poor quality and I believe there have been examples of misuse.

n-a-t-e commented 3 years ago

Good to know. I think we will need to add a flag (an ERDDAP dataset global attribute) like IOOS does with gts_ingest. It could be something like cde_ingest or cioos_ingest.
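A minimal sketch of how a harvester might honour such a flag, assuming the attribute ends up being called `cde_ingest` and holds the strings "true" or "false" in the same way as `gts_ingest`. Both the name and the default (ingest when the attribute is absent) are assumptions for illustration; nothing in this thread settles them.

```python
def should_ingest(global_attrs: dict, flag: str = "cde_ingest") -> bool:
    """Decide whether a dataset should be harvested.

    `global_attrs` is a plain dict of the dataset's global attributes.
    Assumption: a dataset is ingested unless the flag is explicitly "false",
    so existing datasets without the attribute keep working.
    """
    value = str(global_attrs.get(flag, "true")).strip().lower()
    return value != "false"


# A provisional real-time dataset opts out; an unflagged dataset is kept
print(should_ingest({"cdm_data_type": "TimeSeries", "cde_ingest": "false"}))
print(should_ingest({"cdm_data_type": "TimeSeries"}))
```

Defaulting to ingest keeps the change opt-out, so only datasets like the provisional BoL ones would need the new attribute. An opt-in default would be the stricter alternative.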

n-a-t-e commented 2 years ago

After taking in all the GOOS EOVs, and with more standard names getting set, the list of 'bad' datasets is getting smaller:

ONC:

SLGO:

CIOOS Atlantic

Smartatlantic

DFO_Sutron_NHARB - station name shouldn't have date in it

Missing standard names:

n-a-t-e commented 2 years ago

This is mostly out of date