aodn / content

Tracks AODN Portal content and configuration issues
0 stars 0 forks source link

Content issues identified through data reports #290

Open xhoenner opened 7 years ago

xhoenner commented 7 years ago

Will update this issue as I review the summary data reports on a monthly basis, refer to the latest set of reports for more details, in /Volumes/research/IMOS/emiiSheryl/eMII_data_report/Data_Reports.

@jachope, @smancini, @mhidas, @ggalibert, @bpasquer, @lbesnard feel free to add stuff in as you see fit. Note though that improvements to data reports will be listed here.

ggalibert commented 7 years ago
  • ACORN: null station code for radials. Misspelling of Turquoise Coast site code

Cannot find any trace of a radial with platform_code set to null.

xhoenner commented 7 years ago

@ggalibert There's an inconsistency in the site_code field as demonstrated by the output of the following query:

select distinct site_code FROM acorn_radial_nonqc.acorn_radial_nonqc_timeseries_url
site_code
"TURQ, Turquoise Coast"
"TURQ"
"CBG"
"SAG"
"ROT"
"BONC"
"COF"
ggalibert commented 7 years ago

This inconsistency: "TURQ, Turquoise Coast", has been introduced the 2017-06-01 by ACORN UWA when they started to produce the SeaSonde radial FV00 netCDF files themselves (without passing through SSR on our RT processing server). Have written to Simone and Andrew to see if they could update their script. If so then I would manually correct the historical inconsistent files.

ggalibert commented 7 years ago

Other inconsistencies I'm putting there for the record:

select distinct site_code
from acorn_hourly_avg_qc.acorn_hourly_avg_qc_timeseries_url
site_code
"ROT, Rottnest Shelf"
"SAG, South Australian Gulf"
"SAG, South Australia Gulf"
"BONC, Bonney Coast"
"TURQ, Turqoise Coast"
"COF, Coffs Harbour"
"CBG, Capricorn Bunker Group"
select distinct site_code
from acorn_hourly_avg_nonqc.acorn_hourly_avg_nonqc_timeseries_url
site_code
"ROT, Rottnest Shelf"
"TURQ, Turqoise Coast"
"BONC, Bonney Coast"
"SAG, South Australia Gulf"
"COF, Coffs Harbour"
"CBG, Capricorn Bunker Group"
select distinct site_code
from acorn_radial_qc.acorn_radial_qc_timeseries_url
site_code
"CBG"
"ROT"
"SAG"
"COF , Coffs Harbour (NSW), North Nambucca (NSW)"
"COF , Coffs Harbour (NSW), Red Rock (NSW)"
"COF"

Would that be affecting your reporting as well?

xhoenner commented 7 years ago

Would that be affecting your reporting as well?

Thanks for digging in a bit deeper @ggalibert. Yes any inconsistency in the site_code field will have an impact on the data reports, see for example the 'Hourly Vectors - QC' section in /Volumes/research/IMOS/emiiSheryl/eMII_data_report/Data_Reports/2017/Aug_2017/Ocean_Radar/OceanRadar_hourlyVectors_Summary.pdf

ggalibert commented 7 years ago

ACORN UWA has updated their scripts and I have just fixed this inconsistency: "TURQ, Turquoise Coast" instead of "TURQ" for historical SeaSonde radial FV00 netCDF files.

select distinct site_code FROM acorn_radial_nonqc.acorn_radial_nonqc_timeseries_url
site_code
"TURQ"
"CBG"
"SAG"
"ROT"
"BONC"
"COF"
ggalibert commented 7 years ago

Also, fixed this inconsistency: "SAG, South Australia Gulf" instead of "SAG, South Australian Gulf".

select distinct site_code
from acorn_hourly_avg_qc.acorn_hourly_avg_qc_timeseries_url
site_code
"ROT, Rottnest Shelf"
"SAG, South Australian Gulf"
"BONC, Bonney Coast"
"TURQ, Turqoise Coast"
"COF, Coffs Harbour"
"CBG, Capricorn Bunker Group"
xhoenner commented 7 years ago

Thanks @ggalibert for looking into this. Still some issues with the following:

1. acorn_radial_qc

select distinct site_code, platform_code from acorn_radial_qc.acorn_radial_qc_timeseries_url order by site_code, platform_code;

which returns "COF, Coffs Harbour (NSW), North Nambucca (NSW)" instead of "COF" for platform_code 'NNB' and 'RRK'.

2. acorn_hourly_avg_nonqc

select distinct site_code from acorn_hourly_avg_nonqc.acorn_hourly_avg_nonqc_timeseries_url

which returns "TURQ, Turqoise Coast" instead of "TURQ, Turquoise Coast" and "SAG, South Australia Gulf" instead of "SAG, South Australian Gulf"

3. acorn_hourly_avg_qc

select distinct site_code from acorn_hourly_avg_qc.acorn_hourly_avg_qc_timeseries_url

which returns "TURQ, Turqoise Coast" instead of "TURQ, Turquoise Coast"

ggalibert commented 7 years ago

@xhoenner

  1. getting there, currently uploading corrected files. There are 92 076 of them and it takes 5 sec to harvest 1 file...

  2. and 3. These are less priority / annoying problems. Basically just a typo which at least is consistent within the whole collection.

ggalibert commented 7 years ago

1. acorn_radial_qc fixed:

select distinct site_code, platform_code from acorn_radial_qc.acorn_radial_qc_timeseries_url order by site_code, platform_code;
site_code platform_code
"CBG" "LEI"
"CBG" "TAN"
"COF" "NNB"
"COF" "RRK"
"ROT" "FRE"
"ROT" "GUI"
"SAG" "CSP"
"SAG" "CWI"
leonardolaiolo commented 3 years ago

I started investigating this and I can see in DB (acorn_radial_qc shema) that there are 6 new site_code than the ones reported in the SQL query. I believe 2 of them (CORL and NEWC) are new sites and should be added in the SQL, as they are in DB and in geonetwork consistently registered as sites. I do no know what to do with the following site_code:

- COF, Coffs Harbour (NSW), Red Rock (NSW)
- COF, Coffs Harbour (NSW), North Nambucca (NSW)
- NWS
- “NWS”

@ggalibert I'm not sure what to do with the 4 site_code above as I cannot see them in the metadata registered as sites. Are these error in DB? I don't know much of these data but could be that this list above of site_code are actually platform_code? Here the query to check all site_code in DB:

select distinct site_code FROM acorn_radial_nonqc.acorn_radial_nonqc_timeseries_url
Here a table to summarise all site_code found in DB and what I found in Geonetwork: site_code geonetwork
TURQ Turquoise Coast (TURQ)
SAG South Australia Gulfs (SAG)
“NWS” ???
COF, Coffs Harbour (NSW), Red Rock (NSW) ???
COF Coffs Harbour (COF)
CORL Coral Coast (CORL)
CBG Capricorn Bunker Group (CBG)
ROT Rottnest Shelf (ROT)
COF, Coffs Harbour (NSW), North Nambucca (NSW) ???
NWS ???
BONC Bonney Coast (BONC)
NEWC Newcastle (NEWC)
atkinsn commented 3 years ago

NWS = NWA: Northwest Shelf