woodRock / psychic-invention

NZODN Data Ingestion Project

Datastream for Oceanographic #30

Open woodRock opened 3 years ago

woodRock commented 3 years ago

Goal

Create a new data stream for the Oceanographic data.

Tasklist

Success Criteria

There is a new datastream that works for the Oceanographic data.

woodRock commented 3 years ago

Added a collection for the oceanographic data stream.

insert into collection (id, name, description) values (4, 'oceanographic', 'NIWA oceanographic CTD data set');

woodRock commented 3 years ago

Changed the collection name within the loadOceanographic.sh script from mooring to oceanographic.

woodRock commented 3 years ago

Changed the prod.ini file to accept files of type *.dat2 instead of *.DAT3. This appears to be the new file type for data in these ingestion streams.

woodRock commented 3 years ago

Copied the SQL script from @Glenn's data-ingestors, but modified it to match the format of the new dataset. It has different fields, so these needed to be changed accordingly.

This script was added as a function that uses the PGPASSWORD environment variable. I passed the contents of that SQL script into a transaction within the function. Each time new data is added, it extracts the data from the datasource and data tables, then creates the relevant materialized views that are used by GeoServer.
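For anyone picking this up later, here is a rough sketch of the shape of that function. It is not the actual script: the real SQL creates the views from their full definitions, which are not reproduced in this thread, so this sketch assumes the views already exist and uses REFRESH MATERIALIZED VIEW as a stand-in. Connection details are assumed to come from the standard PGHOST, PGUSER and PGDATABASE environment variables.

```shell
# Sketch only. Assumes the two materialized views already exist, and that
# PGHOST, PGUSER and PGDATABASE are set in the environment alongside PGPASSWORD.
recreateMaterializedViews() {
    # psql reads the password from PGPASSWORD; stop early if it is missing.
    : "${PGPASSWORD:?PGPASSWORD must be set}"

    # ON_ERROR_STOP makes psql abort on the first failing statement, so the
    # open transaction is never committed after an error.
    psql -v ON_ERROR_STOP=1 <<'SQL'
BEGIN;
REFRESH MATERIALIZED VIEW oceanographic_data;
REFRESH MATERIALIZED VIEW oceanographic_map;
COMMIT;
SQL
}
```

Wrapping both statements in one transaction means GeoServer should not see one view updated and the other stale.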

woodRock commented 3 years ago

I have verified that this script works by removing the existing files from both the FTP and the PostgreSQL database, then running the entire ingestion process again. The files show up in the correct directory, /data/niwa/publish/oceanographic/, and are present in the database: in the datasource and data tables, as well as the oceanographic_data and oceanographic_map materialized views.
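The database half of that check could be scripted roughly as below. This is a sketch, not what was actually run: the function names are made up for illustration, connection settings are assumed to come from the PG* environment variables, and the counts cover whole tables rather than only the oceanographic collection, because the column needed to filter by collection is not shown here.

```shell
# Hypothetical helper: print the row count of one table or view.
# -A (unaligned) and -t (tuples only) make psql print just the number.
countRows() {
    psql -v ON_ERROR_STOP=1 -At -c "SELECT count(*) FROM $1;"
}

# Hypothetical check: fail if any relation named in the comment above is empty.
verifyOceanographicIngestion() {
    local status=0 relation rows
    for relation in datasource data oceanographic_data oceanographic_map; do
        rows="$(countRows "$relation")" || return 1
        if [ "$rows" -gt 0 ]; then
            echo "ok: $relation has $rows rows"
        else
            echo "EMPTY: $relation has no rows"
            status=1
        fi
    done
    return "$status"
}
```

A non-zero exit status means at least one table or view came back empty after the ingestion run.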

woodRock commented 3 years ago

Once #31 is fixed, we will be able to test whether the data stream has been ingested correctly into NZODN.

woodRock commented 3 years ago

Edited the script to update the materialized views only once per ingestion, at the very end, rather than once for each data file. This avoids unnecessary repetition. The recreateMaterializedViews() function is now called only once within loadAllOceanographicData.sh, after all the database population has been completed.
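The control flow now looks roughly like the sketch below. The two helper bodies are placeholders that only echo, so the example runs on its own. loadOceanographicFile and the directory argument are hypothetical names for illustration, and skipping the refresh when no files were loaded is an assumption of this sketch.

```shell
# Placeholder for the real per-file load step (not shown in this thread).
loadOceanographicFile() {
    echo "loading $1"
}

# Placeholder for the real function that rebuilds the GeoServer views.
recreateMaterializedViews() {
    echo "refreshing materialized views"
}

loadAllOceanographicData() {
    local data_dir="$1" file loaded=0

    for file in "$data_dir"/*.dat2; do
        # If the glob matched nothing, the literal pattern is left; skip it.
        [ -e "$file" ] || continue
        loadOceanographicFile "$file" || return 1
        loaded=$((loaded + 1))
    done

    # Rebuild the views once, after every file has been loaded.
    if [ "$loaded" -gt 0 ]; then
        recreateMaterializedViews
    fi
}
```

With N data files this does N loads and one view rebuild, instead of N rebuilds.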