aodn / harvesters

Harvesters
GNU General Public License v3.0

Talend data bag keys #154

Closed jkburges closed 8 years ago

jkburges commented 9 years ago

I'm noting the list of unique data bag keys in the chef/talend directory, w.r.t. https://github.com/aodn/harvesters/wiki/Proposed-deployment-process-changes-for-talend-harvesters

$ cd ...data_bags/talend
$ cat * | egrep "id|artifact_id|params|{|}" -v | awk -F'"' '{ print $2 }' | sort | uniq -i
DMSchema
DestinationDB_Login
DestinationDB_Password
DestinationDB_Port
DestinationDB_Schema
DestinationDatabase_Login
DestinationDatabase_Password
DestinationDatabase_Port
DestinationDatabase_Schema
DestinationDb_Login
DestinationDb_Password
DestinationDb_Port
DestinationDb_Schema
Destination_Login
Destination_Password
Destination_Port
Destination_Schema
HarvestDb_Login
HarvestDb_Password
HarvestDb_Port
HarvestDb_Schema
InventoryDb_Login
InventoryDb_Password
InventoryDb_Port
InventoryDb_Schema
LegacyDb_Login
LegacyDb_Password
LegacyDb_Port
LegacyDb_Schema
LegacyDb_Server
LegacySchema
Metadata_SpatialColumn
Metadata_SpatialColumn_Argos
Metadata_SpatialColumn_Dives
Metadata_SpatialColumn_Haulout
Metadata_SpatialColumn_NRT
Metadata_SpatialColumn_Summary
Metadata_SpatialColumn_dm
Metadata_SpatialColumn_rt
Metadata_SpatialResolution
Metadata_SpatialResolution_Argos
Metadata_SpatialResolution_Dives
Metadata_SpatialResolution_Haulout
Metadata_SpatialResolution_NRT
Metadata_SpatialResolution_Summary
Metadata_SpatialResolution_dm
Metadata_SpatialResolution_rt
Metadata_SpatialTable
Metadata_SpatialTable_Argos
Metadata_SpatialTable_Dives
Metadata_SpatialTable_Haulout
Metadata_SpatialTable_NRT
Metadata_SpatialTable_Summary
Metadata_SpatialTable_biomass
Metadata_SpatialTable_chemistry
Metadata_SpatialTable_dm
Metadata_SpatialTable_phypig
Metadata_SpatialTable_phytoplankton
Metadata_SpatialTable_picoplankton
Metadata_SpatialTable_rt
Metadata_SpatialTable_tss
Metadata_SpatialTable_zooplankton
Metadata_UUID
Metadata_UUID_Argos
Metadata_UUID_Dives
Metadata_UUID_Haulout
Metadata_UUID_NRT
Metadata_UUID_Summary
Metadata_UUID_biomass
Metadata_UUID_chemistry
Metadata_UUID_dm
Metadata_UUID_phypig
Metadata_UUID_phytoplankton
Metadata_UUID_picoplankton
Metadata_UUID_rt
Metadata_UUID_tss
Metadata_UUID_zooplankton
Origin_Login
Origin_Password
Origin_Port
Origin_Schema
Origin_Server
ReportSchema
ReportingDatabase_Login
ReportingDatabase_Password
ReportingDatabase_Port
ReportingDatabase_Schema
Threddscatalog
VisualisationDatabase_Login
VisualisationDatabase_Password
VisualisationDatabase_Port
VisualisationDatabase_Schema
WfsServerURL
aatams_sattag_nrt_Schema
checkQC
day
directories
exclude
exclude_10secs
exclude_1sec
exclude_moorings
exclude_transects
ftp_mirror_dir
ftp_mirror_password
ftp_mirror_site
ftp_mirror_user
hour
include
include_data
include_metadata
include_reporting
jobName
keepDeletedFiles
latFiller
latMax
latMin
latQcVariable
latVariable
lftp
logDir
lonFiller
lonMax
lonMin
lonQcVariable
lonVariable
metadataDb_Login
metadataDb_Password
metadataDb_Port
metadataDb_RT_Login
metadataDb_RT_Password
metadataDb_RT_Port
metadataDb_RT_Schema
metadataDb_Schema
minute
onlyInclude
output_tmp_file
packages
paramFile
sourceCatalog
url
weekday

jkburges commented 9 years ago

I would really like the following to be discussed at the meeting but unfortunately I can't make it...

I am strongly in favour of not introducing a separate git repo for talend data bags, due to the complexity it would introduce (in terms of SCM workflow). Instead, I think the config that currently resides there could be split into things that can remain chef-managed and things that should go in harvester-specific config (i.e. this repo).

Things which remain in chef should change rarely, whilst harvester config can be allowed to change as part of normal harvester development and deployment workflow, since it would be in this repo.
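
To make the proposed split concrete, here is a rough sketch that sorts keys into "chef" or "harvester" purely by naming pattern. The function name `classify_key` and the suffix list are assumptions for illustration, based only on the key listing above; `_Schema` keys deliberately fall through to "harvester" on the assumption that schema names could be derived by convention.

```shell
# Hypothetical helper: classify a data bag key by its suffix alone.
# The suffix list is an assumption drawn from the key listing in this issue,
# not an agreed rule.
classify_key() {
  case "$1" in
    *_Login|*_Password|*_Port|*_Server)
      echo "chef" ;;       # connection details: expected to change rarely
    *)
      echo "harvester" ;;  # everything else: candidate for harvester config
  esac
}

# Example: classify a few keys taken from the list above.
for key in DestinationDb_Login Origin_Server Metadata_UUID_Argos latMax; do
  printf '%s\t%s\n' "$key" "$(classify_key "$key")"
done
```

A real split would need someone who knows the harvesters to review each key; suffix matching is only a first pass.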

I haven't really had much to do with deploying harvesters etc., but my guess is that, in general, the DB connection details shouldn't change very often, and if we use a "convention over configuration" approach for schema naming, then new harvesters wouldn't require talend data bag changes either.
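
As an illustration of what "convention over configuration" could mean for schema naming, the sketch below derives a schema name from a harvester name. The rule itself (lowercase, anything outside `a-z0-9_` becomes `_`) and the function name `schema_for_harvester` are assumptions, not an existing convention in this repo.

```shell
# Hypothetical convention: the schema name is the harvester name, lowercased,
# with every character outside [a-z0-9_] replaced by "_".
schema_for_harvester() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9_' '_'
}

# Example with a made-up harvester name.
schema_for_harvester "Aatams-Sattag-NRT"
echo
```

With a rule like this, a new harvester would get its schema name without anyone adding a `*_Schema` key to a data bag.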

Other than that, file paths and whatnot could probably be a combination of chef-managed root locations with harvester-defined relative paths.
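
A minimal sketch of that combination, assuming chef supplies a root directory and the harvester supplies a relative path. The function name `resolve_path` and the example paths are hypothetical.

```shell
# Hypothetical helper: join a chef-managed root with a harvester-defined
# relative path. The body runs in a subshell so root/rel do not leak out.
resolve_path() (
  root=${1%/}   # strip one trailing slash from the root, if present
  rel=${2#/}    # strip one leading slash from the relative part, if present
  printf '%s/%s\n' "$root" "$rel"
)

# Example: the root would come from chef, the relative part from this repo.
resolve_path "/mnt/harvester-data/" "logs/my_harvester"
```

The harvester never needs to know the absolute location, so moving the root stays a chef-only change.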

I have no idea what all the other config params are (!) but do they really need to be chef managed or would it make more sense for them to "live" with the harvester code (i.e. this repo)?