@adborden and I reviewed the docker-compose PR and will incorporate --fail
on curl, and test again end-to-end.
There are a few steps from the earlier issue to account for:
- `log_message` messages to stdout or stderr so Cloud Foundry logging can capture them
- `localhost:3000`
- Setting `max_`-related filesizes to arbitrarily low limits, and then assuring things work on the S3 logic path
- sample php files and references to `env` -- let's just use `development` to ensure adequate logging

Adding a note here on streaming logs, which I'll come back to after making sure the docker-compose branch works in the CI environment (w/ Ansible).
Most PHP web apps will treat STDOUT as what gets passed to the web server and back to the client, which is why the `log_message` and `write_log` functions of CodeIgniter write to a specific path in `application/logs`. We should be able to use the fpm.d tricks at https://github.com/cloudfoundry/php-buildpack/issues/256 and/or https://github.com/cloudfoundry/php-buildpack/issues/202 to set FPM_CATCH_WORKERS_OUTPUT to yes. We'll also need to extend the CI_Log class with MY_Log and override `write_log` with our own function that looks for `$config['log_stdout']` and writes to STDOUT.
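A minimal sketch of that MY_Log override, assuming a custom `$config['log_stdout']` flag; it writes to the process error stream (rather than literal STDOUT) so PHP-FPM's catch_workers_output can pick it up without polluting the HTTP response, and it skips threshold handling for brevity:

```php
<?php
// application/libraries/MY_Log.php -- sketch only, not the final implementation.
defined('BASEPATH') OR exit('No direct script access allowed');

class MY_Log extends CI_Log {

    protected $log_stdout = FALSE;

    public function __construct()
    {
        parent::__construct();
        $config =& get_config();
        $this->log_stdout = ! empty($config['log_stdout']);
    }

    public function write_log($level, $msg)
    {
        if ($this->log_stdout)
        {
            // Write to the worker's error stream so FPM (with workers' output
            // catching enabled) forwards it to the Cloud Foundry log stream.
            file_put_contents(
                'php://stderr',
                strtoupper($level).' - '.date('Y-m-d H:i:s').' --> '.$msg.PHP_EOL
            );
            return TRUE;
        }

        // Otherwise keep CodeIgniter's default file-based logging.
        return parent::write_log($level, $msg);
    }
}
```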
I have Ansible and test-kitchen working locally and in the CI environment, so I'm trying to reconcile how we configure the dashboard between BSP/Ansible, Docker, and cloud.gov. The WIP branch at docker-compose-no-templating moves all the application variables to env_vars. The `application/config/*.php` files are all pulled from the datagov-deploy templates dir, and then use env vars (a rough sketch of that pattern follows below) with the expectation that:

Tuesday I should be able to test, and then we can get the BSP env running on the new branch.
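For illustration, the env_vars pattern would look roughly like this inside one of the `application/config/*.php` files; the variable names here are made up, not necessarily the ones used in the branch:

```php
<?php
// Illustrative only -- pull config from the environment so BSP/Ansible,
// docker-compose, and cloud.gov can all supply the same variables.
$config['base_url']      = getenv('BASE_URL') ?: '';
$config['log_threshold'] = (getenv('LOG_THRESHOLD') !== false) ? (int) getenv('LOG_THRESHOLD') : 1;
$config['log_stdout']    = filter_var(getenv('LOG_STDOUT'), FILTER_VALIDATE_BOOLEAN);
```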
Some observations about use of S3:
S3 is used in a few places:
The `archive_file` function will use `config['use_local_storage']` anytime it's called, but the logic doesn't apply to `datajson_lines`; the function has a specific exception for that.
S3 is also used in the `office_detail` view and the office controller to build URLs that reference the objects in the archive.
This will require 2 buckets, since they have different ACLs attached.
Also, not all uploads use S3, nor do all uses of archive, when `use_local_storage` is false; S3 is only used in these particular cases. So json validation uses `upload` but doesn't pay attention to the `use_local` setting, for example.
About some of the files:

- `public function csv_to_json($schema = null)` calls `archive_file`, which puts a fetch date in the URL.
- `archive_file`, which calls `archive_to_s3`
- `archive_to_s3`, which calls `put_to_s3` and stores with a PUBLIC ACL (see the sketch below)
- `put_to_s3`, which stores private by default
- `config/s3_bucket`
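To make the public vs. private distinction concrete, here's a hedged illustration using the AWS SDK's `putObject`; the bucket and key names are made up, and this is not the dashboard's actual `put_to_s3` code:

```php
// Illustration only.
$s3->putObject([
    'Bucket' => 'example-archive-bucket',
    'Key'    => 'datagov/dashboard/archive/example.json',
    'Body'   => $body,
    'ACL'    => 'public-read',   // the archive path needs publicly readable objects
]);

// Omitting 'ACL' (or passing 'private') keeps the object private,
// which is the put_to_s3 default described above.
```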
I'm opening a PR since this is a good merge point before I start getting my hands into the PHP code. I've made a draft PR and need to fix CircleCI and some other wonkiness, then I'll let you know when I'm ready.
I'm going to flesh out what I've done and what needs to be done, then use my remaining hours this week, Monday, and half of Tuesday to actually do those things.
Also, I'm pulling together a few notes for our conversation with Aidan tomorrow.
Current considerations for migrating to cloud.gov:
Logging

`log_message` output is not kept, as `config['log_threshold']` is set to 0 (zero) in BSP. This means you don't get any code-level errors, like:
ERROR - 2019-10-01 12:21:29 --> Severity: Warning --> fgetcsv() expects parameter 1 to be resource, boolean given /home/vcap/app/htdocs/application/helpers/csv_helper.php 44
ERROR - 2019-10-01 12:21:29 --> Severity: Warning --> Invalid argument supplied for foreach() /home/vcap/app/htdocs/application/helpers/csv_helper.php 48
ERROR - 2019-10-01 12:21:29 --> Severity: Notice --> Undefined variable: row_new /home/vcap/app/htdocs/application/helpers/csv_helper.php 52
We need `log_message` to write to STDOUT (per the MY_Log override discussed above).

S3 updates to use authenticators
Explicitly setting credentials, e.g.:
$credentials = new Aws\Credentials\Credentials('key', 'secret');
$s3 = new Aws\S3\S3Client([
'version' => 'latest',
'region' => 'us-west-2',
'credentials' => $credentials
]);
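On cloud.gov those values would come from a bound S3 service instance rather than hard-coded keys. A sketch, assuming the usual `VCAP_SERVICES` layout for the S3 service broker (the service label, index, and field names should be verified against the actual environment):

```php
<?php
// Sketch: build the client from the bound service's credentials.
$vcap  = json_decode(getenv('VCAP_SERVICES'), true);
$creds = $vcap['s3'][0]['credentials'];   // label/index assumed, not verified

$s3 = new Aws\S3\S3Client([
    'version'     => 'latest',
    'region'      => $creds['region'],
    'credentials' => new Aws\Credentials\Credentials(
        $creds['access_key_id'],
        $creds['secret_access_key']
    ),
]);
```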
S3 Sync all archives https://s3.amazonaws.com/bsp-ocsit-dev-east-appdata/datagov/dashboard/archive/* to the S3 bucket for cloud.gov. Resources:
Set up cron jobs:

- `index.php campaign status cfo-act download` (takes about 5m)
- `index.php campaign status cfo-act full-scan` (takes a lot longer)
- The `cronish` table in the database has not had any updates in 24 hours.
- A `cronish` app that just does the `download` job and then sleeps for 23h (see the sketch after this list).
- A `worker` process keeps running regardless of the stability of the `web` process, but uses the same droplet build.
- `php index.php crawl | tee /var/log/dashboard-cron.log | mail -s 'LabsData - Dashboard' default_email`
- The `worker` pattern for the crawl labels all crawl logs with a worker id, e.g.: `[APP/PROC/WORKER/0] OUT Attempting to request http://www.sba.gov/data.json`
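A sketch of the `cronish` idea from the list above: a long-running process that reuses the existing CLI job and then sleeps. Only the command and the 23h figure come from the notes above; the rest is illustrative:

```php
<?php
// cronish.php -- illustrative sketch, not an actual file in the repo.
// Run as a separate CF process so the crawl doesn't depend on the web process.
while (true) {
    // Reuse the existing CodeIgniter CLI entry point for the download job.
    passthru('php index.php campaign status cfo-act download', $exit_code);
    echo 'download job finished with exit code '.$exit_code.PHP_EOL;

    // Sleep roughly 23 hours before the next run.
    sleep(23 * 60 * 60);
}
```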
Run the dashboard app under `/dashboard` by launching with `cf push --route-path /dashboard`; this interacts with `config["base_url"]`. It may be sufficient to unset `config["base_url"]` and let CodeIgniter determine URLs internally, and have Cloud Foundry rewrite for you. Or, you may want to emulate what's in the Ansible code with:
if (isset($_SERVER['REQUEST_URI']) && 0 === stripos($_SERVER['REQUEST_URI'], '/dashboard')){
$config['base_url'] .= '/dashboard';
$cookie_path_prefix = 'dashboard';
}
Write your SSP ;)
Proxy from https://labs.digital.gov/dashboard to https://(cloud.gov host)/dashboard
Tear down existing dashboard hosts
I'm updating the title of this so we can close it.
All remaining work identified here is now in other issues:
The dashboard app is demoable on cloud.gov at https://app-rested-genet.app.cloud.gov/offices/qa
https://github.com/GSA/project-open-data-dashboard/pull/188 is the initial PR for demonstrating docker-compose and use of CircleCI for smoke tests