This repository encompasses the Urban Institute College Affordability Static Site Generation Project.
This is a static site generator. Webpack handles the compilation and bundling of static assets for use in production. To get up to speed on this project it would be helpful to have a working understanding of how gulp and webpack work (this is the build/compiling technology), and how ReactJS works (this is the tool for building modular components, which is used extensively for the charts and pages). It is not crucial to understand them in depth or to be an expert, but it would be helpful to have a cursory understanding.
For each of the following steps, detailed instructions can be found by following the links.
develop
branch in order to view on http://collegeaffordability-stg.s3-website-us-east-1.amazonaws.comdevelop
branch into the live
branch. To be able to generate and serve the college affordability site
brew install node
sudo npm install npm -g
npm install
*Tested with node v6. Codeship is configured for node v6. If you are experiencing deployment issues, check your node version. To control the version of node in use, you can use a node version control manager like n
To set up the process to combine data and metadata into jsons for the above to use in charts
which Rscript
. install.packages(c("dplyr", "tidyr", "stringr", "Hmisc", "openxlsx", "readxl"))
You can reference gulpfile.js for the list of commands you can use locally, but to compile all your assets (and your new changes) run the following:
gulp
This compiles all of the information in pages
into the dist
folder.
In order to view the content of the site locally, cd into the dist
folder and then run:
http-server
Ideally, you would be able to incorporate a watch into your development workflow. If it worked, it would be utilized by running this command:
gulp watch
Unfortunately, this command is currently not working, due to its age and some out-of-date dependencies. I (Dan Wood) haven't had the chance to troubleshoot this, so gulp watch
is not working. But an enterprising developer could update/debug this command if so desired (perhaps Mark Sutton can fix this?).
Pro-tips
Refer to the EDITING.md in this repository for guidelines of how to edit and create charts and pages in the manual way. The site is made up of pages and components. As an editor, you will mostly edit files within the pages/
directory and point to elements within components to simplify certain content entry. See more there, if you aim to do things manually.
There are several processes for updating and editing different aspects of this project.
Editing text is as simple as finding the appropriate .jsx file within the pages/
directory and editing it to reflect the new text, using the HTML for things like paragraph, header, and italics tags. This should be done by the developer. Do not forget to update the text in the Research/Resources and Chart Index sections
Researchers are responsible for the editing and updating of the data. Finished data lives in the data_scripts/data/ folder. The researcher and developer may want to make this folder a shared box folder, if the researcher is unable to get push/pull access via github.
It is crucial that any csv files are kept in their original column format. This will prevent errors in the generation of charts from this data for more complicated forms of charts (stacked, grouped, toggle, etc). For understanding toggle chart organization, see this example file, for other complicated items, see working csv's to get a sense of the column format.
All 200+ charts in this website are created and updated using information from two sources: the csv files found in the data_scripts/data/ and each file's corresponding line in the GraphText excel file housed in data_scripts/scripts/ (note that the name of this file can change).
Once the data is updated and correctly formatted by the researcher, it can be referenced in the GraphText file. The researcher is responsible for making sure the following columns are filled out:
type, sources, titleIfSubtitleisMoved, title, notes, multiples, toggle, data_source, direct_labels, y_tick_format, subtitle, subtitle_for_multiples, x_label, y_label
A description of what each of these columns calls for is found in the CodeBook which outlines all column definitions and parameters. The developer is responsible for filling out all additional metadata columns, most of which modify the appearance of the chart (ticks, axes styles, padding, etc). The codebook will help you know which columns are required and which are optional. These modifications can be cumbersome and sometimes must be done iteratively. To learn more about that, please see [this workflow]().
By default, the names of these files are a generalized transformation of the title of the chart. For instance, a chart titled "Type of Employment over the Past Year among People with Only a High School Diploma or GED and Not Currently Attending School" will generate file names of "type_of_employment_over_the_past_year_among_people_with_only_a_high_school_diploma_or_ged_and_not_currently_attending_school.png". Please note that very long titles will get truncated after some number of characters. In this event, use the "imageOverride" flag in the GraphText file to rename the resulting savable files.
The data is uploaded and made downloadable automatically during the update process. There should not be any need for manual edits to these items.
NOTE: If any issues arise in the creation of the screenshots and data, there is a good chance it has to do with the naming of these files. If this happens, use the imageOverride flag to create a new, unique, and shorter file name
In section 8 (Student Profiles), there are several tables. These must be updated by hand in the .jsx file.
dist/
folderWhen a section of data are updated along with the corresponding GraphText rows (note: not all sections must be done to run this process), you will want to see how these changes appear on the site. The workflow is as follows:
data_scripts/data/*.csv data_scripts/scripts/GraphTextV2.xlsx
↘ ↙
run data_scripts/scripts/process_graphs.R
↓
(charts) pages/*/json/*.json pages/*/*.jsx (text)
↘ ↙
run gulp
↓
dist/ (production folder)
Whenever you've made changes to either the data or the GraphText file, you have to rerun the R script for the chart to be up to date in the pages/
folder. To have that change compiled in the browser, you must then run the gulp process to compile the pages/
components, data, etc. into static html pages in the dist/
folder.
Note: follow the directions above to get set up with this process.
The R script at data_scripts/scripts/process_graphs.R must be run in order to create the json files that contain the combined .csv info and the metadata (title, sources, styles, etc) found in GraphText. _Note that this script references its neighbor, data_scripts/scripts/makejsons_update.R, although you should not need to update this file._
When running process_graphs.R, whether in the terminal, RStudio, or in the bash file, you must make sure the following items are updated:
GRAPHTEXT_FILENAME <- "path_to_your_graphText_file.xlsx"
selectedSection <- 1
where 1 is some section number between 1 (What is College?) and 8 (Student Profiles). If you don't update these lines of code in the R script it will not update the correct information!!!
See "How do I get started" for more information, but basically, just type gulp
once your data/charts/text are all in pages/
When gulp has run, you will want to run a local server in the dist/
folder. I use http-server
but you may use whatever you want.
As in the previous step, ensure that your R script is referencing the correct GraphText.xlsx file and section. If you want to speed up the process outlined above, you can use a bash script that I have included in the home directory. The file, rebuild.sh
does the following:
1) Stops any local-server running on port 8080 (change to your port if desired)
2) Runs scripts/process_graphs.R
3) Runs gulp
4) Runs an http-server
instance on localhost:8080 from the dist/
folder
To run this file, go to the root directory and run
$ bash rebuild.sh
Optionally, you can run only the R file by using running this command in the root directory
$ bash runRscript.sh
Finally, if you want to stop your local server and re-run the rebuild.sh file, you may do so by pressing control-z
NOTE: This is deprecated in favor of our CI deployment pipelines (see below). If you want to deploy manually to the live s3 bucket, you will need to modify the gulp deploy
command in gulpfile.js.
Deploying is easy, however you do need to install the aws-cli tool to perform operations on the server including uploading and clearing out buckets. If this is your first time deploying, install aws-cli via the instructions here: https://aws.amazon.com/cli/
Once you have that all set, you need active credentials to hook up to a working S3 bucket. If that's set up (if not, contact your system administrator), simply instantiate the credentials within your system accounts' configuration (on a Mac, it's located in ~/.aws).
If all is good, you should be able to do deploys using the following command:
gulp deploy
The repository is hooked up to Urban's codeship account. It continuously deploys new changes whenever code changes are pushed to designated branches.
Currently, our deployment branches are as follows:
Since we are deploying to different servers with different access keys, we export the environment variables within each deployment pipeline. Here is an example of our deployment script.
export AWS_ACCESS_KEY_ID=[A Key ID]
export AWS_SECRET_ACCESS_KEY=[A Secret Access Key]
export AWS_DEFAULT_REGION=[A Default Region]
npm rebuild node-sass
gulp
aws s3 rm s3://collegeaffordability.urban.org/ --recursive;
aws s3 cp ./dist s3://collegeaffordability.urban.org --recursive
You can view and edit the Codeship Development Pipelines here
Reference: https://cwhite.me/continuous-delivery-for-your-static-site-with-codeship/
--When you are ready to deploy the site to production with your new S3 bucket set up, take note of the bucket name, aws_access_key_id, aws_secret_access_key, aws_default_region, and edit the gulpfile.js file to reflect the new configurations. Navigate to the task ">> gulp deploy-assets" (this quoted string should be a comment in the file), and update the command appropriately.--
Formerly: