New interactive content under development for the CER's pipeline profiles web page.
This project uses three primary technologies to create web-based interactive dashboards and dynamic text specific to 25 of the largest pipelines regulated by the CER. The content is developed in both English and French. Here is a summary of the major front end frameworks used:
Sections being added:
pipeline_profiles
│ README.md (you are here!)
│ server.js (express js server configuration for npm start)
| profileManager.js (controls which sections and profiles are displayed)
| environment.yml (cross platform conda python 3 environment used in ./src/data_management)
│ webpack.common.js (functionality for creating clean ../dist folder in english and french)
| webpack.dev.js (webpack dev server functionality)
| webpack.prod.js (npm run build for minimized production files)
| webpack.analyze.js (npm run analyze-build to evaluate bundle size)
| .babelrc (babel config with corejs 3 polyfills)
| .vscode/settings.json (please use vscode for this project!)
| ...
|
└───test
| | test.js (AVA unit tests for front end code, npm run test-frontend)
| | html5.js (runs html-validate on all .html files in /dist)
|
└───src
│ │
│ └───data_management
│ | │ conditions.py (creates conditions data for front end)
│ | │ incidents.py (creates incidents data for front end)
| | | traffic.py (creates throughput & capacity data for front end)
| | | tests.py (python unit tests npm run test-backend)
| | | util.py (shared python code module)
| | | updateAll.py (npm run update-all-data pull live data for all datasets)
| | | queries/ (contains queries used to get data from CER sql servers)
| | | raw_data/ (pre-prepared data used by python when not pulling from remote locations)
│ | │ ... other python files for pipeline datasets
| |
| └───components (handlebars partials)
| |
| └───css (main.css, transferred over to dist/css/main[contenthash].css via MiniCssExtract)
| |
| └───entry (entry points for all profile webpages)
| | | webpackEntry.js (specifies all the js and html entry points for /dist)
| |
| └───data_output (output data folders for each section. Contains prepared data ready for charts)
| |
| └───dashboards (Higher level files/functions for creating each dashboard)
| |
| └───modules (shared dashboard code & utility functions)
|
└───deploy (Prepares CER production files with new HTML sections in /dist)
│
└───dist (tries to match dweb7 folder structure)
│ en/ english js bundles & html for each profile (to be placed on web server)
│ fr/ french js bundles & html for each profile (to be placed on web server)
cd Documents
git clone https://github.com/mbradds/pipeline-profiles.git
First time install:
npm ci
or
npm install
git checkout -b profile_improvement
npm run dev
This runs webpack.dev.js
Comment out all styles in src/css/cer.css. These styles emulate some of the extra styles on CER pages, and don't need to be added.
npm run build
This runs webpack.prod.js and emits minified bundles in /dist
Note: npm run build && npm start
runs the express server using the production files. Test this on all major browsers prior to new releases.
Create a new release on GitHub and add the compressed dist folder. Ask the web team to dump the latest production files onto dweb7 and add the new dist files/changes before sending in a production web request.
There are two remote repositories.
I've added some convenient npm scripts for switching remotes:
npm run switch-remote-personal
npm run switch-remote-work
Unless you are running the code through an IDE, you will need to use the Anaconda Prompt to run the scripts; otherwise you will get the following error: 'conda' is not recognized as an internal or external command, operable program or batch file.
cd Documents
git clone https://github.com/mbradds/pipeline-profiles.git
npm install
Several datasets are pulled directly from CER internal databases. A single python file, src/data_management/connection.py, handles the sqlalchemy connection parameters and strings. An untracked json file, src/data_management/connection_strings.json, contains the hard coded database connection strings. A template file, src/data_management/connection_strings.example.py, is included with the connection strings left blank. Before running or contributing to the python code, you will need to open the template, add the connection strings, and save it as src/data_management/connection_strings.json to ensure that the connection info remains untracked.
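For illustration, here is a minimal sketch of how connection.py might read the untracked file. The key name "cer_db" and the exact layout of connection_strings.json are assumptions on my part; check them against the template file.

```python
import json


def load_connection_string(name, path="src/data_management/connection_strings.json"):
    """Return one named connection string from the untracked JSON file.

    Raises KeyError when the string is missing or was left blank, so a
    half-filled template fails loudly instead of producing a bad connection.
    """
    with open(path, encoding="utf-8") as f:
        strings = json.load(f)
    if not strings.get(name):
        raise KeyError(f"connection string '{name}' is missing or blank in {path}")
    return strings[name]
```

Failing loudly on blank strings makes it obvious when the template was copied but never filled in.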
It is highly recommended that you first create the conda python environment described in environment.yml. The npm scripts for data updates expect a conda python environment called pipeline-profiles. The easiest way to run the data update operation is through the Anaconda Prompt command line. Using the Anaconda Prompt, run the following operations:
cd Documents
cd pipeline-profiles
conda env create --file=environment.yml
npm run update-all-data
The last operation, npm run update-all-data, may take a few minutes to run. Once it completes, you will see all the output at once. If an error is encountered, the program will stop and display the error message. Feel free to try to fix the error, or ask me.
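The run-everything-then-stop-on-error behaviour described above can be sketched like this. This is a hypothetical simplification of updateAll.py, not the repo's actual code; the real script runs each dataset's update function.

```python
import traceback


def update_all(updaters):
    """Run each dataset update in order; stop and show the error on failure."""
    for update in updaters:
        try:
            update()
        except Exception:
            # Surface which dataset failed, then re-raise so the npm
            # script exits non-zero and the failure is not silently skipped.
            print(f"Error while running {update.__name__}:")
            traceback.print_exc()
            raise
    print("All datasets updated.")
```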
You can re-use the same Anaconda Prompt shell from the last step. Run the following command to compile the front end code plus the new json data. The code should compile; otherwise there is some kind of compatibility error between the data and the JavaScript code. This is usually the result of null values not being encoded properly. Feel free to fix the error in the python code and re-run step 3, or ask me to fix it.
npm run build
If the code compiles, then push the changes to main, and the test website will update after a few minutes.
git add .
git commit -m 'updated data'
git push
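One common cause of the null-encoding problem mentioned above is pandas NaN values: json.dumps serializes float NaN as the bare token NaN, which is not valid JSON, so the es6 import of the data file fails. Here is a hedged sketch of one possible sanitizing step; the helper name is mine, not from the repo.

```python
import math


def nan_to_null(records):
    """Replace float NaN values with None so json.dump emits valid null."""
    cleaned = []
    for row in records:
        cleaned.append({
            k: (None if isinstance(v, float) and math.isnan(v) else v)
            for k, v in row.items()
        })
    return cleaned
```

Running the prepared records through a step like this before writing the data_output files guarantees the front end receives parseable JSON.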
This continues to be a challenge because I control/update only a portion of the pipeline profiles, and there is no way for me to access the main production files or keep up with other changes through a version control system. Therefore my content and code need to be merged with CER files and updated on the website very quickly, to avoid a situation where others are working on the files. There is also no way I can easily mimic the CER server environment for local development. In the absence of an organizational version control system, it's not realistic to use or mimic much, if any, CER infrastructure/files during the development process.
Up until recently (summer 2021), my approach to these constraints was to manually copy the compiled dist/ folder into the correct location in the CER files on dweb7/data-analysis-dev. As of September 2021, I've added some automation in deploy/make_production_files.py that largely cuts out the need to delete & copy/paste html sections. Here are the new steps:
1. npm run build
2. npm run deploy
3. Copy the contents of deploy/web-ready into dweb7/data-analysis-dev (50 html files replaced).

Adding a new section typically involves two major parts: the back end data (python) and the front end (JavaScript). Starting with the raw data, here is the typical pattern:
raw data (sql or web) -> python -> json -> es6 import -> JavaScript/css -> handlebars template -> translation -> release
1. Create a new python file in src/data_management. Prepare a reliable connection to the dataset, either a remote datafile or internal sql. The profiles are segmented by pipeline, so the data prep will involve splitting the dataset by the pipeline/company column and creating one dataset for each company. Output files in json format to ../data_output/new_section/company_name.json.
Start to pay attention to the file size of the outputs. Try to keep the average dataset around 15-20 kb.
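The split-by-company pattern from step 1 can be sketched like this. The "Company" column name and the size report are assumptions for illustration, not the repo's actual code; the file-name stripping mirrors the NOVAGasTransmissionLtd.json style used in the import paths.

```python
import os


def export_by_company(df, out_dir="../data_output/new_section"):
    """Write one JSON file per company and report the size of each output."""
    os.makedirs(out_dir, exist_ok=True)
    sizes = {}
    for company, group in df.groupby("Company"):
        # Strip non-alphanumeric characters so the file name matches the
        # import path style used on the front end.
        file_name = "".join(c for c in company if c.isalnum()) + ".json"
        path = os.path.join(out_dir, file_name)
        group.to_json(path, orient="records")
        sizes[file_name] = os.path.getsize(path)  # keep an eye on the ~15-20 kb target
    return sizes
```

Returning the sizes makes it easy to print a quick report after each update and catch a dataset that has ballooned.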
2. Start with just one profile (ngtl). Create a new dashboard file, src/dashboards/newDashboard.js, and export a stub entry point:

```javascript
export function mainNewSection(data) {
  console.log(data);
}
```
3. Import the new data in src/entry/data/ngtl.js and add it to the exported data object. The data should (eventually) be made language agnostic.

```diff
import canadaMap from "../../data_output/conditions/base_maps/base_map.json";
import conditionsData from "../../data_output/conditions/NOVAGasTransmissionLtd.json";
import incidentData from "../../data_output/incidents/NOVAGasTransmissionLtd.json";
import trafficData from "../../data_output/traffic/NOVAGasTransmissionLtd.json";
import apportionData from "../../data_output/apportionment/NOVAGasTransmissionLtd.json";
import oandmData from "../../data_output/oandm/NOVAGasTransmissionLtd.json";
import remediationData from "../../data_output/remediation/NOVAGasTransmissionLtd.json";
+import newData from "../../data_output/newSection/NOVAGasTransmissionLtd.json";

export const data = {
  canadaMap,
  conditionsData,
  incidentData,
  trafficData,
  apportionData,
  oandmData,
  remediationData,
+  newData,
};
```
4. Add the new dashboard function to src/entry/loadDashboards_en.js:

```javascript
import { mainNewSection } from "../new_section/newSectionDashboard";

export async function loadAllCharts(data, plains = false) {
  const arrayOfCharts = [
    mainNewSection(data.newSectionData),
    otherCharts(data.other),
  ];
}
```
5. Start the project with npm run dev to open the webpack dev server on port 8000. Make sure that the data appears in the console, and you will be good to start developing the JavaScript.
6. Pretty soon after step 5 you will need to set up the html and css infrastructure. CSS can be added to src/css/main.css. There is only one css file for the entire project. I might split this css file soon, but for now just keep all the css for each section roughly together.
Conditional handlebars templates are used to control which sections get loaded for each profile. This is one of the most complicated parts of the repo, but it's powerful for a project like this. The logic in the remaining steps acts very similar to a content management system.
7. Create a new handlebars template here: src/components/new_section.hbs. For now, ignore the templates, and just write html with english text/paragraphs.
8. Add this new template to the profile manager, profileManager.js. It doesn't matter what you call the section, but remember the name for the handlebars conditional later. It seems obvious that this file should be automatically generated based on which profiles have data for a given section, but I would prefer to leave this step manual. It adds an extra layer of protection against sections getting rendered by mistake, and it's easy to update/maintain.

```javascript
const profileSections = {
  ngtl: {
    sections: {
      traffic: { map: true, noMap: false },
      apportion: false,
      safety: true,
      new_section: true, // when set to true, handlebars will inject the section html
    },
  },
};
```
9. Add the new section to the main template, src/components/profile.hbs. Once this is done, npm run build and npm run dev should load your html.

```handlebars
{{#if htmlWebpackPlugin.options.page.sections.new_section}}
<!-- Start New Section -->
{{> new_section text=htmlWebpackPlugin.options.page.text}}
<!-- End New Section -->
{{/if}}
```
10. Until the French text is ready, remove "fr" from the language array in webpack.common.js to avoid errors when running npm run build and npm run dev. Once you have added french to the data and code entry points (step 3 and step 4), the js+html can compile in both dist/en and dist/fr.

```diff
const profileWebpackConfig = (function () {
-  const language = ["en", "fr"];
+  const language = ["en"];
})();
```
11. Once you are done with the new section, add all the JavaScript strings to src/modules/langEnglish.js and src/modules/langFrench.js, and all the html text/paragraphs to src/components/htmlText.js. Follow the same logic for importing/templating found in other completed sections.
12. Write python unit tests (src/data_management/tests.py) and JavaScript unit tests (test/test.js).
13. Create a PR. I'll review all the code.
The greatest risk for errors, such as incorrect values appearing on the front end, comes from the "back end" python code. These python scripts compute large amounts of summary statistics, totals, and metadata (number of incidents, most common types, etc.) from datasets that have inherent errors. This is made riskier by the fact that there are english and french datasets (only for conditions), and these datasets may have unrelated problems. Here is a list of embedded data errors I have noticed so far:
- Stray whitespace in text values, handled by calling .strip() on important text based columns.
- Inconsistent company names between datasets, patched explicitly, for example: df['Company'] = df['Company'].replace({"Enbridge Pipelines Inc": "Enbridge Pipelines Inc."})
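Both fixes above can be combined into one reusable cleaning step. This is an illustrative sketch, not code from the repo; the NAME_FIXES mapping would grow as new inconsistencies are found.

```python
import pandas as pd

# Known company-name variants mapped to their canonical spelling.
NAME_FIXES = {"Enbridge Pipelines Inc": "Enbridge Pipelines Inc."}


def clean_text_columns(df, columns):
    """Strip whitespace and normalize known name variants in text columns."""
    for col in columns:
        df[col] = df[col].astype(str).str.strip().replace(NAME_FIXES)
    return df
```

Running every important text column through one function like this keeps the fixes consistent across all the dataset scripts.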
There are several python unit tests written for the various python data outputs and utility functions. These are found in src/data_management/tests.py.
The python unit tests can be run through an npm script:
npm run test-backend
This code is difficult to test, because it runs on data that updates every day or every quarter. To simplify this, I have added static test data separate from "production" data. The test data is located in src/data_management/raw_data/test_data. npm run test will test the python code on static data, where things like the correct totals, counts, and other numbers that appear later on the front end are known.
The unit tests check a bunch of summary statistics and data validation metrics specific to the ngtl profile. They also test whether the english numbers/data are the same in french.
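The static-data idea can be illustrated with a tiny, self-contained example. The records and totals here are made up; the real tests assert against known ngtl numbers from the frozen files in raw_data/test_data.

```python
def count_by_status(records):
    """Tally records by status, mirroring the summary metadata the front end shows."""
    counts = {}
    for row in records:
        counts[row["status"]] = counts.get(row["status"], 0) + 1
    return counts


# Frozen test data: because it never changes, the expected totals are known.
STATIC_TEST_DATA = [
    {"status": "Closed"},
    {"status": "Closed"},
    {"status": "Initially Submitted"},
]


def test_status_counts():
    assert count_by_status(STATIC_TEST_DATA) == {"Closed": 2, "Initially Submitted": 1}
```

Because the inputs are frozen, any change in these totals points to a bug in the prep code rather than a change in the data.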
Test coverage is pretty low right now, mainly focusing on major re-usable functionality in src/modules/util.js and major calculations done on the front end, like the five year average. I would like to move more general/pure functions into src/modules/util.js so that they can be tested more easily.
npm run test-frontend
Note: the html-webpack-plugin and handlebars-loader are instrumental for this project. Older versions of this repo only had two templates, one for english and one for french. As the project grew, I needed a template engine. A good example of this need is the apportionment section. There are only around 5 oil pipeline profiles with apportionment data (there could be more in the future though!), so I don't want to include the apportionment html in 20 profiles that don't need it and then hide/show divs conditionally after the dom is ready. This probably causes layout thrashing. With handlebars, I can conditionally render components/sections based on the logic in profileManager.js. Even better, with handlebars-loader, one html file is compiled for each profile (the web team can only handle html) and html-webpack-plugin still injects all the scripts.
This was the old way before handlebars: each pipeline profile webpage is essentially the same, but with different data. The two templates, src/profile_en.html and src/profile_fr.html, contained all the text and web resources (css, script tags), and the plugin injected the appropriate script tags for each profile. Changes made to these templates would appear on all 25 profile pages in english and french.
This is a long term project, and dependencies should be updated every so often; run npm outdated to check. Regular updates to important dev dependencies like webpack and babel will likely improve compile time and code size. Updates to production dependencies like highcharts and leaflet will improve security and allow the latest features to show up for users.
Making sure that all dependencies are updated and both package.json and package-lock.json are updated is kind of weird. Here are the steps to make it happen:
npm install -g npm-check-updates
ncu -u
npm install
Here is a list of things I'm stuck on and potentially need help with!
- It looks like a runtime chunk is required based on the webpack pattern I've set up. Each profile has a runtime chunk that serves as the main entrypoint for the other chunks. I would like to avoid this if possible!
- The core datasets are all pulled directly from Open Gov. I need to do this to maintain consistency with Open Gov, but connecting to CER databases would allow for really cool daily updates once the ci/cd pipeline is ready. This is going to take some time to migrate!
Take a look at the issues tab for a more up to date list. I don't update this section of the readme anymore.
- Compress the output data further, e.g. {numConditions: [In Progress (int), Closed (int)]} instead of {In Progress: int, Closed: int}. Update: a lot of this optimization has been done, but can be ramped up if needed.
- Number formatting is centralized in src/modules/langEnglish.js, but there are still instances of Highcharts.numberFormat scattered in the code.
- Set the dashboard height dynamically in src/modules/dashboard.js instead of hard coding the height. This will make it easier to add pills, and optimize this style of dashboard for mobile/smaller screens.
- Refactor src/modules/dashboard.js. This might be adding more complexity than it's worth.
- Split src/modules/dashboard.js into a folder with one file for each class.
- Move src/profile.hbs into src/components. All templates should be kept together for readability.