pwyf / data-quality-tester

Test IATI activity files against PWYF index methodology
http://dataqualitytester.publishwhatyoufund.org
MIT License

Add some technical docs on how the DQT works #45

Closed: michaelwood closed this issue 4 years ago

michaelwood commented 4 years ago

Co-author: Jared Parnell jared.parnell@opendataservices.coop

michaelwood commented 4 years ago

Thanks for the review.

> Line 29 Is there a way of editing out the repetition? "overview view" and multiple 'which'-es. Will everyone reading this know what JS XHR is?

Yes, they should know what that means.

> Line 64 Doesn't the DQT use the Index Indicator Definitions repository for the tests themselves too? eg. https://github.com/pwyf/2018-index-indicator-definitions/tree/master/tests

Yes, it is actually part of the d-q-t repository as a git submodule.

> Line 74 First mention of the cron job. Does this need more context?

I don't think so; it is an optional item and people should know what cron means.

All the other things you mentioned were fixed and are now updated in the pull request. Full set of changes below:

diff --git a/docs/technical notes.md b/docs/technical notes.md
index 57a0850..7443c82 100644
--- a/docs/technical notes.md   
+++ b/docs/technical notes.md   
@@ -14,19 +14,21 @@ Handled by: upload.html

 ## 2. Uploading the Data

-After HTTP POST upload a SuppliedData object instance is created which is a SQLAlchemy model. This object contains the data, location on disk and various metadata items as well as dealing with creating a unique id (uuid) for this test run and triggering a Celery task for downloading the data if it is hosted remotely. This object is then stored in the database. 
+After HTTP POST upload, a SuppliedData object instance is created, which is a SQLAlchemy model.
+
+This object contains the data, location on disk and various metadata items. It also creates a unique id (uuid) for this test run and triggers a Celery task for downloading the data if it is hosted remotely.

 Handled by: views/uploader.py , models.py, (tasks.py download_task)

 ## 3. Processing the Data

-The upload page redirects to the package overview page, which contains the unique id, the view function for this page starts a Celery task to run the tests on the values provided by looking up the unique id in the database.
+The upload page redirects to the package overview page, which contains the unique id. The view function for this page looks up the supplied values by that unique id in the database and starts a Celery task to run the tests on them.

 Handled by: views/quality.py, overview.html

 ## 4. Updating the status of tests 

-The quality overview view renders the overview template which contains JS XHR which poll the status API endpoints every 2 seconds. The url for the endpoints is stored in the data property of the HTML elements representing the tests.  For example data-status-url="/task/42f0bca1-84fc-46fd-9740-e2299aabdc75"
+The quality overview view module renders the overview template, which contains a JS XHR (ajax) request that polls the status API endpoints every 2 seconds. The URL for each endpoint is stored in a data attribute of the HTML element representing the test. For example data-status-url="/task/42f0bca1-84fc-46fd-9740-e2299aabdc75"

 This queries the Celery task of the id given and returns the status for all the tests in that group. 
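
An aside, not part of the diff: to make step 2 above more concrete, here is a minimal sketch of what a SuppliedData model along these lines could look like. The field names and defaults are illustrative assumptions, not the actual contents of models.py.

```python
# Illustrative sketch only; field names and defaults are assumptions,
# not the real models.py from this repository.
import uuid

from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()


class SuppliedData(db.Model):
    """One uploaded or linked dataset, plus metadata for a single test run."""

    id = db.Column(db.String(36), primary_key=True,
                   default=lambda: str(uuid.uuid4()))        # unique id for this run
    source_url = db.Column(db.String(2048), nullable=True)   # set when the data is fetched remotely
    original_file = db.Column(db.String(260), nullable=True)  # location on disk
    form_name = db.Column(db.String(20))                      # how the data was supplied (upload/link/paste)
    created = db.Column(db.DateTime, server_default=db.func.now())
```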
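
Similarly, a sketch of the kind of view the JS XHR in step 4 could be polling at /task/&lt;uuid&gt;. The payload shape and the celery_app import path are assumptions; only the route pattern and the polling idea come from the notes.

```python
# Sketch of the status endpoint polled every 2 seconds via
# data-status-url="/task/<uuid>". Payload shape and import paths are assumed.
from celery.result import AsyncResult
from flask import Flask, jsonify

from DataQualityTester import celery_app  # hypothetical import path

app = Flask(__name__)


@app.route('/task/<task_id>')
def task_status(task_id):
    result = AsyncResult(task_id, app=celery_app)
    # The real view returns the status of every test in the group; here we
    # just echo the overall Celery state to keep the sketch short.
    return jsonify({'state': result.state})
```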

@@ -59,11 +61,13 @@ Handled by: tasks.py, views/quality.py, overview.html

 ## 5. Test Results

-The original data file to be tested is stored to the media folder (as configured in config.py)  this folder will also contain the test results and summary JSON file for each test group. The media folder is named with the unique id of the test run (a uuid).
+The original data file to be tested is stored to the media folder (as configured in config.py). This folder will also contain the test results and summary JSON file for each test group. The media folder is named with the unique id of the test run (a uuid).

 The testing process is done with the [BDD_Tester tool](https://github.com/pwyf/bdd-tester) that uses the Index Indicator Definitions ([e.g. 2018](https://github.com/pwyf/2018-index-indicator-definitions)) repository for all cucumber/gherkin style feature files and step definitions. That repository is included in this one as a git submodule.

-The output from the testing is stored in the CSV files alongside the original data file, and the package summary page retrieves the record from the database by UUID and this model provides the file locations to the results CSVs. These files are used to generate the context for the package, project attributes, or test pages.
+The outputs from the tests are stored in CSV files alongside the original data file.
+
+The package summary page retrieves the record from the database via the UUID; this record provides the file locations of the results CSVs. These files are used to generate the render context for the package, project attributes and test templates/pages.

 Handled by: views/quality.py, overview.html
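
An aside, not part of the diff: a sketch of how a view might locate one run's results in the media folder, assuming the layout described above. The file names (summary.json, &lt;component&gt;.csv) are hypothetical.

```python
# Sketch only; folder layout and file names are assumptions based on the
# notes, not the actual views/quality.py.
import csv
import json
import os

MEDIA_FOLDER = 'media'  # as configured in config.py (name assumed)


def load_results(run_uuid, component='results'):
    """Read the summary JSON and per-test CSV rows for one test run."""
    run_dir = os.path.join(MEDIA_FOLDER, run_uuid)

    with open(os.path.join(run_dir, 'summary.json')) as f:   # hypothetical file name
        summary = json.load(f)

    rows = []
    csv_path = os.path.join(run_dir, '{}.csv'.format(component))  # hypothetical file name
    if os.path.exists(csv_path):
        with open(csv_path, newline='') as f:
            rows = list(csv.DictReader(f))

    # This dict would become part of the render context for the templates.
    return {'summary': summary, 'rows': rows}
```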

@@ -71,7 +75,7 @@ Handled by: views/quality.py, overview.html

 There is a Flask command ‘flask flush-data’ which will delete test result files from the media directory that are older than 7 days, or all of them when run with --all.

-This command can be run in the cron job.
+This command can be run in a cron job.

 Handled by: command.py
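
An aside, not part of the diff: a sketch of what a click-based ‘flush-data’ command could look like, assuming Flask's built-in CLI. The option name and the 7-day cut-off mirror the text above; everything else is illustrative rather than the repository's command.py.

```python
# Sketch only; not the repository's command.py.
import os
import shutil
import time

import click
from flask import Flask

app = Flask(__name__)
MEDIA_FOLDER = 'media'          # assumed location, see config.py
MAX_AGE = 7 * 24 * 60 * 60      # seven days, in seconds


@app.cli.command('flush-data')
@click.option('--all', 'flush_all', is_flag=True, help='Delete every test run.')
def flush_data(flush_all):
    """Delete old (or all) test result folders from the media directory."""
    now = time.time()
    for name in os.listdir(MEDIA_FOLDER):
        path = os.path.join(MEDIA_FOLDER, name)
        if flush_all or now - os.path.getmtime(path) > MAX_AGE:
            shutil.rmtree(path, ignore_errors=True)
```

Scheduled from cron this could then look something like `0 3 * * * cd /srv/dqt && flask flush-data` (paths illustrative).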

@@ -83,7 +87,7 @@ Options for running the Data Quality Tool

 [Installation instructions](https://github.com/pwyf/data-quality-tester/blob/develop/README.md)

-Redis can be installed from you system’s package manager. Redis server version 4.0.9 is known to work.
+Redis can be installed from your system’s package manager. Redis server version 4.0.9 is known to work.

 ## Developer install in Vagrant

@@ -102,7 +106,7 @@ Using SaltStack:
     secret_key: htohg1Za6doh[y6Enge!zeej8air6ew|oh^tuFigheex&ei^Y
 ```

-3. Append an entry to ./salt-config/roster called pwyf-dqt-[something] : e.g. 
+3. Append an entry to ./salt-config/roster called pwyf-dqt-[something] : e.g.

 ```yaml
 # Publish What You Fund (PWYF)