Aspects Learner Analytics combines several free, open source tools to add analytics and reporting capabilities to the Open edX platform. This plugin offers easy installation, configuration, and deployment of these tools using `Tutor <https://docs.tutor.overhang.io>`__. The tools Aspects uses are:

- `ClickHouse <https://clickhouse.com>`__, a fast, scalable analytics database that can be run anywhere
- `Apache Superset <https://superset.apache.org>`__, a data visualization platform and data API
- `OpenFUN Ralph <https://openfun.github.io/ralph/>`__, a Learning Record Store (and more) that can validate and store xAPI statements in ClickHouse
- `Vector <https://vector.dev/>`__, a log forwarding tool that can be used to forward tracking log and xAPI data to ClickHouse
- `event-routing-backends <https://event-routing-backends.readthedocs.io/en/latest/>`__, an Open edX plugin that transforms tracking logs into xAPI and optionally forwards them to one or more Learning Record Stores in near real time
- `event-sink-clickhouse <https://github.com/openedx/openedx-event-sink-clickhouse>`__, an Open edX plugin that exports course structure and high level data to ClickHouse at publish time
- `dbt <https://www.getdbt.com/>`__, a tool to build data pipelines from SQL queries. The dbt project used by this plugin is `aspects-dbt <https://github.com/openedx/aspects-dbt>`__.

See https://github.com/openedx/openedx-aspects for more details about the Aspects architecture and high level documentation.

Aspects is a community developed effort combining the Cairn project by Overhang.io and the OARS project by EduNEXT, OpenCraft, and Axim Collaborative.
Note: Aspects is beta and not yet production ready! Please feel free to experiment with the system and offer feedback about what you'd like to see by adding Issues in this repository. Current details on the beta progress can be found here: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3861512203/Aspects+Beta
This plugin is compatible with Tutor 15.0.0 and later and is expected to be compatible with Open edX releases from Nutmeg forward.
Aspects is implemented as a Tutor plugin. Documentation will be coming soon to cover how to install Aspects in non-Tutor environments, but by far the easiest way to try and install it is via Tutor. These instructions assume you are running a ``tutor local`` install, which is the fastest and easiest way to get started.

.. code-block:: bash

    pip install tutor-contrib-aspects
    tutor plugins enable aspects
    tutor config save
    tutor images build openedx --no-cache
    tutor images build aspects-superset
    tutor local do init

At this point you should have a working Tutor / Aspects environment, but with no way to create data! There are a few options for how to proceed.

- Load generated xAPI test data (run the command with ``--help`` for usage)::

    tutor local do load-xapi-test-data

- Export course data to ClickHouse::

    tutor local do dump-data-to-clickhouse --options "--object course_overviews"

- Transform existing tracking logs into xAPI statements::

    tutor [dev|local] do transform-tracking-logs \
      --source_provider LOCAL \
      --source_config '{"key": "/openedx/data", "container": "logs", "prefix": "tracking.log"}' \
      --transformer_type xapi

    # Note that this works only for a default Tutor installation. If you store your tracking logs any other way, you need to change the source_config option accordingly.
    # See https://event-routing-backends.readthedocs.io/en/latest/howto/how_to_bulk_transform.html#sources-and-destinations for details on how to change the source_config option.

To rebuild the Superset image and import the assets::

    tutor images build aspects-superset --no-cache
    tutor local do import-assets

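
For the tracking log option, the ``--source_config`` value is a JSON object that has to survive shell quoting. The following is a minimal Python sketch, not part of Aspects, that assembles the same command line for a different log location. The function name ``transform_tracking_logs_command`` and its parameters are hypothetical; only the flags and the JSON keys (``key``, ``container``, ``prefix``) are taken from the command shown in this README.

.. code-block:: python

    import json
    import shlex


    def transform_tracking_logs_command(key, container, prefix, mode="local"):
        """Build the transform-tracking-logs command line as a single string.

        Hypothetical helper: it only formats the command, it does not run it.
        """
        # Serialize the source settings to JSON, then quote the result so the
        # shell passes it to Tutor as one argument.
        source_config = json.dumps(
            {"key": key, "container": container, "prefix": prefix}
        )
        return " ".join(
            [
                "tutor", mode, "do", "transform-tracking-logs",
                "--source_provider", "LOCAL",
                "--source_config", shlex.quote(source_config),
                "--transformer_type", "xapi",
            ]
        )

Calling ``transform_tracking_logs_command("/openedx/data", "logs", "tracking.log")`` reproduces the default command. Which values are valid for other storage layouts depends on event-routing-backends, so check its documentation before changing them.
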
You should now have data to look at in Superset! Log in to https://superset.local.overhang.io/ with your admin account and you should see charts with your data.
Aspects maintains the Superset assets in this repository, specifically the dashboards, charts, datasets, and databases. That means that any updates made here will be reflected on your Superset instance when you update your deployment.
But it also means that any local changes you make to these assets will be overwritten when you update your deployment. To prevent your local changes from being overwritten, please create new assets and make your changes there instead. You can copy an existing asset by editing the asset in Superset and selecting "Save As" to save it to a new name.

The image is rebuilt with ``tutor images build aspects-superset --no-cache``.

Assets (charts/datasets) created for Aspects that are no longer used can be listed in ``aspects_asset_list.yaml``. These assets, and any translated assets created from them, are deleted from Superset during ``init`` (specifically ``import-assets``). The corresponding YAML files are deleted during ``import_superset_zip`` and ``check_superset_assets``.

Sharing Charts and Dashboards ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To share your charts with others in the community, use Superset's "Export" button to save a zip file of your charts and related datasets.

.. warning:: The exported datasets will contain hard-coded references to your particular databases, including your database hostname, port, and username. In some cases they may also contain database passwords. It is vital that you review the database and dataset files before sharing them.

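
As an aid to that review, here is a minimal Python sketch that lists every ``sqlalchemy_uri`` found in an exported zip file and flags the ones that carry a password component. It is not part of Aspects or Superset: the function name ``find_connection_strings`` and the regular expressions are assumptions made for illustration, and the script does not replace reading the files yourself.

.. code-block:: python

    import re
    import zipfile

    # Matches "sqlalchemy_uri: <value>" lines in exported YAML files.
    URI_LINE = re.compile(r"^\s*sqlalchemy_uri:\s*(?P<uri>\S+)", re.MULTILINE)
    # Matches "://user:password@" inside a connection string.
    CREDENTIALS = re.compile(r"://[^:/@\s]+:(?P<password>[^@\s]+)@")


    def find_connection_strings(zip_file):
        """Return (member name, URI, has_password) for each sqlalchemy_uri found.

        Hypothetical helper; zip_file is a path or a binary file-like object.
        """
        findings = []
        with zipfile.ZipFile(zip_file) as archive:
            for name in archive.namelist():
                if not name.endswith((".yaml", ".yml")):
                    continue
                text = archive.read(name).decode("utf-8")
                for match in URI_LINE.finditer(text):
                    uri = match.group("uri")
                    has_password = bool(CREDENTIALS.search(uri))
                    findings.append((name, uri, has_password))
        return findings

Any entry reported with ``has_password`` set to ``True`` deserves a manual look before the zip file is shared, and hostnames or usernames in the other entries may still be sensitive.
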
To import charts or dashboards shared by someone in the community:
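
#. Expand the ``.zip`` file and look for any files under ``databases``. Update the ``sqlalchemy_uri`` to match your database's connection details.
#. Compress the files back into a ``.zip`` file.
#. Use Superset's "Import" button to upload your ``.zip`` file.

Contributing Charts and Dashboards to Aspects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
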
The Superset assets provided by Aspects can be found in the templated ``tutoraspects/templates/aspects/build/aspects-superset/openedx-assets/assets/`` directory. For the most part, these files are what Superset exports, but with some crucial differences which make these assets usable across all Tutor deployments.

To contribute assets to Aspects:

#. Have a locally running Tutor environment with this plugin installed.
#. Export the assets you want to contribute as described in "Sharing Charts and Dashboards" above.
#. Run ``tutor aspects import_superset_zip ~/Downloads/your_file.zip``. The command will attempt to warn you if there are hard coded connection settings where it expects template variables. These are usually in database and dataset assets, and those are often assets that already exist. The warnings look like::

       WARN: fact_enrollments.yaml has schema set to reporting instead of a setting.

#. Update any flagged connection settings to use Tutor configuration template variables instead of hard-coded strings, e.g. replace ``clickhouse`` with ``{{CLICKHOUSE_HOST}}``. Passwords can be left as ``{{CLICKHOUSE_PASSWORD}}``, though be aware that if you are adding new databases, you'll need to update ``SUPERSET_DB_PASSWORDS`` in the init scripts. Here is the default connection string for reference::

       clickhousedb+connect://{{CLICKHOUSE_REPORT_URL}}

#. Check for SQL templates that have been expanded into their actual SQL. If you haven't changed the SQL of these queries (stored in ``tutoraspects/templates/openedx-assets/queries``) you can just revert that change back to their ``include`` values such as::

       sql: "{% include 'openedx-assets/queries/fact_enrollments_by_day.sql' %}"

#. Check for a missing ``_roles`` key in dashboards. Superset does not export these, so you will need to manually add this key with the roles that are necessary to view the dashboard. See the existing dashboards for how this is done.
#. Rebuild the ``aspects-superset`` image with ``tutor images build aspects-superset --no-cache``.
#. Run ``tutor aspects check_superset_assets`` to confirm there are no duplicate assets, which can happen when you rename an asset, and will cause import to fail. The command will automatically delete the older file if it finds a duplicate.
#. Check that everything imports correctly by running ``tutor local do import-assets`` and confirming there are no errors.
#. Open a pull request with your new assets and an explanation of what data question they answer.

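
To make the hard-coded settings check more concrete, here is a minimal Python sketch of the idea. It is not the implementation behind ``tutor aspects import_superset_zip``: the function name ``find_hardcoded_values``, the list of keys it inspects, and the placeholder ``{{SOME_SETTING}}`` used in the usage note are all assumptions made for illustration. It only flags values that contain no ``{{`` template marker.

.. code-block:: python

    import re

    # Keys assumed, for illustration only, to hold deployment-specific values.
    TEMPLATED_KEYS = ("schema", "sqlalchemy_uri")
    # Matches simple "key: value" lines in an asset YAML file.
    KEY_LINE = re.compile(r"^\s*(?P<key>[A-Za-z_]+):\s*(?P<value>.+?)\s*$")


    def find_hardcoded_values(filename, yaml_text, keys=TEMPLATED_KEYS):
        """Return a warning for each inspected key whose value is not templated."""
        warnings = []
        for line in yaml_text.splitlines():
            match = KEY_LINE.match(line)
            if not match or match.group("key") not in keys:
                continue
            # Drop surrounding quotes so quoted and unquoted values compare alike.
            value = match.group("value").strip("'\"")
            if "{{" not in value:
                warnings.append(
                    f"WARN: {filename} has {match.group('key')} "
                    f"set to {value} instead of a setting."
                )
        return warnings

Passing the text ``schema: reporting`` for ``fact_enrollments.yaml`` yields a warning in the same shape as the one quoted in the steps above, while ``schema: '{{SOME_SETTING}}'`` yields none.
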
Virtual datasets in Superset ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Superset supports creating virtual datasets, which are datasets defined using a SQL query instead of mapping directly to an underlying database object. Aspects leverages virtual datasets, along with `SQL templating <https://superset.apache.org/docs/installation/sql-templating/>`_, to make better use of table indexes.

To make it easier for developers to manage virtual datasets, there is an extra step that can be done on the output of ``tutor aspects serialize``. The ``sql`` section of the dataset yaml can be moved to its own file in the `queries`_ directory and included in the yaml like so:

.. code-block:: yaml

    sql: "{% include 'openedx-assets/queries/query.sql' %}"

However, please keep in mind that the assets declaration is itself a Jinja template. That means that any Jinja used in the dataset definition should be escaped. There are examples of how to handle this in the existing queries, such as `dim_courses.sql`_.

.. _queries: tutoraspects/templates/openedx-assets/queries/
.. _dim_courses.sql: tutoraspects/templates/openedx-assets/queries/dim_courses.sql
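
The extra step described in this section, moving a dataset's SQL into its own file and replacing it with an ``include``, can be sketched in Python as follows. This is a hypothetical helper rather than anything shipped with Aspects: the name ``extract_sql_to_query_file`` and its parameters are assumptions, and it does not escape Jinja inside the SQL, which still has to be done by hand.

.. code-block:: python

    from pathlib import Path


    def extract_sql_to_query_file(
        sql, query_name, queries_dir, include_prefix="openedx-assets/queries"
    ):
        """Write sql to <queries_dir>/<query_name>.sql and return the YAML line.

        Hypothetical helper; the returned line replaces the dataset's sql section.
        """
        queries_dir = Path(queries_dir)
        queries_dir.mkdir(parents=True, exist_ok=True)
        (queries_dir / f"{query_name}.sql").write_text(sql, encoding="utf-8")
        # Doubled braces in the f-string produce literal "{%" and "%}".
        return f"sql: \"{{% include '{include_prefix}/{query_name}.sql' %}}\""

With ``query_name="query"`` the returned line matches the ``include`` example shown in this section.
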
Releasing tutor-contrib-aspects ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Changelog, package version, PyPI release, and image building are all handled via manually triggered GitHub Actions.

To trigger a build you must have access to manually trigger the "Bump version and changelog" action. This will update the version and changelog in a new PR. If the PR looks good, you can approve and merge it. Merging this PR will start the workflows that publish the release to PyPI and build the images.

When the workflows are finished you should confirm that you see the new version on PyPI and images in DockerHub.