This repository holds a set of tools and utilities for processing and cleaning Children's Services' data.
Most of the utilities are centred around three core datasets:
The LIIA (London Innovation and Improvement Alliance) project brings together Children’s Services data from all the Local Authorities (LAs) in London with the aim of providing analytical insights that are uniquely possible using pan-London datasets.
Please see LIIA Child Level Data Project for more information about the project, its aims and partners.
The package is designed to process data deposited onto the data platform by local authorities such that it can be used for analysis purposes.
This is a Dagster library which is set up to be used as a code server.
Install the dependencies:
poetry install
Copy .env.sample to .env and fill in the variables there as needed.
Start the Dagster development server for the repository you want to run:
poetry run dagster dev -f .\liiatools_pipeline\repository_la.py
poetry run dagster dev -f .\liiatools_pipeline\repository_org.py
Install the pre-commit hooks:
pre-commit install
This will ensure your code is formatted before you commit something.
In production, the library will be brought into a Docker container with the configuration specified in the file Dockerfile_user_code. Which code servers are used can be specified in the installation.
See The SFDATA Platform's Workspace definition for details.
The idea is each code server will have its own setup which will be a copy of what's here.
Note: Multiple libraries, pipelines, etc. can exist in a single code server. Different servers should be used if they have conflicting requirements (e.g. different Python versions).
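For the local setup step that copies .env.sample to .env, the sketch below shows one way to script the copy and then list which variables are still blank. It is not part of this repository: the helper names are hypothetical, and it assumes the sample file uses plain KEY=VALUE lines with optional # comments.

```python
import shutil
from pathlib import Path


def ensure_env_file(project_dir: Path) -> Path:
    """Copy .env.sample to .env, leaving an existing .env untouched."""
    sample = project_dir / ".env.sample"
    target = project_dir / ".env"
    if not target.exists():
        shutil.copyfile(sample, target)
    return target


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unset_variables(path: Path) -> list:
    """Return the names of variables whose value is still empty."""
    return sorted(name for name, value in read_env_file(path).items() if not value)
```

Calling `unset_variables(ensure_env_file(Path(".")))` from the repository root would create .env if it is missing and return the names that still need filling in. Because an existing .env is never overwritten, the helper is safe to run repeatedly.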
Take a look at the documentation to understand what this code is designed to do and how to replicate it for your own dataset transformations. We recommend reading text first, followed by text.