co-cddo / open-standards

Collaboration space for discussing and exploring technical and data standards

A standard for regulatory guidance and legislative documents #79

Open kevinxufs opened 3 years ago

kevinxufs commented 3 years ago

A standard for regulatory guidance and legislative documents

Category

Challenge Owner

Kevin Xu Technical Architect | Better Regulation Executive | Department of Business, Energy and Industrial Strategy (BEIS)

Michael Gribben Senior Policy Advisor | Better Regulation Executive | Department of Business, Energy and Industrial Strategy (BEIS)

Stephen Hodgson Deputy Director and Senior Responsible Owner | Better Regulation Executive | Department of Business, Energy and Industrial Strategy (BEIS)

Short Description

The Better Regulation Executive in BEIS currently runs a project called the Open Regulation Platform (ORP), which aims to make regulatory guidance (from different regulators) and legislation documents (from Legislation.gov.uk) machine readable, and then to release this data to the public via an open API. The intention is that software developers in Government or the private sector will then use the ORP data to build RegTech tools and services. We are about to enter the Alpha phase, where we will be looking to build prototypes of the service.

While legislation is very easy to process (legislation.gov.uk uses an XML format), a significant challenge at present is that regulatory guidance comes in different formats, for example Word documents, PDFs and tables. As part of the project, we will be ingesting different kinds of documents with the intention of transforming them into a single, machine readable format.

We are looking for help in defining the transformed format for regulatory guidance and legislative documents.

Our Discovery research has suggested the best way to do this is:

ORP can define an input format, 'ORPML', and enhance the ingestion API so that regulatory publishers would need to provide content in ORPML to benefit from enrichment.

ORPML is envisaged as a thin, high-level metadata wrapper over content provided in Akoma Ntoso (AN) for primary/secondary legislation, or in document markup for guidance notes. For primary legislation, the AN feed from The National Archives could be accepted and reformatted to be wrapped by ORPML.

Link to Akoma Ntoso: http://www.akomantoso.org/
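To make the "thin wrapper" idea concrete, here is a minimal sketch in Python. ORPML has not been defined yet, so every ORPML element name used here (orpml, metadata, title, publisher, documentType, content) and the function wrap_in_orpml are hypothetical placeholders rather than a proposed schema. Only the Akoma Ntoso 3.0 namespace and its akomaNtoso/act element names come from the published OASIS standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the OASIS Akoma Ntoso 3.0 schema.
AKN_NS = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"


def wrap_in_orpml(content: ET.Element, *, title: str, publisher: str,
                  doc_type: str) -> ET.Element:
    """Wrap existing content in a thin metadata envelope.

    All element names created here are hypothetical: ORPML is not yet specified.
    """
    root = ET.Element("orpml")
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "publisher").text = publisher
    ET.SubElement(meta, "documentType").text = doc_type
    # The wrapped document is carried through unchanged.
    body = ET.SubElement(root, "content")
    body.append(content)
    return root


# A stub Akoma Ntoso document standing in for a real AN feed item.
akn = ET.Element(f"{{{AKN_NS}}}akomaNtoso")
ET.SubElement(akn, f"{{{AKN_NS}}}act")

wrapped = wrap_in_orpml(
    akn,
    title="Example Act",
    publisher="Example publisher",
    doc_type="primary-legislation",
)
print(ET.tostring(wrapped, encoding="unicode"))
```

The point of the sketch is that the legislative content is not rewritten: the wrapper only adds metadata around it, so guidance notes in another markup could in principle be carried in the same envelope.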

Link to Legislation.gov.uk API: https://www.legislation.gov.uk/developer
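As a rough illustration of how a developer might pull legislation from this source, the sketch below builds a document URL and fetches it. It assumes the convention of appending /data.xml (native XML) or /data.akn (Akoma Ntoso) to a legislation.gov.uk document path; please check the developer pages linked above before relying on it. The function names and the example identifier (ukpga, 2010, 15) are illustrative choices, not part of the ORP proposal.

```python
from urllib.request import urlopen

BASE_URL = "https://www.legislation.gov.uk"


def legislation_data_url(doc_type: str, year: int, number: int,
                         fmt: str = "xml") -> str:
    """Build the URL for a machine readable rendition of one document.

    Assumes the '/data.<fmt>' suffix convention described in the lead-in.
    """
    if fmt not in ("xml", "akn"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}/{doc_type}/{year}/{number}/data.{fmt}"


def fetch_legislation(doc_type: str, year: int, number: int,
                      fmt: str = "xml") -> bytes:
    """Download the document bytes (needs network access; not called below)."""
    with urlopen(legislation_data_url(doc_type, year, number, fmt)) as response:
        return response.read()


# Example: a UK Public General Act (ukpga) from 2010, chapter 15.
url = legislation_data_url("ukpga", 2010, 15)
print(url)
```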

User Needs

The ORP is targeted primarily at third party software developers who can use the data provided to create bespoke services to users dealing with regulatory information. In order to provide this data to developers, we need to process the various documents into a standardised format. We should then be able to offer developers versions of our documents in relevant machine readable formats (e.g. JSON, XML).

Our Discovery research identified two primary developer personas we should focus on: Data Scrapers and RegTech Software Developers.

Data Scrapers collect, standardise and enrich libraries of regulatory data. They are looking for easier access to high-quality regulatory data enriched with useful authoritative metadata.

RegTech Software Developers create technology products to solve compliance needs using regulatory information. They need metadata and regulatory data further enriched and tagged at the paragraph level.
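To illustrate what "tagged at the paragraph level" and "relevant machine readable formats" could mean in practice, here is a minimal sketch. The record shape (GuidanceDocument, Paragraph), the field names and the example tags are all assumptions made for illustration; the real ORP data model is exactly what this challenge is asking for help to define.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass, field


@dataclass
class Paragraph:
    id: str
    text: str
    tags: list[str] = field(default_factory=list)  # hypothetical enrichment tags


@dataclass
class GuidanceDocument:
    title: str
    regulator: str
    paragraphs: list[Paragraph] = field(default_factory=list)


def to_json(doc: GuidanceDocument) -> str:
    """Serialise the whole record, including nested paragraphs, as JSON."""
    return json.dumps(asdict(doc), indent=2)


def to_xml(doc: GuidanceDocument) -> str:
    """Serialise the same record as XML, one element per paragraph."""
    root = ET.Element("document", {"title": doc.title, "regulator": doc.regulator})
    for p in doc.paragraphs:
        el = ET.SubElement(root, "paragraph", {"id": p.id, "tags": " ".join(p.tags)})
        el.text = p.text
    return ET.tostring(root, encoding="unicode")


doc = GuidanceDocument(
    title="Example guidance note",
    regulator="Example regulator",
    paragraphs=[
        Paragraph("p1", "Firms must keep records for six years.",
                  ["record-keeping", "obligation"]),
    ],
)
print(to_json(doc))
print(to_xml(doc))
```

Keeping one internal record and deriving JSON and XML from it means both personas read the same data: Data Scrapers can take the document-level metadata, while RegTech developers can filter on the paragraph tags.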

Other important users were:

Expected Benefits

Alongside our Discovery, we did an economic analysis to calculate expected benefits.

Total benefits could be around £359m ±£181m over 10 years

This assumes a take-up of 5-10% of businesses using a tool developed from the ORP, and that these tools save them 20% of the time they would usually spend finding information about regulations.
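For readers who want the headline figure as a range, the short calculation below spells it out. It uses only the numbers quoted above; the per-year figure additionally assumes the benefit is spread evenly across the 10 years, which the economic analysis does not state.

```python
central_m = 359      # central estimate, in £ millions
uncertainty_m = 181  # quoted plus/minus margin, in £ millions
years = 10

low_m = central_m - uncertainty_m
high_m = central_m + uncertainty_m

print(f"Range: £{low_m}m to £{high_m}m over {years} years")
# Assumption for illustration only: benefits accrue evenly each year.
print(f"Central estimate per year: about £{central_m / years:.1f}m")
```

This gives a range of £178m to £540m over the period, so the upper bound is roughly three times the lower bound.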

Functional Needs

samsmith commented 3 years ago

Given the expandability of Akoma Ntoso, is there any detail on why it hasn't been (and can't be) extended to cover guidance too?

MatthewWaddington commented 2 years ago

This looks like a very interesting initiative. My particular interest is in "Rules as Code", but also in AKN/etc (the difference as I see it being that AKN/etc capture the identity of a provision, whereas RaC tries to capture its logical structure to help with its meaning). I would echo samsmith's question - is there something about AKN that stops it being used for the regulator codes-of-practice & guidance? Have you talked to Angus Moir (https://www.linkedin.com/in/angus-moir-37b2aa43/) from Bank of England about their RaC work on the rules they produce, which have to knit in with UK (& EU) legislation?

fitsilisf commented 2 years ago

Certainly an interesting initiative, but on the policy side it seems to duplicate some existing efforts from the European Commission (see https://joinup.ec.europa.eu/collection/better-legislation-smoother-implementation) and the OECD (see https://oecd-opsi.org/projects/rulesascode/). On the implementation side, there are already some available standards, ontologies and approaches (e.g. AKN, ELI and rules as code, respectively), as well as several platforms and tools, too many to mention. We at the Hellenic OCR Team are actively supporting the above initiatives, while gradually developing our own integrated application suite based on state-of-the-art legal informatics concepts to support parliamentary and governance institutions. Check out GitHub for more information.

MattiSG commented 2 years ago

Interesting! Thanks for sharing this reflection in the open 🙂

It is not entirely clear to me reading through this description whether the aim is to end up with a set of accessible, metadata-augmented regulatory documents, or with machine-readable legislation.

In the first case, Cicero from the Accord Project provided us at @AmbaNum with a good abstraction over document formats, independently of its templating capability. In terms of examples, you might have a look at the French LEGI database, along with its sisters from other legislative sources: KALI, ACCO, JORF, CAPP, CASS, INCA, JADE, CNIL, CONSTIT. I don't know what the level of availability of legislative documents is in the UK, but a critical first step is to unify the URIs so that a document can be referenced unambiguously. ELI is a simple, well documented, and production-ready standard for that, already deployed in several countries (FR, LU, NL…) and not specific to the EU, no matter its name 😉 Potential outcomes for such a project usually include AI training datasets to assist jurists and policymakers. In such a case, these datasets would benefit from in-depth (technical and legal) defence against potentially detrimental automated decision-making. You might also be interested in the Ethics Charter for an Online Legal Market published under the umbrella of @OpenLawFR.

In the second case, you are aiming for something that sounds more like Rules as Code, as mentioned by @MatthewWaddington and @fitsilisf. If that is the expected outcome, it seems unlikely that a fully automated, normalised, publishing-based system would yield usable deliverables. You might be interested in the work from @PSLmodels, who started using @OpenFisca to model the UK tax and benefits system publicly at https://github.com/PSLmodels/openfisca-uk/. In that case, I would be happy to discuss how OpenFisca could help you in that endeavour 🙂

Best of luck towards that important step!

DidacFB-CDDO commented 2 months ago

A refreshed version of the challenge has been opened (February 2024). Please see the latest challenge under the "issues" section.