This solution starter was created by technologists from IBM.
We currently live in an environment where a growing number of instances show law enforcement agencies to be racially biased and to be conducting unlawful practices and policies (e.g., "dirty policing"). These practices and policies shape how data is created, which increases the amount of inaccurate, skewed, or systemically biased information. Without transparent and accurate data for assessing police behavioral infractions, police reports can be falsified and covered up.
From traffic stops and arrests to sentencing and parole decisions, technology can support better analysis of real-world data, provide insights, and make recommendations that drive racial equality and reform across criminal justice and public safety. Policing agencies should consider piloting technology to improve operational efficiencies and outcomes. Given that current times have led to fewer resources and increased public scrutiny of law enforcement activity, technology is a solution that could address these issues. Internal affairs and civilians (such as witnesses) can both contribute to incident reports, creating a record with all accounts of the incident.
The combination of IBM Watson APIs and Hyperledger blockchain technology gives users consistency of information and a secure, immutable, permissioned ledger of documents. As more agencies try to show transparency and ensure that their officers are all operating above board, AI and blockchain can help to facilitate those objectives.
The lack of faith in our policing system has citizens looking for justice and needing to regain faith in those who are paid to protect us.
With the Incident Accuracy Reporting System, law enforcement can once again build that faith by creating a platform where everyone involved in or witness to an incident can provide relevant documentation about it.
Specifically, the platform is a content management application for capturing statements, videos, and audio feeds from first-hand individuals relating to police incident reports. It provides a(n):
Note: This project is currently scoped for the United States. We fully expect that international parties will eventually contribute to extend the scope to support law enforcement agencies globally.
If a 911 call is made, the event's address, date, and time are logged as an incident.
An incident may or may not be given a case number.
Incidents reported via the app can be linked by the provided case number (if known), or by correlating the submitted metadata (location, date, and time) with a logged incident from the police department.
If an event has not yet been logged as an incident by the police department (e.g., it is occurring live at a protest), reports submitted through the app can be tagged as pending until a matching police incident is filed and can be linked.
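The linking rules above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Incident` and `AppReport` classes, the `normalize` helper, the exact-address comparison, and the 30-minute matching window are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """An incident logged by the police department (for example from a 911 call)."""
    address: str
    occurred_at: datetime
    case_number: Optional[str] = None  # an incident may not have a case number


@dataclass
class AppReport:
    """A report submitted through the app by a witness or victim."""
    address: str
    occurred_at: datetime
    case_number: Optional[str] = None
    status: str = "pending"
    linked: Optional[Incident] = None


def normalize(address: str) -> str:
    # Hypothetical normalization: case-insensitive, whitespace-collapsed comparison.
    return " ".join(address.lower().split())


def link_report(report, incidents, window=timedelta(minutes=30)):
    """Link by case number if known, otherwise by address and time proximity."""
    if report.case_number:
        for incident in incidents:
            if incident.case_number == report.case_number:
                report.status, report.linked = "linked", incident
                return report
    candidates = [
        i for i in incidents
        if normalize(i.address) == normalize(report.address)
        and abs(i.occurred_at - report.occurred_at) <= window
    ]
    if candidates:
        # Pick the logged incident closest in time to the report.
        report.linked = min(candidates, key=lambda i: abs(i.occurred_at - report.occurred_at))
        report.status = "linked"
    else:
        report.status = "pending"  # retry once the police incident is filed
    return report


logged = [Incident("12 Main St", datetime(2020, 10, 13, 17, 30), "CASE-001")]
by_case = link_report(AppReport("unknown", datetime(2020, 10, 14, 9, 0), "CASE-001"), logged)
by_meta = link_report(AppReport("12  main st", datetime(2020, 10, 13, 17, 45)), logged)
unmatched = link_report(AppReport("99 Oak Ave", datetime(2020, 10, 13, 17, 45)), logged)
print(by_case.status, by_meta.status, unmatched.status)
```

A real deployment would likely need fuzzier location matching (for example geocoded coordinates) than the exact address comparison assumed here.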
Content management application backed by artificial intelligence and a distributed ledger that holds immutable data in a secure, encrypted way, ensuring that documents can never be altered.
Watson Speech To Text: This is an IBM Cloud service that converts human voice into written text. This application uses it to create written transcripts from audio recordings loaded by witnesses and victims.
Watson Language Translator: Translates text from one language to another
IBM Watson Studio: Build, run and manage AI models.
IBM Cloud Object Storage: (lite tier) stores encrypted and dispersed data. Documents (incident reports, videos, audio recordings) are saved on IBM Cloud Object Storage.
Vue.js: An open-source model-view-viewmodel front end JavaScript framework for building user interfaces and single-page applications
Docker: Docker is a set of platform as a service products that use OS-level virtualization to deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries and configuration files; they can communicate with each other through well-defined channels
FFmpeg: An open source, cross-platform solution to record, convert and stream audio and video. This was used to extract the audio portion of a video so that it can be transcribed.
Blockchain: Based on Hyperledger Fabric, an open-source, enterprise-grade, permissioned distributed ledger. The blockchain network is an immutable transaction ledger. This was used to ensure that the submitted reports and information from victims and witnesses are secure. Each stored document has a hash that is stored on the blockchain. You can run the application without it.
scikit-learn: An open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
k-Means clustering is an unsupervised machine learning algorithm, so it does not require a labeled dataset with an input-output mapping. It is preferred for this project because it analyzes documents efficiently by grouping them into clusters based on their content. It also eliminates the cost and time of labeling a dataset. Each document represents a datapoint. Specifically in this case, each witness statement is a document which forms a datapoint in a cluster.
There are different feature extraction methods that can be used in text analysis problems, such as bag of words, binary representation, and term frequency-inverse document frequency (TF-IDF). TF-IDF is selected for this project because of its efficiency in finding similarity groups in a set of documents. Term frequency (TF) is the number of times a term appears in a document as a fraction of the document's total terms, while inverse document frequency (IDF) measures how important a term is across the whole set of documents, since frequency does not necessarily mean importance. The final output from the feature extraction is presented to the k-means algorithm in vectorized form. Prior to the feature extraction, stop words, which are regarded as noise, are removed and stemming is used to reduce the words to their root forms.
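To make the TF and IDF definitions concrete, here is a minimal pure-Python sketch. It assumes the plain weighting tf * log(N / df), a tiny made-up stop-word list, and no stemming; the `tokenize` and `tfidf` helpers and the sample statements are hypothetical. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed IDF and normalize each vector by default, so their numbers will differ from these.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "was", "were", "and", "of", "to", "in", "at"}


def tokenize(text):
    """Lowercase, strip punctuation, and drop stop words (no stemming here)."""
    words = (w.strip(".,;:!?\"'") for w in text.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


statements = [
    "The officer stopped the car at the corner.",
    "The officer searched the car.",
    "A crowd gathered at the corner.",
]
vectors = tfidf(statements)
for vec in vectors:
    print({term: round(weight, 3) for term, weight in vec.items()})
```

In the first statement, "stopped" appears in only one document and so receives a higher weight than "officer", which appears in two. Under this plain formula, a term that occurs in every document gets a weight of zero.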
The desired number of clusters is first specified. In this project, the number of clusters that gives the best clustering performance is selected automatically using the elbow method. A centroid is then randomly initialized for each cluster. A centroid refers to a positional mean value. Datapoints are assigned to clusters based on their closeness to the centroids. After each iteration, a new centroid is calculated for each cluster by finding the mean position of all the current points in the cluster. The process is repeated until the centroid positions converge and the assignments of points to clusters stop changing. The clustering performance was measured using the silhouette score, which ranges from -1 to 1: well-separated clusters score close to 1, overlapping clusters score close to 0, and negative values indicate points that are likely assigned to the wrong cluster. The figure below is an animation showing the k-Means iteration process.
Source: https://realpython.com/k-means-clustering-python/
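The following sketch shows one way to compare candidate cluster counts with scikit-learn. It is an illustration under stated assumptions, not the project's code: the 2-D points are made up to stand in for vectorized statements, and the best k is picked by the highest silhouette score, which is a simplification of the elbow method described above (the inertia values printed per k are what an elbow plot would show).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Made-up 2-D points standing in for vectorized statements: three tight groups.
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
offsets = [(-0.2, -0.2), (-0.2, 0.2), (0.2, -0.2), (0.2, 0.2)]
X = np.array([(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets])

inertias, scores = {}, {}
for k in range(2, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    labels = model.fit_predict(X)
    inertias[k] = model.inertia_  # within-cluster sum of squares (elbow curve)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
for k in sorted(scores):
    print(f"k={k}  inertia={inertias[k]:.2f}  silhouette={scores[k]:.3f}")
print("best k by silhouette:", best_k)
```

Fixing `random_state` makes the centroid initialization repeatable, and `n_init=10` reruns the algorithm from several starting points and keeps the best result, which reduces sensitivity to the initial centroids.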
Each document is an incident statement from a witness. If a police report is available, it will also be added to the dataset. If there is consistency between the police report and the statements from the witnesses, the datapoint representing the police report will appear in one of the clusters; otherwise, it will stand out as an outlier that does not belong to any cluster. The clusters in this case could mean that the witnesses shared different aspects of the incident or reported the incident from different perspectives. It could also mean that different incidents happening in the same location around the same time were reported by different witnesses. In other words, this project could also be used to tie a police report to the appropriate incident. The figure below shows a simulated dataset of witness statements and an unrelated police report. It is clearly observed that the police report (large green circle) stands apart and is not consistent with either of the two clusters.
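Because k-means on its own assigns every point to some cluster, flagging a police report as an outlier needs an additional rule. The sketch below shows one possible rule, assuming the witness statements have already been clustered: the report is flagged when it lies farther from every cluster centroid than a multiple of that cluster's radius. The `is_outlier` helper, the `factor` threshold of 2.0, and the sample vectors are hypothetical and not taken from the project.

```python
import math


def centroid(points):
    """Mean position of a list of equal-length vectors."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def is_outlier(report, clusters, factor=2.0):
    """Flag `report` if it is farther from every centroid than `factor` times that cluster's radius."""
    for members in clusters:
        center = centroid(members)
        # Radius: distance from the centroid to its farthest member.
        radius = max(math.dist(center, m) for m in members)
        if math.dist(center, report) <= factor * radius:
            return False  # close enough to at least one witness cluster
    return True


# Made-up 2-D vectors standing in for vectorized witness statements.
cluster_a = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1)]
cluster_b = [(5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]

print(is_outlier((1.1, 1.0), [cluster_a, cluster_b]))  # consistent with cluster_a
print(is_outlier((9.0, 0.5), [cluster_a, cluster_b]))  # far from both clusters
```

The right threshold depends on how spread out real statement vectors are, so it would need to be tuned on actual data.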
Docker - Container-based software that standardizes a unit of software and enables developers to isolate their app from the environment
Node.js - a JavaScript runtime
Register for an IBM Cloud account.
Install and configure IBM Cloud CLI.
Clone this repository locally. In a terminal, run:
git clone https://github.com/Call-for-Code-for-Racial-Justice/Incident-Accuracy-Reporting-System
Create a .env file by copying the Incident-Accuracy-Reporting-System/backend/STT-Audio/env-example file in the same directory.
Provision the following services:
Add the credentials for both the Speech to Text and Language Translator services to the .env file you created earlier. Note that you will add the credentials for the ObjectStore service when you prepare to deploy the Blockchain app in step 4.
# Environment variables
SPEECH_TO_TEXT_IAM_APIKEY=
SPEECH_TO_TEXT_URL=
LANGUAGE_TRANSLATOR_IAM_APIKEY=
LANGUAGE_TRANSLATOR_URL=
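Before starting the backend, it can help to confirm that all four values were filled in. The following is a small optional check, not part of the repository: it assumes the variables from the .env file have already been loaded into the process environment, and the `missing_variables` helper is a hypothetical name.

```python
import os

REQUIRED = [
    "SPEECH_TO_TEXT_IAM_APIKEY",
    "SPEECH_TO_TEXT_URL",
    "LANGUAGE_TRANSLATOR_IAM_APIKEY",
    "LANGUAGE_TRANSLATOR_URL",
]


def missing_variables(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name, "").strip()]


missing = missing_variables()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("All Watson credentials are set.")
```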
After provisioning the Object Storage, Speech to Text and Language Translator services, we'll need to then deploy a blockchain ledger. This ledger will keep track of all digital assets that have been uploaded. There are two ways to deploy a ledger, either locally or in the cloud. For this pattern, we will focus on deploying the Blockchain locally.
Local Deployment
First clone the repo that contains the Blockchain application code:
git clone https://github.com/IBM/Blockchain-for-maintaining-Digital-Assets
Make sure you don't have any old Hyperledger Fabric images lying around.
Check this by issuing the following commands:
./stopFabric.sh
docker rmi -f $(docker images -q)
docker ps
docker images
You should see no images listed at this point.
Update the installChaincode.sh file found in the Incident-Accuracy-Reporting-System/backend/blockchain/local directory.
Update the variable CONTRACT_PATH with the path where you installed your Blockchain-for-maintaining-Digital-Assets repo. As an example:
export CONTRACT_PATH='/Users/laurabennett/2020/EMBRACE-Challenge'
You will first need to configure your IBM Cloud Object Storage instance. During this process you will add your IBM Cloud Object Storage credentials to the config.json file found in the following directory: Blockchain-for-maintaining-Digital-Assets/server/config.
Select Buckets in the left-hand navigation pane and click Create bucket. Choose Standard under Predefined buckets. Provide a unique bucket name as per the naming rules specified. Skip the Upload files step and click Next. Skip the Test bucket out step and click Next. Once the bucket is successfully created, obtain the following information from the webpage:
Under Bucket details, obtain the bucket name and specify it as cos_bucketName in the config.json file.
Under Service Credentials, obtain the apikey and resource_instance_id values and specify them as cos_apiKeyId and cos_serviceInstanceId respectively in the config.json file.
Under Endpoints, obtain the Public endpoint and specify this value as cos_endpoint in the config.json file.
You will now change to the directory where you installed Blockchain-for-maintaining-Digital-Assets.
Once all this is done, your Blockchain-for-maintaining-Digital-Assets/server/config/config.json should look something like this:
{
"channel_name": "mychannel",
"smart_contract_name": "blockchain-for-maintaining-digital-assets",
"connection_file": "Org1MSP_profile.json",
"appAdmin": "app-admin",
"appAdminSecret": "app-adminpw",
"orgMSPID": "Org1MSP",
"caName": "184.172.229.220:31844",
"peerName": "184.172.229.220:30884",
"gatewayDiscovery": { "enabled": true, "asLocalhost": false },
"smtpHost": "smtp.mailtrap.io",
"smtpPort": 2525,
"smtpUserName": "cb49e25f8cbe5f",
"smtpPassword": "3734c09cfdj05f",
"senderEmail": "no-reply@digitalassetscodepattern.com",
"cos_endpoint": "s3.us-south.cloud-object-storage.appdomain.cloud",
"cos_apiKeyId": "QrC2rLBkjEmS755xR88_78seDgD2ai8DIQxVd74G21Je",
"cos_serviceInstanceId": "crn:v1:bluemix:public:cloud-object-storage:global:a/86ac1b16b6f8b9639124a38d8edbd301:2f8d9627-46ff-46e9-a053-9d3e7121eedf::",
"cos_bucketName": "blockchain-digital-assets-bucket"
}
This step will start the network as a series of Docker containers, create a network channel and join a peer.
cd Incident-Accuracy-Reporting-System/backend/blockchain/local
./startFabric.sh
Upon successful completion, you should see the following:
2020-10-13 17:32:46.599 UTC [channelCmd] InitCmdFactory -> INFO 001 Endorser and orderer connections initialized
2020-10-13 17:32:46.612 UTC [cli.common] readBlock -> INFO 002 Received block: 0
2020-10-13 17:32:46.862 UTC [channelCmd] InitCmdFactory -> INFO 001 Endorser and orderer connections initialized
2020-10-13 17:32:46.948 UTC [channelCmd] executeJoin -> INFO 002 Successfully submitted proposal to join channel
This step will install, instantiate and test a smart contract on the network
cd Incident-Accuracy-Reporting-System/backend/blockchain/local
./installChaincode.sh
Upon successful completion, you should see the following:
+ LANGUAGE=golang
+ export CONTRACT_PATH=/Users/laurabennett/2020/EMBRACE-Challenge
+ CONTRACT_PATH=/Users/laurabennett/2020/EMBRACE-Challenge
+ echo 'Copying Chaincode to cli container'
Copying Chaincode to cli container
+ docker cp /Users/laurabennett/2020/EMBRACE-Challenge/Blockchain-for-maintaining-Digital-Assets/contract/ cli:/opt/gopath/src/github.com/asset
+ echo 'Install and Instantiate Chaincode'
Install and Instantiate Chaincode
+ docker exec -e CORE_PEER_LOCALMSPID=Org1MSP -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp cli peer chaincode install -n asset -v 1.0 -p /opt/gopath/src/github.com/asset -l node
2020-10-13 18:04:56.309 UTC [chaincodeCmd] checkChaincodeCmdParams -> INFO 001 Using default escc
2020-10-13 18:04:56.309 UTC [chaincodeCmd] checkChaincodeCmdParams -> INFO 002 Using default vscc
2020-10-13 18:04:56.335 UTC [chaincodeCmd] install -> INFO 003 Installed remotely response:<status:200 payload:"OK" >
+ sleep 5
+ docker exec -e CORE_PEER_LOCALMSPID=Org1MSP -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp cli peer chaincode instantiate -l node -o orderer.example.com:7050 -C mychannel -n asset -v 1.0 -c '{"Args":[]}' -P 'OR ("Org1MSP.member")'
2020-10-13 18:05:01.608 UTC [chaincodeCmd] checkChaincodeCmdParams -> INFO 001 Using default escc
2020-10-13 18:05:01.608 UTC [chaincodeCmd] checkChaincodeCmdParams -> INFO 002 Using default vscc
+ echo 'Chaincode Instantiated'
Chaincode Instantiated
+ sleep 10
+ echo 'Test Chaincode'
Test Chaincode
+ docker exec -e CORE_PEER_LOCALMSPID=Org1MSP -e CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp cli peer chaincode invoke -o orderer.example.com:7050 -C mychannel -n asset -c '{"Args":["queryAllDigitalAssets"]}'
2020-10-13 18:05:40.265 UTC [chaincodeCmd] chaincodeInvokeOrQuery -> INFO 001 Chaincode invoke successful. result: status:200 payload:"[]"
Once the blockchain ledger is up and running, we'll deploy an application to track uploaded media on the ledger. This works by taking a hash of a file and storing it on the blockchain ledger. Then, the original file is placed in a Cloud Object Storage instance. If the file is tampered with in the Object Storage, it will no longer match the hash, and an alert will be generated.
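The hash-and-compare idea described above can be illustrated with a short sketch. It is a simplified stand-in rather than the application's code: plain dictionaries take the place of the blockchain ledger and the Cloud Object Storage bucket, the `upload` and `is_untampered` helpers are hypothetical, and SHA-256 is an assumed choice of hash function.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the document bytes (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


# In-memory stand-ins: one dict for the ledger and one for object storage.
ledger = {}
object_store = {}


def upload(name: str, data: bytes) -> None:
    ledger[name] = sha256_hex(data)  # hash goes on the ledger
    object_store[name] = data        # original file goes to storage


def is_untampered(name: str) -> bool:
    """Recompute the stored file's hash and compare it with the ledger entry."""
    return sha256_hex(object_store[name]) == ledger[name]


upload("statement.txt", b"Witness statement: the stop began at 5:30 pm.")
print(is_untampered("statement.txt"))  # True: file matches its recorded hash

object_store["statement.txt"] = b"Witness statement: the stop began at 6:30 pm."
print(is_untampered("statement.txt"))  # False: content changed after upload
```

Because the ledger entry cannot be rewritten, any later change to the stored file shows up as a hash mismatch.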
Start the server from the server directory:
cd Blockchain-for-maintaining-digital-assets/web-app/server/
npm install
npm start
Start the client from the client directory:
cd Blockchain-for-maintaining-digital-assets/web-app/client/
npm install
npm run serve
Once both the server and client have successfully started, the UI can be accessed at http://localhost:8080/?#/.
Main page of Blockchain application:
cd Incident-Accuracy-Reporting-System
Start frontend web app
cd frontend
npm install
npm run serve
Start backend (in a separate terminal, from the Incident-Accuracy-Reporting-System directory)
cd backend
npm install
npm start
Incident Accuracy Reporting System (IARS)
This solution starter is made available under the Apache 2 License.