This project documents the technologies, scripts, and workflow for visualizing data about a research cluster using open source technologies.
Most of the data is publicly available on Wikidata. We use Crossref to gather publication data about a list of authors, OpenRefine to upload that data to Wikidata, and Scholia to visualize it.
future-waters-viz
Docker container
The bulk of the project is available in a self-contained environment, namely a Docker container. Instructions for running Docker are given below and in the Python scripts.
The cluster-members.csv file must be copied into the /data-gathering/resources folder. An example of the key columns in your CSV is presented below:
Full Name | Affiliation | Position | Department | Faculty | Campus | wikidata |
---|---|---|---|---|---|---|
Ali Ameli | University of British Columbia | Assistant Professor | Earth Ocean and Atmospheric Sciences | Sciences | Vancouver | |
Alice Guimaraes | University of British Columbia | PhD Student | Norman B Keevil Institute of Mining Engineering | Applied Sciences | Vancouver | Q27980222 |
Gunilla Öberg | University of British Columbia | Professor | Institute for Resources Environment and Sustainability | Sciences | Vancouver | |
John Janmaat | University of British Columbia | Associate Professor | Economics,Philosophy and Political Science | Arts | Okanagan | |
Note that the scripts are case sensitive, so the column names in your file must match those in the example exactly.
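Because a mismatched column name is easy to miss, a quick check of the header row before building the container may save a failed run. The sketch below is not part of the project's scripts; it assumes the file is comma-delimited with a header row, and the helper name `missing_columns` is hypothetical.

```python
import csv
import io

# Column names copied from the example table above; matching is case sensitive.
EXPECTED_COLUMNS = [
    "Full Name", "Affiliation", "Position", "Department",
    "Faculty", "Campus", "wikidata",
]


def missing_columns(csv_text, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return [name for name in expected if name not in header]


# Assumed layout: comma-delimited, with fields containing commas quoted.
sample = (
    "Full Name,Affiliation,Position,Department,Faculty,Campus,wikidata\n"
    "John Janmaat,University of British Columbia,Associate Professor,"
    '"Economics,Philosophy and Political Science",Arts,Okanagan,\n'
)

print(missing_columns(sample))
# A header that differs only in case is reported as missing.
print(missing_columns(sample.replace("wikidata", "Wikidata", 1)))
```

To check a real file, read its contents with `open(path, encoding="utf-8").read()` and pass the text to `missing_columns`.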
cd data-gathering
docker build -t libraryrc/future-waters .
First, get the path where you downloaded the project:
pwd
The output will be similar to /home/msarthur/Workspace/future-waters-project
Update the path in the volume argument of the command below, e.g.: -v /home/msarthur/Workspace/future-waters-project/data-gathering/resources:/tmp/src/resources
docker run --name=future-waters -v !!your path!!/resources:/tmp/src/resources libraryrc/future-waters
For example, with the output path shown above, the command should read:
docker run --name=future-waters -v /home/msarthur/Workspace/future-waters-project/data-gathering/resources:/tmp/src/resources libraryrc/future-waters
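The volume argument can also be assembled programmatically, which avoids typos when the project path is long. This is a minimal sketch under the assumption that the host folder is `data-gathering/resources` under the project root, as in the example command; the function names are hypothetical and the code only prints the command, it does not run Docker.

```python
from pathlib import PurePosixPath

CONTAINER_DIR = "/tmp/src/resources"  # mount point used in the commands above


def volume_argument(project_root):
    """Build the host:container value for docker's -v flag."""
    host_dir = PurePosixPath(project_root) / "data-gathering" / "resources"
    return f"{host_dir}:{CONTAINER_DIR}"


def run_command(project_root):
    """Return the docker run command as a list of arguments."""
    return [
        "docker", "run", "--name=future-waters",
        "-v", volume_argument(project_root),
        "libraryrc/future-waters",
    ]


# Example project root taken from the pwd output shown earlier.
print(" ".join(run_command("/home/msarthur/Workspace/future-waters-project")))
```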
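IMPORTANT: The results from the scripts will be under the resources folder (in various subfolders). If those files end up owned by root, which often happens with files written from inside a container, the following command returns ownership to your user: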
sudo chown -R $USER:$USER resources
To run the scripts again, you first need to remove the previous container named future-waters. Run:
docker rm future-waters && docker run --name=future-waters -v !!your path!!/resources:/tmp/src/resources libraryrc/future-waters
For example:
docker rm future-waters && docker run --name=future-waters -v /home/msarthur/Workspace/future-waters-project/data-gathering/resources:/tmp/src/resources libraryrc/future-waters
If the Python scripts are updated, you must build a new image so that the container reflects the changes. To rebuild the entire pipeline, run:
docker rm future-waters && \
docker build -t libraryrc/future-waters . && \
docker run --name=future-waters -v
For example:
docker rm future-waters && \
docker build -t libraryrc/future-waters . && \
docker run --name=future-waters -v /home/msarthur/Workspace/future-waters-project/data-gathering/resources:/tmp/src/resources libraryrc/future-waters
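The chained commands above stop at the first step that fails, because `&&` only continues after a success. The same remove, build, and run sequence can be sketched in Python with `subprocess.run(..., check=True)`, which raises an error on a non-zero exit code. The helper names are hypothetical, and the sketch assumes it would be run from the data-gathering folder; as written it only prints the steps.

```python
import subprocess

IMAGE = "libraryrc/future-waters"
NAME = "future-waters"


def pipeline_steps(volume, rebuild=False):
    """List the commands in the order the README chains them with &&."""
    steps = [["docker", "rm", NAME]]
    if rebuild:
        # Rebuilding is only needed when the Python scripts have changed.
        steps.append(["docker", "build", "-t", IMAGE, "."])
    steps.append(["docker", "run", f"--name={NAME}", "-v", volume, IMAGE])
    return steps


def run_steps(steps, runner=subprocess.run):
    """Run each step in turn; check=True aborts at the first failure."""
    for step in steps:
        runner(step, check=True)


volume = (
    "/home/msarthur/Workspace/future-waters-project/data-gathering/resources"
    ":/tmp/src/resources"
)
for step in pipeline_steps(volume, rebuild=True):
    print(" ".join(step))
```

Calling `run_steps(pipeline_steps(volume, rebuild=True))` would execute the commands. As with the shell version, the sequence stops if `docker rm` fails, for instance when no container named future-waters exists yet.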
IMPORTANT: All other Docker commands are executed inside a specific folder. The following command should be run from the project root folder.
docker build -t libraryrc/future-waters-viz .
docker run --name=future-waters-viz -p 8100:8100 libraryrc/future-waters-viz
docker rm future-waters-viz && \
docker run --name=future-waters-viz -p 8100:8100 libraryrc/future-waters-viz
Remove the last container, then build and run the new version in a single command:
docker rm future-waters-viz && \
docker build -t libraryrc/future-waters-viz . && \
docker run --name=future-waters-viz -p 8100:8100 libraryrc/future-waters-viz
Check visualizations on your local browser at http://localhost:8100
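If the page does not load, it can help to confirm whether anything is answering on port 8100 at all. The following is a minimal sketch using only the Python standard library; the function name `is_serving` and the timeout value are assumptions, not part of the project.

```python
import urllib.request


def is_serving(url="http://localhost:8100", timeout=3.0):
    """Return True if the URL answers with a successful HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses,
        # all of which urllib reports as OSError subclasses.
        return False


print(is_serving())
```

A result of `False` suggests the container is not running or the port mapping (`-p 8100:8100`) was omitted.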
Known issues are listed on the project's GitHub page.
If you encounter problems trying to replicate this project, please submit a new issue. When submitting an issue, the maintainers would appreciate it if you could provide the following:
What is your Operating system?
What version of Python do you have installed?
Is there a stack trace or error log in the application console?
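The first two answers can be collected with a few lines of Python and pasted into the issue. This is a small sketch using the standard `platform` module; the function name `environment_report` is hypothetical.

```python
import platform


def environment_report():
    """Summarize the operating system and Python version for an issue report."""
    return "\n".join([
        f"Operating system: {platform.system()} {platform.release()}",
        f"Python version: {platform.python_version()}",
    ])


print(environment_report())
```

Any stack trace or error log from the application console still needs to be copied by hand.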