This is the repository for two related websites:
Both sites use the same code and database, stored in this repository.
access-portal/
contains the PHP code for the website.sql/
contains the data stored in MySQL format.This section is only relevant to people who want to run a local instance of AI Watch (e.g. for development purposes).
First, clone the repo and set up the database:
git clone https://github.com/riceissa/aiwatch.git
cd aiwatch
mysql -e "create database aiwatch"
make read # read in data from sql/
Now set up the password file to allow PHP to log in to the database:
cp access-portal/backend/globalVariables/{dummyPasswordFile.inc,passwordFile.inc}
vi access-portal/backend/globalVariables/passwordFile.inc # change to add database login info
If you're hosting this on a server, make sure to disable public access to the password file.
Finally start the service:
cd access-portal
php -S localhost:8000
To get AnchorJS and tablesorter, run:
make fetch_anchorjs
make fetch_tablesorter
You can now visit http://localhost:8000/
in your browser.
Before running commands in Git Bash, make sure you are in the AI Watch directory. If you cloned the git repo from your home directory, you can go to the AI Watch directory by typing the following:
cd ~/aiwatch
This only needs to be done once every time you open Git Bash, whereas the other commands below must be done every time you want to save changes to git.
Open an existing .sql file or make a new .sql file in which to add new data.
For example, to add data to 80,000 Hours positions, you can open
sql/positions/80000hours-positions.sql
.
Add data to the file (using a text editor such as Notepad) by looking at sources like Wayback Machine for the team page, LinkedIn, blog posts, Facebook, etc. For the data format, look at the other files in the same directory and copy the format.
A few things to keep in mind:
'2016-02-01'
. MySQL will complain if you only add year or month.Once you reach a good stopping point, check for errors in the MySQL syntax. Type:
./reload-db.sh
This will reload all of the data in MySQL. MySQL will complain if there are any errors, and you can fix them. If MySQL says nothing, that means all the data has correct syntax.
WARNING: reload-db.sh
only reloads files that are explicitly
listed in the script, so if you make a new SQL file it will by default not be loaded
into the database. If you make a new SQL file, you must add a line similar to the following
one in reload-db.sh
:
winpty "$MYSQL" --defaults-extra-file="$HOME/.my.cnf" aiwatch -e "source sql/positions/80000hours-positions.sql"
After you edit and save reload-db.sh
, you will need to commit changes to Git:
git add reload-db.sh
git commit -m "New SQL file"
Stage the file in git. Type
git add FILENAME
where FILENAME
is the name of the file you have been editing. If there are
multiple files, you can either type git add FILENAME1 FILENAME2
or you can
type one line for each file.
To find all files with changes, you can type:
git status
Then you can copy the path that is shown there.
Commit your changes. There are two ways to do this:
Simple way. You can type:
git commit -m "your message here"
where your message here
is a summary of the changes you made. For
example, you might type:
git commit -m "Add 80,000 Hours team from 2012"
Using a text editor (this is better if you want to write a multi-line commit message). You can type:
git commit
This will open a text editor for you to type your commit message. In Git
Bash, the default text editor is Vim. You can press i
(which enters
insert mode) and then type in your message (e.g. "Add 80,000 Hours team
from 2012") and then press Escape
. Then press :wq
followed by Enter
(this will save the file and close Vim).
Upload your changes to GitHub. You can type:
git push
If you've finished working on an issue, add the ready_for_review
label on
the issue.
Some tricky things to keep in mind:
The very first time you set up Git, you will need to do the following:
git checkout -b BRANCHNAME
to start a new branch.git push --set-upstream origin BRANCHNAME
to set the branch to push to on
origin (GitHub).To be able to run SQL commands, you must first open a MySQL command prompt. Open Git Bash, then type the following (or copy-paste the command; to paste a command into Git Bash, right click the Git Bash window and select Paste):
winpty "/c/Program Files/MySQL/MySQL Server 8.0/bin/mysql" --defaults-extra-file="$HOME/.my.cnf" aiwatch
Now you can type the following SQL commands. To exit out of MySQL, you can
type Control
-d
.
The following command looks at each organization and finds distinct people with the same last name. This will often catch people who use shorted versions of their name in some contexts (e.g. "Alex Ray" and "Alexander Ray").
select organization,group_concat(distinct person) from positions group by organization, substring_index(person,' ',-1) having count(distinct person) > 1;
The following command will find duplicate (organization, person, title) combinations. This often means that the same position has been entered more than once.
select count(*),person,organization,title from positions group by person,organization,title having count(*)>1;
You can restrict the above to a single organization, say GiveWell, by typing the following instead:
select count(*),person,organization,title from positions where organization='GiveWell' group by person,organization,title having count(*)>1;
The following command will find all cases where there is more than one
"unended" position for each (person, organization) pair. Since each person
usually only holds one position at a time, this can often tell you of positions
that are missing an end_date
that should be added.
select organization,person,group_concat(title) from positions where end_date is NULL group by person,organization having count(*) > 1 order by organization;
The following command flags organizations whose last position date
(i.e. the latest date out of all start_date
and end_date
values)
is later than the last major data update date:
select o.organization, o.last_major_data_update_date, group_concat(p.person, ': ', greatest(p.start_date, p.end_date) separator '; ') as violations from organizations o inner join positions p on o.organization = p.organization where o.last_major_data_update_date < greatest(p.start_date, p.end_date) group by o.organization, o.last_major_data_update_date;
Example:
The idea here is that if the list of violations for an org grows too long (say, 10+ positions), then it becomes pretty obvious that the last major data update date forgot to be changed when the data was updated.
Most keyboard shortcuts like copy, paste, save, search, all work as expected. Below are some advanced settings that are useful when doing data entry.
Control
-Shift
-End
. This will select all lines from the
current cursor position until the end of the file.Control
-p
followed by >
followed by the word sort
and then
choose "Sort Lines Ascending" from the menu.WARNING: Be careful when sorting lines to make sure that there are no commented lines or SQL "rows" that take up more than one line, as these will be sorted incorrectly. Also when sorting a block of SQL "rows" for the whole file, add a comma to the first row, then sort, then remove a comma from the first row.
To toggle word wrapping, press Alt
-z
.
All contributions are welcome. Contributions might take the form of:
All code and data released to the public domain according to CC0 1.0 Public Domain Dedication.
To export some organization data from the donations database:
mysql donations -e "select donee,former_name,website,contact_email,country, \
key_people,launch_date,launch_date_precision,launch_date_url, \
facebook_username,instagram_username,medium_username,pinterest_username, \
timelines_wiki_page,tumblr_subdomain,twitter_username,wikipedia_page, \
youtube_username from donees where cause_area REGEXP 'ai risk';" \
> temp.tsv
Find uncategorized interns:
select person,title,employment_type from positions
where title regexp 'intern' and (employment_type != 'intern' or employment_type is null);