ericleasemorgan / reader

Distant Reader, a tool for using & understanding a corpus
GNU General Public License v2.0

Merge the Cord19 repository #104

Closed by dbrower 4 years ago

dbrower commented 4 years ago

This pulls in all of the files and changes from the https://github.com/ericleasemorgan/cord-19/ repository. It also keeps the history, so any outstanding branches in that repo can be merged in later.
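For anyone who wants to see how a history-preserving merge of two unrelated repositories can work, here is a minimal sketch. It is not necessarily the exact sequence used for this PR. The two scratch repositories (`reader` and `cord`), their file names, and the commit messages are hypothetical stand-ins; against the real repositories, the local path `../cord` would be replaced by the cord-19 URL.

```shell
#!/bin/sh
# Sketch: merge one repository into another while keeping both histories.
# The two scratch repositories below are stand-ins, not the real ones.
set -eu

work=$(mktemp -d)
cd "$work"

# Create a throwaway repository holding a single committed file.
# $1 = directory, $2 = file name, $3 = file content
make_repo() {
    git init -q "$1"
    git -C "$1" config user.email "demo@example.org"
    git -C "$1" config user.name "Demo"
    printf '%s\n' "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "initial commit in $1"
}

make_repo reader reader.txt "from reader"
make_repo cord cord.txt "from cord"

cd reader
# Fetch the other repository's HEAD, then merge it even though the two
# histories share no common ancestor.
git fetch -q ../cord HEAD
git merge -q --allow-unrelated-histories -m "Merge cord into reader" FETCH_HEAD

# Both initial commits plus the merge commit are now reachable from HEAD.
commit_count=$(git rev-list --count HEAD)
echo "commits after merge: $commit_count"
```

Because the commits from the other repository keep their original hashes, a branch fetched from it later shares ancestry with the merged result and can be merged in the usual way.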

This PR does not address path changes inside any scripts since the two repositories are checked out into different directories on the production cluster (/export/cord and /export/reader). I figured those can be adjusted once this merge is done.
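As a starting point for that follow-up, one way to find scripts that still mention the old checkout location is a recursive grep. This is only a sketch: the scratch directory, the file names `old-path.sh` and `clean.sh`, and the `CORD_HOME` variable are made up for illustration, and the real scripts may reference the path differently.

```shell
#!/bin/sh
# Sketch: list scripts that still hard-code the old checkout directory.
# The files created here are hypothetical examples, not files from the repo.
set -eu

scratch=$(mktemp -d)
mkdir "$scratch/bin"
printf '#!/bin/sh\nCORD_HOME=/export/cord\n' > "$scratch/bin/old-path.sh"
printf '#!/bin/sh\necho hello\n' > "$scratch/bin/clean.sh"

# -r recurses into the directory, -l prints only the names of matching files.
# "|| true" keeps the script going when nothing matches (grep exits 1 then).
stale=$(grep -rl '/export/cord' "$scratch/bin" || true)
echo "$stale"
```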

A few files have the same name in both repositories. I tried to resolve each conflict by going with the newer file, but all of these should be reviewed. The particular files I had to adjust are:

.gitignore
README.md (kept both texts!)
bin/json2txt-pdf.sh
bin/metadata2sql.py
bin/txt2ent.py
bin/txt2ent.sh
bin/txt2keywords.sh
etc/schema.xml
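A list like the one above can be reproduced by comparing the tracked file names of two checkouts. The sketch below assumes two scratch repositories with made-up contents; only the names `README.md` and `bin/txt2ent.sh` are borrowed from the list, and `bin/only-reader.sh` and `bin/only-cord.sh` are hypothetical. It finds the overlapping names only and does not decide which copy is newer.

```shell
#!/bin/sh
# Sketch: find file names tracked in both of two repositories.
# Repository contents here are placeholders for illustration.
set -eu
LC_ALL=C; export LC_ALL   # fixed sort order, so sort and comm agree

work=$(mktemp -d)
cd "$work"

# Build a throwaway repository whose index holds the named files.
# $1 = directory, remaining arguments = file paths to add
make_repo() {
    repo=$1; shift
    git init -q "$repo"
    for path in "$@"; do
        mkdir -p "$repo/$(dirname "$path")"
        printf 'placeholder\n' > "$repo/$path"
        git -C "$repo" add "$path"
    done
}

make_repo reader README.md bin/txt2ent.sh bin/only-reader.sh
make_repo cord README.md bin/txt2ent.sh bin/only-cord.sh

# comm -12 prints only the lines common to both sorted lists.
git -C reader ls-files | sort > reader.list
git -C cord ls-files | sort > cord.list
shared=$(comm -12 reader.list cord.list)
printf '%s\n' "$shared"
```

For each shared name, `git log -1 --format=%cI -- <file>` run in each repository shows the date of the last commit that touched it, which is one way to judge which copy is newer.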

@ericleasemorgan I made this PR based on our conversation a few weeks ago, and I agree that one repository is better than two. But if you have changed your mind, feel free to close this without merging.