jats-to-mediawiki.xsl transforms XML files written in the NLM/NISO Journal Article Tag Suite (JATS) into MediaWiki XML. It is part of the Encyclopedia of Original Research (EOR), and we expect it to be tightly integrated with the open-access-media-importer.
For more documentation, see the wiki.
The following examples assume you are working in a bash shell.
The jats-to-mediawiki.py Python script provides a robust and human-friendly command-line interface, including streaming via stdin, stdout, and stderr. Article IDs can be passed to the script on stdin, listed one per line in an input file given with the -i flag, or supplied as arguments to the -a or --articles flag.
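To make the three input routes concrete, here is a minimal sketch of how a script could collect article IDs. It is not the code in jats-to-mediawiki.py: the function name collect_article_ids, the order of precedence (-a first, then -i, then stdin), and the assumption that -a accepts several IDs are all illustrative guesses.

```python
import argparse
import sys


def collect_article_ids(argv, stdin=sys.stdin):
    """Return article IDs from -a/--articles, else from the -i file, else from stdin.

    Hypothetical helper: the precedence used here is an assumption,
    not documented behaviour of jats-to-mediawiki.py.
    """
    parser = argparse.ArgumentParser(description="Illustrative ID collection only")
    parser.add_argument("-a", "--articles", nargs="+", default=[])
    parser.add_argument("-i", dest="input_file", default=None)
    args = parser.parse_args(argv)

    if args.articles:
        return list(args.articles)
    if args.input_file:
        # One ID per line; blank lines are skipped.
        with open(args.input_file, encoding="utf-8") as handle:
            return [line.strip() for line in handle if line.strip()]
    # Fall back to whatever is piped in, one ID per line.
    return [line.strip() for line in stdin if line.strip()]


if __name__ == "__main__":
    for article_id in collect_article_ids(sys.argv[1:]):
        print(article_id)
```

Run python jats-to-mediawiki.py --help to see the flags the real script actually supports.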
After cloning the repository to a new local copy, set up the Python run environment with the following commands:
virtualenv env/
source env/bin/activate
pip install -r requirements.txt
Subsequently, when starting to work on the project in a new shell, you will need to source the environment's activate script:
source env/bin/activate
For command-line usage, run the script with the --help flag:
python jats-to-mediawiki.py --help
The instructions below assume you'll use xsltproc to run the XSLT transformation.
To see if it exists on your system:
command -v xsltproc
# Set up XML catalog file
export XML_CATALOG_FILES=`pwd`/dtd/catalog-test-jats-v1.xml
The following are manual instructions for converting a single article, given its DOI.
First, find the PMCID for the article. If you have the DOI (for example, 10.1371/journal.pone.0010676), the easiest way to do this is with the PMC ID converter API. Point your browser at
http://www.pubmedcentral.nih.gov/utils/idconv/v1.0/?ids=10.1371/journal.pone.0010676&format=json
and make a note of the pmcid value (in this example, PMC2873961).
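If you would rather script this lookup than use a browser, something like the following may work. It builds the same URL as above and pulls the pmcid out of a JSON response body. The helper names are made up for this example, and the assumption that records sit in a top-level "records" list should be checked against a real response.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the instructions above.
IDCONV_URL = "http://www.pubmedcentral.nih.gov/utils/idconv/v1.0/"


def idconv_url(doi):
    """Build the ID converter request URL for a single DOI."""
    # safe="/" keeps the slashes in the DOI unescaped, as in the example URL.
    return IDCONV_URL + "?" + urlencode({"ids": doi, "format": "json"}, safe="/")


def pmcid_from_response(body):
    """Return the first pmcid found in a JSON response body, or None."""
    payload = json.loads(body)
    # Assumption: the response holds a top-level "records" list of objects.
    for record in payload.get("records", []):
        if "pmcid" in record:
            return record["pmcid"]
    return None
```

To use it, fetch idconv_url(doi) with any HTTP client (urllib.request.urlopen, for instance) and pass the response text to pmcid_from_response.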
Next, find the location of the gzip archive file for this article, using the PMC OA web service. Point your browser at
http://www.pubmedcentral.nih.gov/utils/oa/oa.fcgi?id=PMC2873961
and look for the link with format tgz.
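The same step can be sketched in Python. The helper below (tgz_href, a name invented here) scans an OA service response for a link element whose format attribute is tgz. It assumes the archive location is carried in an href attribute; the wrapper elements are not relied on, so the search works wherever the link sits in the tree.

```python
import xml.etree.ElementTree as ET


def tgz_href(oa_xml):
    """Return the href of the first <link format="tgz"> element, or None.

    Assumption: the link's location is stored in an "href" attribute.
    """
    root = ET.fromstring(oa_xml)
    # ".//link" searches all descendants, so the enclosing element
    # names in the response do not matter.
    link = root.find(".//link[@format='tgz']")
    return link.get("href") if link is not None else None
```

Pass it the body returned by the oa.fcgi URL above; a None result means no tgz link was found for that article.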
Download that gzip archive with, for example (note the single quotes around the URL):
wget 'ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/e7/55/PLoS_One_2010_May_21_5(5)_e10676.tar.gz'
Extract the archive and change into the resulting directory. For example:
tar xvfz 'PLoS_One_2010_May_21_5(5)_e10676.tar.gz'
cd 'PLoS_One_2010_May_21_5(5)_e10676'
Find the NXML file with ls *.nxml. Now convert it with, for example:
xsltproc ../jats-to-mediawiki.xsl pone.0010676.nxml > PMC2873961.mw.xml
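For batch work, the xsltproc call can be wrapped in Python. This is only a sketch: xsltproc_command and convert are hypothetical helpers, not part of this repository. They run the same command line as above and optionally set XML_CATALOG_FILES for the child process.

```python
import os
import shutil
import subprocess


def xsltproc_command(stylesheet, nxml_path):
    """Return the argument list for the xsltproc call shown above."""
    return ["xsltproc", stylesheet, nxml_path]


def convert(stylesheet, nxml_path, out_path, catalog=None):
    """Run xsltproc and write its stdout to out_path.

    catalog, if given, is exported as XML_CATALOG_FILES for the child
    process only; the current environment is left unchanged.
    """
    if shutil.which("xsltproc") is None:
        raise RuntimeError("xsltproc not found on PATH")
    env = dict(os.environ)
    if catalog:
        env["XML_CATALOG_FILES"] = catalog
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if xsltproc exits non-zero.
        subprocess.run(xsltproc_command(stylesheet, nxml_path),
                       stdout=out, env=env, check=True)
```

For the example article this would be convert("../jats-to-mediawiki.xsl", "pone.0010676.nxml", "PMC2873961.mw.xml"), run from inside the extracted directory.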
In a browser, go to the Special:Import page of your target MediaWiki installation and import the resulting file (here, PMC2873961.mw.xml).
You can use the scripts/fetch_samples.sh script to grab several example articles, which were used in testing.
Is on our Github wiki.
We're using the GitHub issue tracker for bug reports and to-do items.
Join our Google Group jats-to-mediawiki
This work is in the public domain and may be used and reproduced without special permission.