elifesciences / refract

Convert NLM XML files into the Lens JSON

PLOS Deployment #2

Open michael opened 11 years ago

michael commented 11 years ago

Regarding the setup for hosting the PLOS version of the converter: @ivangrub, what do you think would be a good initial seed for the hosted converter? We shouldn't use too many docs, as we might quickly run out of memory on Heroku. For a start, how about using the latest 100 articles per PLOS journal?

I need to figure out whether it's possible to link one repository with different Heroku remotes (I think it should be). That way our deployment process would look like this:

For eLife (as usual)

git checkout master
git push heroku master

For PLOS

git checkout plos-dev
git push heroku-plos plos-dev:master

Then we'd have two services:

http://elife-converter.heroku.com/documents
http://plos-converter.heroku.com/documents

@gnott you'd be able to implement a static deployment workflow in the same way as for eLife.

ivangrub commented 11 years ago

That sounds good to me. Just select 100 randomly from the plos_all_XML.txt file. Better to have a spread of a lot of different years for edge cases.
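A rough sketch of the random selection suggested above, in Python. It assumes plos_all_XML.txt holds one article entry per line, which is a guess about the file's layout; the function name `sample_articles` and the fixed seed are illustrative only. A fixed seed keeps the seed set reproducible between deployments.

```python
import random
from pathlib import Path


def sample_articles(lines, k=100, seed=2013):
    """Draw up to k entries uniformly at random, without replacement.

    Assumption: each non-empty line is one article entry.
    """
    entries = [line.strip() for line in lines if line.strip()]
    rng = random.Random(seed)  # fixed seed so the same sample is drawn every time
    return rng.sample(entries, min(k, len(entries)))


if __name__ == "__main__":
    listing = Path("plos_all_XML.txt")  # file name taken from the comment above
    if listing.exists():
        for entry in sample_articles(listing.read_text().splitlines()):
            print(entry)
```

A uniform sample over the whole listing should already spread across publication years roughly in proportion to how many articles each year has; if the early years turn out to be underrepresented, sampling per year would be the next step.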

michael commented 11 years ago

Regarding edge cases: I think it's time to come up with a test suite, so we don't run into regressions anymore. It will save us a lot of time. I'll provide a testing setup for the converter some time next week.

We can use a simple strategy, like converting a defined set of articles and then running them through a verifier that just checks for missing properties or dead references.
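To make the verifier idea concrete, here is a minimal sketch in Python. The document shape is an assumption, not the actual Lens JSON schema: a top-level "nodes" object mapping ids to nodes, with "id" and "type" required on each node, and "children" and "target" as the fields that hold references to other node ids. All of these names are hypothetical and would need to be replaced with the real ones.

```python
# Hypothetical schema: these property and field names are placeholders.
REQUIRED_PROPERTIES = ("id", "type")
REFERENCE_FIELDS = ("children", "target")


def verify(doc, required=REQUIRED_PROPERTIES, ref_fields=REFERENCE_FIELDS):
    """Return a list of human-readable problems found in a converted document."""
    nodes = doc.get("nodes", {})
    problems = []
    for node_id, node in nodes.items():
        # Missing properties: every node must carry the required keys.
        for prop in required:
            if prop not in node:
                problems.append(f"{node_id}: missing property '{prop}'")
        # Dead references: every referenced id must exist in the document.
        for field in ref_fields:
            value = node.get(field)
            if value is None:
                continue
            targets = value if isinstance(value, list) else [value]
            for target in targets:
                if target not in nodes:
                    problems.append(
                        f"{node_id}: dead reference '{target}' in '{field}'"
                    )
    return problems
```

Running this over each converted article in the defined set and failing the test run when the returned list is non-empty would catch the two kinds of regression mentioned above.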