Closed: gnott closed this issue 4 years ago
Related to https://github.com/elifesciences/decision-letter-parser/issues/43, where the bucket and S3 notifications were connected, from which this workflow will be triggered.
Thanks @gnott for creating this ticket.
Can you indicate which of the items to be ticked off you cannot do without the internal eLife team?
I believe only https://github.com/elifesciences/issues/issues/5194 will be the highest priority item for eLife Internal - thanks @Melissa37 👍
This is feature complete now. I expect we'll do some improvements over time once people start using the decision letter ingest workflow.
To be done: create a workflow and activities to process decision letter parser .zip files as they are added to a bucket.

Summary

This issue describes an implementation for processing decision letter & author response content. The majority of the parsing logic is held in the decision-letter-parser code library. The elife-bot can be used to define and run the steps of the processing workflow.

A .zip file, which contains a .docx file and optionally some figure images and/or video files, is copied to an S3 bucket. An S3 notification from that bucket is read by a queue listener in the elife-bot. As a result, the bot will start a workflow. That workflow will have activities run in series to validate the input, generate JATS output from the .docx file contents, copy file assets to another S3 bucket, and POST the JATS output to an API endpoint. The workflow will notify people if it encounters an error or, when everything completes successfully, of the completed status.

Bot workflow and activity build out in brief:
Workflow parts to build
IngestDecisionLetter
IngestDecisionLetter
IngestDecisionLetter workflow execution when a file is uploaded to the input bucket

Activities in this workflow to create
(separate issues can accompany each for more detailed discussion)
letterparser.py perhaps, that can store any code to be shared by these activities, such as logic to unzip the contents, build from .docx files, and such

External infrastructure required for completion
Assistance required from infrastructure team
letterparser.cfg) that specifies the eLife DOI and file naming structure
pandoc utility (https://github.com/elifesciences/issues/issues/5194)

Once built we can
continuumtest environment
prod environment and start using it

More tasks can be added to this issue as they are discovered or are defined in more detail.
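
For discussion, here is a rough sketch of the flow described in the Summary: read the S3 notification, check the .zip contents, then run the activities in series and notify on error or completion. It is not the elife-bot's actual API. The function names (zip_key_from_notification, validate_zip, run_in_series) and the notify callback are hypothetical, and the "exactly one .docx" rule is an assumption on my part. The only outside detail relied on is the standard S3 event message layout (Records, s3.bucket.name, s3.object.key, with URL-encoded keys).

```python
import io
import json
import zipfile
from urllib.parse import unquote_plus


def zip_key_from_notification(message_body):
    """Return (bucket, key) for the first .zip object in an S3 event message, else None."""
    event = json.loads(message_body)
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key.lower().endswith(".zip"):
            return bucket, key
    return None


def validate_zip(zip_bytes):
    """Check the zip holds one .docx (assumed rule); return (docx_name, asset_names)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Skip directory entries, keep file names only
        names = [n for n in archive.namelist() if not n.endswith("/")]
    docx = [n for n in names if n.lower().endswith(".docx")]
    if len(docx) != 1:
        raise ValueError("expected exactly one .docx file, found %d" % len(docx))
    assets = [n for n in names if n != docx[0]]
    return docx[0], assets


def run_in_series(activities, data, notify):
    """Run each (name, function) pair in order, passing each result to the next.

    Stops at the first failure and sends an error notification;
    sends a completed notification if every activity succeeds.
    """
    for name, activity in activities:
        try:
            data = activity(data)
        except Exception as exc:
            notify("error in %s: %s" % (name, exc))
            return False
    notify("completed")
    return True
```

In the real workflow the activities list would hold the validate, generate JATS, copy assets and POST steps in that order, and notify would be whatever mechanism the bot already uses to notify people.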