elifesciences / elife-bot

tools for creating an automatic publishing workflow.
MIT License

Decision letter parsing workflow #965

Closed gnott closed 4 years ago

gnott commented 4 years ago

The task is to create a workflow and activities that process decision letter parser .zip files as they are added to a bucket.

Summary

This issue describes an implementation for processing decision letter and author response content. Most of the parsing logic lives in the decision-letter-parser code library. The elife-bot can be used to define and run the steps of the processing workflow.

A .zip file, which contains a .docx file and optionally some figure images and/or video files, is copied to an S3 bucket. An S3 notification from that bucket is read by a queue listener in the elife-bot, and the bot then starts a workflow. That workflow runs activities in series to validate the input, generate JATS output from the .docx file contents, copy file assets to another S3 bucket, and POST the JATS output to an API endpoint. The workflow will notify people if it encounters an error or, when everything has completed successfully, of the completed status.
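The flow described above can be sketched roughly as follows. This is a minimal illustration, not the elife-bot implementation: the function names (`parse_s3_notification`, `validate_input`, `run_workflow`), the `context` dictionary, and the `notify` callback are all hypothetical, and the rule that the zip must hold exactly one .docx file is an assumption. Only the `Records[].s3.bucket.name` and `Records[].s3.object.key` fields of the standard S3 event notification are relied on.

```python
import io
import json
import zipfile
from urllib.parse import unquote_plus


def parse_s3_notification(message_body):
    """Return (bucket, key) pairs from an S3 event notification JSON string."""
    event = json.loads(message_body)
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys in S3 event notifications are URL-encoded.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def validate_input(context):
    """Hypothetical activity: assume the zip must hold exactly one .docx file."""
    with zipfile.ZipFile(io.BytesIO(context["zip_bytes"])) as archive:
        docx_names = [n for n in archive.namelist() if n.lower().endswith(".docx")]
    if len(docx_names) != 1:
        raise ValueError("expected exactly one .docx, found %d" % len(docx_names))
    context["docx_name"] = docx_names[0]


def run_workflow(context, activities, notify):
    """Run activities in series; stop and notify on the first failure."""
    for activity in activities:
        try:
            activity(context)
        except Exception as exc:
            notify("error in %s: %s" % (activity.__name__, exc))
            return False
    # Every activity finished, so report the completed status.
    notify("completed: %s" % context["key"])
    return True
```

In this sketch each later step (generating JATS, copying assets, POSTing to the API endpoint) would be another function taking the same `context` and appended to the `activities` list, so a failure in any step stops the series and triggers a single error notification.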

Bot workflow and activity build out in brief:

Workflow parts to build

Activities in this workflow to create

(separate issues can accompany each for more detailed discussion)

External infrastructure required for completion

Assistance required from infrastructure team

Once built we can

More tasks can be added to this issue as they are discovered or are defined in more detail.

gnott commented 4 years ago

Related to https://github.com/elifesciences/decision-letter-parser/issues/43, where the bucket and S3 notifications were connected, from which this workflow will be triggered.

Melissa37 commented 4 years ago

Thanks @gnott for creating this ticket.

Can you indicate which of the items to be ticked off that you cannot do without the internal eLife team?

gnott commented 4 years ago

I believe https://github.com/elifesciences/issues/issues/5194 is the only one, and it will be the highest priority item for eLife Internal - thanks @Melissa37 👍

gnott commented 4 years ago

This is feature complete now. I expect we'll make some improvements over time once people start using the decision letter ingest workflow.