Closed: jag3773 closed this 5 years ago
@jag3773 I think now's the time to start looking into using a queue for processing things in the api. I don't have a good spot to put all of this processing without slowing down and possibly breaking things. I think we need to adjust the webhook to submit requests to a queue and then have a worker process items in the queue.
This is what I propose:
We could limp along with what we have by placing this processing in the webhook, but we're at a crossroads, or close to it. We're bound to add more data-intensive processing, and I'm concerned about timeouts. Also, because of the nature of the webhook, I can't use my usual pattern of picking up where it left off after a timeout.
On the bright side this is the best place in the api to begin implementing a queue pattern. I won't have to touch the tS or uW api code.
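As a rough illustration of the webhook-plus-worker split described above, here is a minimal in-process sketch. It uses Python's standard `queue` and `threading` modules as a stand-in for whatever queue service the API would really use; the function names and the `repo` payload field are made up for the example.

```python
import queue
import threading


def webhook(payload, jobs):
    """Enqueue the request and return right away instead of processing inline."""
    jobs.put(payload)
    return {"status": "queued"}


def worker(jobs, handle, results):
    """Process queued items one at a time until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            jobs.task_done()
            break
        try:
            results.append(handle(item))
        finally:
            # Mark the item done even if handle() raised.
            jobs.task_done()


def run_demo():
    jobs = queue.Queue()
    results = []
    # Hypothetical handler: stands in for the slow, data-intensive processing.
    thread = threading.Thread(
        target=worker, args=(jobs, lambda p: p["repo"].upper(), results)
    )
    thread.start()
    webhook({"repo": "en_ulb"}, jobs)
    webhook({"repo": "en_udb"}, jobs)
    jobs.put(None)  # tell the worker to stop
    thread.join()
    return results
```

The point of the split is that `webhook` never does the slow work itself, so its response time no longer depends on how long the processing takes.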
Let's think about creating a new lambda that will be triggered when the webhook uploads something. This lambda will generate the usfm2 and upload it. The webhook will inject the usfm2 links into the catalog record. The signing and publishing will just fail and restart until this new lambda generates the usfm2 files.
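A sketch of what such an upload-triggered lambda might look like, assuming the trigger is an S3-style event notification (the thread does not say which storage service is involved). The `usfm2/` output prefix, the `.usfm` suffix check, and the injected `download`, `convert`, and `upload` callables are all hypothetical, chosen so the example runs without any cloud SDK.

```python
OUT_PREFIX = "usfm2/"  # hypothetical location for generated files


def make_handler(download, convert, upload):
    """Build a handler from injected storage and conversion callables."""

    def handler(event, context=None):
        written = []
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Real S3 event keys are URL-encoded; decoding is omitted here.
            key = record["s3"]["object"]["key"]
            # Skip non-USFM files and our own output, so that an upload
            # made by this handler does not trigger another conversion.
            if key.startswith(OUT_PREFIX) or not key.endswith(".usfm"):
                continue
            out_key = OUT_PREFIX + key
            upload(bucket, out_key, convert(download(bucket, key)))
            written.append(out_key)
        return written

    return handler


def s3_event(bucket, key):
    """Build the minimal event shape the handler reads."""
    return {"Records": [{"s3": {"bucket": {"name": bucket}, "object": {"key": key}}}]}
```

With this shape the webhook only has to upload the source file; signing and publishing can keep retrying until the generated file appears under the output prefix.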
Just noting that the usfm3 to usfm2 converter is completed with tests. I've removed any changes made to the existing lambda.
I'll wait to begin constructing a new lambda until we have our meeting regarding the improvements to the api.
@jag3773 here's an update on this.
I've set up a repo with code for creating a REST api at https://github.com/unfoldingWord-dev/tx. This includes documentation generation, a pattern for adding new RESTful services, and configuration for tests.
@ethantkoenig and I have come to the conclusion that it would be easiest to simply add the usfm3->2 code to the existing tx pipeline and configure the Door43 API pipeline to monitor an event in the Event Queue. I'd need to do that last step anyway.
That leaves us with a pretty repo that we're not actually going to use. I am quite pleased with it though so perhaps we'll be able to use it later or for something else.
All that said, @jag3773 are you pleased with the direction this is going? Should @ethantkoenig proceed with adding the converter to the existing tx pipeline and I configure coordination with the event queue?
Yes, that sounds fine, @neutrinog and @ethantkoenig.
Here is some code that converts USFM3 to USFM2. At this point I don't think it covers all the new features in usfm3 but it's a starting point. https://github.com/unfoldingWord-dev/d43-catalog/blob/develop/libraries/tools/usfm_utils.py#L338
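For readers who don't want to open the link, here is a much smaller sketch of the general idea. It is not the code in `usfm_utils.py`. It only handles two USFM 3 additions, word-level attributes on `\w ...\w*` and `-s`/`-e` milestone markers, and it assumes the desired USFM2 output is the bare word with the `\w` wrapper dropped.

```python
import re

# \w word|lemma="..." strong="..."\w*  ->  word
WORD = re.compile(r'\\w\s+([^|\\]*?)\s*(?:\|[^\\]*)?\\w\*')

# Milestones such as \k-s | x-tw="..."\* and \k-e\* are removed entirely.
MILESTONE = re.compile(r'\\[a-z0-9]+-[se]\b[^\\]*\\\*')


def strip_usfm3(text):
    """Remove milestones, then unwrap \\w markers and drop their attributes."""
    text = MILESTONE.sub('', text)
    return WORD.sub(r'\1', text)
```

For example, `strip_usfm3(r'\v 1 \w In|lemma="in"\w* \w the\w* beginning')` gives `\v 1 In the beginning`. Other markers, including `\wj`, are left untouched because the pattern requires whitespace directly after `\w`.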
@jag3773 commented on Fri Aug 11 2017
For resources in our catalog that move to USFM 3, we want to add a converter to the API pipeline that converts them to USFM2 and adds the result as another format in the API. The intended result is two entries in the formats array: one for the original USFM3 version and one for the stripped-down USFM2 version.
The operation should be triggered by the `formats` key in the manifest being set to `text/usfm3`. In our ecosystem `text/usfm` equates to USFM2.

In addition, the USFM2 text can be used for the tS/uW backwards compatibility generators.
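To make the intended result concrete, here is a small sketch of the trigger rule and the resulting formats array. The entry fields (`format`, `url`) and the function name are assumptions for illustration; the real catalog entries may carry different or additional fields.

```python
USFM3 = "text/usfm3"
USFM2 = "text/usfm"  # in our ecosystem this means USFM2


def build_formats(manifest_format, source_url, usfm2_url=None):
    """Return the formats array for a resource.

    A USFM2 entry is added only when the manifest declares text/usfm3.
    """
    formats = [{"format": manifest_format, "url": source_url}]
    if manifest_format == USFM3:
        if usfm2_url is None:
            raise ValueError("USFM3 resources need a generated USFM2 url")
        formats.append({"format": USFM2, "url": usfm2_url})
    return formats
```

A resource already declared as `text/usfm` passes through with a single entry, so existing USFM2 resources are unaffected.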