LuteOrg / lute-v3

LUTE = Learning Using Texts: learn languages through reading. Python/Flask.
MIT License
413 stars, 45 forks

Create an API to interact with the Lute backend #297

Open sakolkar opened 6 months ago

sakolkar commented 6 months ago

The Problem. Currently, much of the interaction between the front-end (Flask templates) and the back-end happens through form submissions. To decouple the front-end so that we can swap in another, such as React, the back-end would need to expose an API that the front-end can call.

Acceptance Criteria

Additional context

sakolkar commented 6 months ago

I opened this issue to track some work I'm going to do on adding an API to Lute. This work of mine might not necessarily get merged, but it will hopefully be a first exploration of the effort involved.

jzohrab commented 6 months ago

I like this issue a lot :-) it's the right way to do things. A couple of rough thoughts

Clearer "domain model" vs its usage (flask/web ui, api, command-line jobs)

I haven't pushed forward on this myself because I feel that some things should change in Lute architecturally. For a fictitious but accurate example, say the user hits endpoint "/book/do_something." Currently, this might boil down to the following modules used (generally there are objects involved):

* /book/do_something, defined in lute/book/routes.py
* maybe a call to some kind of "service" thing in lute/book/service.py
* maybe call some model code from lute/book/model.py
* call lute/models/Book.py for persistence

so the code looks like this:

lute/
   - models/          sqlalchemy mappers
   - book/            a "slice of the domain", if that makes sense
      - model.py      domain models
      - service.py    something that handles big groupings/operations
      - routes.py     flask routes
      - forms.py      flask-wtf forms (data structures and validation)
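
As a rough illustration of the layering described above, here is a minimal sketch of a request passing from a route through a service to persistence. Every name in it (BookRecord, BookRepository, BookService, do_something_route) is hypothetical and not taken from Lute's code, and an in-memory dict stands in for the SQLAlchemy layer.

```python
# Hypothetical sketch of the route -> service -> persistence chain.
# None of these names come from Lute; they only illustrate the layers.
from dataclasses import dataclass


@dataclass
class BookRecord:
    """Stands in for a SQLAlchemy-mapped row (the lute/models/ layer)."""
    id: int
    title: str
    archived: bool = False


class BookRepository:
    """Persistence layer, backed by a dict instead of a database."""

    def __init__(self):
        self._rows = {}

    def save(self, record):
        self._rows[record.id] = record

    def find(self, book_id):
        return self._rows.get(book_id)


class BookService:
    """Service layer: one operation that spans lookup, change, and save."""

    def __init__(self, repo):
        self.repo = repo

    def archive(self, book_id):
        record = self.repo.find(book_id)
        if record is None:
            raise LookupError(f"no book with id {book_id}")
        record.archived = True
        self.repo.save(record)
        return record


def do_something_route(service, book_id):
    """Route layer: turns a request into a service call and a plain response."""
    record = service.archive(book_id)
    return {"id": record.id, "title": record.title, "archived": record.archived}
```

An API route and a form-based route could both call the same service method, which is the decoupling this issue is after.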

So, even though I've kept things reasonably disciplined with the code, there's still a fair amount of knowledge needed when figuring out exactly what to tweak where.

Originally, I was thinking that there should be a central "domain model" for Lute somewhere, but in the end that would only move all of the flask stuff somewhere else while keeping the existing structure:

lute/
   - book/
   - term/
   - ...etc
   - flask/    All of the flask endpoints

That could potentially be restructured further:

lute/
   - domain/      (poor name, but whatever)
      - book/
      - term/
      - ...
   - flask/       (for the web app routes etc)
   - api/

... but maybe that's not needed. Perhaps it's as simple as adding an "api.py" file with flask routes, or if that needs to be partitioned further having separate api.py files in each separate module.
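
A minimal sketch of what such an "api.py" file could look like for the book module, assuming a plain Flask blueprint that returns JSON. The "/api/book" prefix, the route names, and the in-memory _BOOKS list are assumptions for illustration only; real code would call into the existing book model or service code instead.

```python
# Hypothetical lute/book/api.py: JSON routes kept apart from the form routes.
from flask import Blueprint, Flask, jsonify

# Placeholder data; a real version would query the domain layer.
_BOOKS = [{"id": 1, "title": "Example book", "language_id": 1}]

bp = Blueprint("book_api", __name__, url_prefix="/api/book")


@bp.route("/", methods=["GET"])
def list_books():
    """Return all books as a JSON array."""
    return jsonify(_BOOKS)


@bp.route("/<int:book_id>", methods=["GET"])
def get_book(book_id):
    """Return one book, or a JSON 404 if the id is unknown."""
    for book in _BOOKS:
        if book["id"] == book_id:
            return jsonify(book)
    return jsonify({"error": "book not found"}), 404


def create_app():
    """Tiny app factory so the blueprint can be exercised on its own."""
    app = Flask(__name__)
    app.register_blueprint(bp)
    return app
```

Keeping the JSON routes in their own blueprint makes it easy to see which endpoints are part of the API and which belong to the template-driven UI.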

domain/api versioning

Need to sort out the versioning of the API for external clients. There are a few different ways to handle this, explained in this article: https://www.freecodecamp.org/news/how-to-version-a-rest-api/ -- I'm not sure which way is best. Googling "flask rest api versioning" turns up some tips as well, such as a library called "flask-rebar" (I think).
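
One common option is to put the version in the URL path. The sketch below shows that approach with plain Flask blueprints; it is only one of the possible schemes, and the "/api/v1" and "/api/v2" prefixes and the response shapes are assumptions, not anything decided for Lute.

```python
# Sketch of URL-path versioning: one blueprint per API version.
from flask import Blueprint, Flask, jsonify

v1 = Blueprint("api_v1", __name__, url_prefix="/api/v1")
v2 = Blueprint("api_v2", __name__, url_prefix="/api/v2")


@v1.route("/books")
def books_v1():
    """v1 returns a bare JSON array."""
    return jsonify([{"id": 1, "title": "Example book"}])


@v2.route("/books")
def books_v2():
    """v2 changes the response shape, so it lives under a new prefix."""
    books = [{"id": 1, "title": "Example book"}]
    return jsonify({"books": books, "count": len(books)})


def create_app():
    app = Flask(__name__)
    # Both versions stay available, so existing v1 clients keep working.
    app.register_blueprint(v1)
    app.register_blueprint(v2)
    return app
```

With this layout, a breaking change means adding a new blueprint rather than editing the old one in place.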

API testing

Perhaps those would be in a separate tests/api section.

patrickayoup commented 6 months ago

I would also appreciate this.

So far I have two use cases:

  1. I have thought about deploying Lute onto a small local machine like a Raspberry Pi, for the sole purpose of being able to read from my iPad. I've hesitated to try because I'm not sure the current UI is well optimized for touch use. Having a clear API would allow the creation of a mobile/tablet-optimized UI.

  2. I have some scripts which I wrote to extract content to bring into Lute. The last step would be to automate ingestion into Lute. If this is just a simple form submission, I could probably submit my form data to that one endpoint without needing a full API redesign. @jzohrab would this form submission to create a text be simple enough in the current state of Lute, if so what would be the URL for that endpoint?

jzohrab commented 6 months ago

would this form submission to create a text be simple enough in the current state of Lute, if so what would be the URL for that endpoint?

I guess the endpoint would be a post to /book/new (or similar), but flask has a csrf token for security (built into Flask-WTF, https://flask-wtf.readthedocs.io/en/0.15.x/csrf/). I haven't tried posting via ajax but if you check the traffic on a regular post you should be able to recreate it.

sakolkar commented 6 months ago

@jzohrab I think restructuring the project should wait until after this is complete. I generally find you want to restructure while not many other things are in progress; otherwise any change (even a minor one) hits a large merge conflict, because files have been moved in the target branch. And if the target branch also has large code changes like this issue would introduce, resolving the merge conflict is a headache. TL;DR: I think it's best to limit restructure branches to just restructuring, with no other code logic changes.

For the present, I would want to just add an api.py to each blueprint (books, etc.). This does mean that, in the interim, it takes more effort to figure out what all the API parts are, but that can be alleviated by your next item.

Flask-Rebar. Honestly, this is exactly the kind of library I was thinking of. Over in FastAPI (an alternative to Flask), some of this is native to the framework, and it's a really nice way to properly document the API and quickly identify whether a change will break the API, in which case you should bump the version.

I think Flask-Rebar will work really well.

jzohrab commented 6 months ago

Sounds good. Yes, restructuring in-flight would be a huge mess, and as you can tell my thoughts are half-baked on this anyway. I have a vague feeling that something is slightly off, but it's not enough to be concerned about.

As long as the api routes are clearly marked in the route files and included in the blueprints, and the APIs themselves have some end-to-end tests, it should be completely fine.

patrickayoup commented 6 months ago

would this form submission to create a text be simple enough in the current state of Lute, if so what would be the URL for that endpoint?

I guess the endpoint would be a post to /book/new (or similar), but flask has a csrf token for security (built into Flask-WTF, https://flask-wtf.readthedocs.io/en/0.15.x/csrf/). I haven't tried posting via ajax but if you check the traffic on a regular post you should be able to recreate it.

Thanks, this works just fine. In case you want a snippet for any sort of cookbook recipes, here's some working Python code to create a book.

import json

import requests
from bs4 import BeautifulSoup

_LUTE_HOST = 'localhost'
_LUTE_PORT = 5002
_ENDPOINT = '/book/new'

# _ENDPOINT already starts with a slash, so don't add another one here.
_URL = f'http://{_LUTE_HOST}:{_LUTE_PORT}{_ENDPOINT}'

# First, start a session and make a GET request to /book/new
# so we can extract the CSRF token.
client = requests.Session()
resp = client.get(_URL)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, features='html.parser')
csrf_token = soup.select_one('#csrf_token').get('value')

# Open the uploads in binary mode and close them once the request is done.
with open('text.txt', 'rb') as textfile, open('audio.mp3', 'rb') as audiofile:
    # Prepare the request body as multipart form fields.
    data = [
        ("csrf_token", (None, csrf_token)),
        # I only use one language, so it's easy, but you'll need to find the language id if you have multiple.
        ("language_id", (None, '1')),
        ("title", (None, "This is the title")),
        # ("text", (None, 'Use this field if you want to provide text')),
        ("textfile", ('text.txt', textfile, 'text/plain')),
        ("max_page_tokens", (None, '250')),
        ("source_uri", (None, "http://source/of/your/content")),
        ("audiofile", ('audio.mp3', audiofile, 'audio/mpeg')),
        # Tags should be a list of dictionaries consisting of a "value" key.
        ("book_tags", (None, json.dumps([{"value": "testing"}]))),
    ]

    # Make the request; raise if the server answers with an HTTP error status.
    resp = client.post(_URL, files=data)
    resp.raise_for_status()

sakolkar commented 6 months ago

After trying out flask-rebar a bit, I decided not to use it. It resulted in somewhat messier code. I used flask-smorest instead, which is also based on marshmallow and is by the same authors as that package.

These are the endpoints I've penciled in; I'll flesh out their logic further. (image: the proposed endpoints)

I decided to condense various routes into a single blueprint for the API:

  1. The skip backup route would become a PATCH on a backup that changes its "status" (or similar) to skipped.
  2. The Bing routes move into the Language API endpoints, where an image can be created by term text. It would accept either a URL or a manual image. Bing searching needn't be in the API and can be moved to front-end logic.
  3. Several routes from "read", "useraudio", and "userimages" move into the Book API, such as reading pages or setting bookmarks and the current location in the audio.
  4. Book Tags are surfaced to match Term Tags.
  5. The Language API pulls in routes from read and terms where those endpoints were keyed by the "term text".
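
To make the first item concrete, here is a rough sketch of a PATCH that changes a backup's status, written with plain Flask to keep it short. The "/api/backups/<id>" path, the "pending" and "skipped" values, and the _BACKUPS store are assumptions for illustration; the actual endpoint design may differ.

```python
# Hypothetical sketch: skipping a backup via PATCH instead of a dedicated route.
from flask import Flask, jsonify, request

app = Flask(__name__)

_BACKUPS = {1: {"id": 1, "status": "pending"}}  # placeholder data
_ALLOWED_STATUSES = {"pending", "skipped"}


@app.route("/api/backups/<int:backup_id>", methods=["PATCH"])
def patch_backup(backup_id):
    """Update only the fields sent by the client (here, just 'status')."""
    backup = _BACKUPS.get(backup_id)
    if backup is None:
        return jsonify({"error": "backup not found"}), 404

    payload = request.get_json(silent=True) or {}
    status = payload.get("status")
    if status not in _ALLOWED_STATUSES:
        return jsonify({"error": "invalid status"}), 400

    backup["status"] = status
    return jsonify(backup)
```

A client would then skip a backup by sending a PATCH with the JSON body {"status": "skipped"}.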

If anyone gets a chance to review these endpoints, let me know your thoughts!