globalbibletools / gbt

https://interlinear.globalbibletools.com

api/db design for languages #18

Closed arrocke closed 1 year ago

arrocke commented 1 year ago

acceptance criteria

Pertempto commented 1 year ago

I'm in the initial research stage of this... trying to find useful information and tools, not yet digging into actually designing our specific API.

Pertempto commented 1 year ago

Articles

This list will be updated...

Pertempto commented 1 year ago

API Specification Format Options

OpenAPI

Good comparison article

More points to be added, as I learn more...

Pertempto commented 1 year ago

@arrocke I haven't dug in very deeply yet, but I think OpenAPI with the Swagger tools is the route we'll want to go. It's the one with the largest community and the most online docs. AsyncAPI looks neat, and their website really makes me think they are developer-focused, but I don't think it would be useful unless we switched to an event-driven architecture.

arrocke commented 1 year ago

I'm really intrigued by OpenAPI and the Swagger tools. If we could set up a process where APIs are documented using OpenAPI, and types are then generated for TypeScript and eventually Dart, we could be confident that all of our systems conform to the same API design. We would also get the benefit that APIs are documented in advance, so it will be easier for people to contribute.
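To make that workflow a bit more concrete, here is a minimal sketch of what the TypeScript side could look like. The `LanguageDto` interface is hand-written here as a stand-in for generated output, and its `code` and `name` fields are assumptions rather than a settled schema. The `isLanguageDto` guard is likewise hypothetical; it only shows how a client could check a parsed response against the shared type.

```typescript
// Hypothetical shape. In the proposed workflow this would be generated
// from the OpenAPI document rather than written by hand.
interface LanguageDto {
  code: string; // language code, e.g. "en" (assumed field)
  name: string; // name of the language, written in that language (assumed field)
}

// Runtime guard so a client can check parsed JSON against the type.
function isLanguageDto(value: unknown): value is LanguageDto {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.code === "string" && typeof record.name === "string";
}

// Example: validate a response body before using it.
const parsedBody: unknown = JSON.parse('{"code":"es","name":"Español"}');
if (isLanguageDto(parsedBody)) {
  console.log(`${parsedBody.code}: ${parsedBody.name}`);
}
```

Because the server and the clients would share one generated type, a field rename in the spec would surface as a compile error instead of a runtime surprise.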

arrocke commented 1 year ago

In terms of API design, here are a few principles that I would add to the article you posted and that we should adhere to:

Pertempto commented 1 year ago

That makes a lot of sense, I agree with your points. By "Simulated DELETE", you mean soft delete, correct?

arrocke commented 1 year ago

Yep. I wasn't sure what it was called
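For anyone unfamiliar with the term, a soft delete marks a record as deleted instead of removing it. Here is a minimal in-memory sketch, assuming a hypothetical `deletedAt` field; the names `SoftDeletableLanguage`, `softDelete`, and `onlyActive` are illustrative and not part of any agreed design.

```typescript
// Hypothetical record shape: deletedAt is null while the record is active.
interface SoftDeletableLanguage {
  code: string;
  name: string;
  deletedAt: Date | null;
}

// Mark a record as deleted without removing it from storage.
function softDelete(
  record: SoftDeletableLanguage,
  now: Date = new Date()
): SoftDeletableLanguage {
  return { ...record, deletedAt: now };
}

// What a list endpoint would return: only records not marked as deleted.
function onlyActive(records: SoftDeletableLanguage[]): SoftDeletableLanguage[] {
  return records.filter((record) => record.deletedAt === null);
}

const sampleRecords: SoftDeletableLanguage[] = [
  { code: "en", name: "English", deletedAt: null },
  { code: "es", name: "Español", deletedAt: null },
];
const afterDelete = sampleRecords.map((record) =>
  record.code === "es" ? softDelete(record) : record
);
console.log(onlyActive(afterDelete).map((record) => record.code));
```

The deleted row is still in `afterDelete`, so it could be restored or audited later; it is just hidden from normal reads.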

Pertempto commented 1 year ago

So to start out simple, a language will only have three fields?

Some of these questions aren't really relevant in terms of API design, but it doesn't hurt to start the conversation so we all understand each other better.

Pertempto commented 1 year ago

Do we really need to have a numeric primary key for languages? Why not just use the language code as the primary key? It would make the API access make a lot of sense, for example: GET /api/languages/en
As long as we decide on a standardized language code system like ISO 639-1, all the codes should be unique. From what I remember Sheldon saying, it didn't sound like the current system follows a standardized language code system.
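As a rough sketch of what code-based access could look like, the helper below pulls the language code out of a path like the one above. `languageCodeFromPath` is a hypothetical name, and the pattern only checks the two-letter shape of an ISO 639-1 code, not whether the code is actually assigned.

```typescript
// ISO 639-1 codes are two letters. This checks shape only, not whether
// the code is actually assigned in the standard.
const ISO_639_1_SHAPE = /^[a-z]{2}$/;

// Extract the language code from a path such as "/api/languages/en".
// Returns null when the path or the code does not match.
function languageCodeFromPath(path: string): string | null {
  const match = /^\/api\/languages\/([^/]+)\/?$/.exec(path);
  if (match === null) return null;
  const code = (match[1] ?? "").toLowerCase();
  return ISO_639_1_SHAPE.test(code) ? code : null;
}

console.log(languageCodeFromPath("/api/languages/en"));
console.log(languageCodeFromPath("/api/languages/english"));
```

A real implementation would probably validate against the actual list of codes, and region-qualified codes such as en-us would need a looser pattern.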

Pertempto commented 1 year ago

@arrocke I found swagger-ui-react, and I thought it would be a great way to have easy access to our API documentation. However, when I tried to add the example code in ApiDocsView, I got four errors. They seem to be CSS or dependency related. Could you try out my 18-sketch-out-api branch? I'm not sure if the errors are caused by something in Swagger UI, or if they have to do with our nx/nextjs setup.

arrocke commented 1 year ago

It looks like the nx build isn't set up to deal with inline svg data urls. I tried googling around about how to fix that in webpack, and couldn't find much.

If all we want is an easy way to view this, there is an easier solution. There is a VS Code extension called Swagger Viewer that can do this for us. Of course we should eventually publish this for contributors, but when we do that, it should be a separate package in NX. We don't need to deploy the api docs with the main app.

What system of codes will we be using?

ISO 639-1 looks great. We aren't beholden to it, but it's a great standard to set.

Do we really need to have a numeric primary key for languages?

The only concern for me is if we ever want to change the language code. If that is used as a foreign key, that would be messy. So my thought is maybe we use an ID field in the database, but never expose it to the API and use the code for lookups instead. That way you could change the code and everything would still be connected.

Maybe we should talk to Andrew about whether that is necessary. If not, then we should just use the code. The only reason I can think of that this would be relevant is if you want to make the code more specific, like going from en to en-us.

Probably not in English, but in that language?

Yes, the name of the language should be in that language, because that is what will be shown in the UI.

Pertempto commented 1 year ago

The only concern for me is if we ever want to change the language code.

If we used a standardized language code system like ISO 639-1 we shouldn't ever need to change language codes. I think using an already established standard is going to make it nicer to work with any tools that might use the same language codes. Making up our own system seems like a bad idea but maybe I'm missing something.

arrocke commented 1 year ago

I totally agree on using a standard language code, but what I'm anticipating is wanting to change to a different code from the same standard. The only time you would want to do that is if you need to go from something like English (en) to US English (en-us) or Australian English (en-au). I think that is unlikely, so for the sake of simplicity, I'm OK with making the code the primary key.

Pertempto commented 1 year ago

Yeah I get your point about country-specific codes now. We can totally do numeric ids underneath in our db and then expose it on the API by language code, it's not really going to make the database query any more complicated.

Pertempto commented 1 year ago

So my thought is maybe we use an ID field in the database, but never expose it to the API and use the code for lookups instead.

I think this is the best course of action
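Here is a minimal in-memory sketch of that approach, just to show why a code change stays cheap. Everything in it is an assumption for illustration: `LanguageStore`, `LanguageRow`, and the method names are hypothetical, and a real version would sit on top of the database.

```typescript
// Internal row: the numeric id exists only inside the store (assumed layout).
interface LanguageRow {
  id: number;
  code: string;
  name: string;
}

class LanguageStore {
  private rows: LanguageRow[] = [];
  private nextId = 1;

  create(code: string, name: string): void {
    if (this.findRow(code)) throw new Error(`duplicate code: ${code}`);
    this.rows.push({ id: this.nextId++, code, name });
  }

  // API-facing lookup: keyed by code, and the result never includes the id.
  getByCode(code: string): { code: string; name: string } | undefined {
    const row = this.findRow(code);
    return row ? { code: row.code, name: row.name } : undefined;
  }

  // Internal helper: other tables would reference this id as a foreign key.
  idForCode(code: string): number | undefined {
    return this.findRow(code)?.id;
  }

  // Changing the code touches one row; ids, and anything keyed on them, stay put.
  changeCode(oldCode: string, newCode: string): void {
    const row = this.findRow(oldCode);
    if (!row) throw new Error(`unknown code: ${oldCode}`);
    if (this.findRow(newCode)) throw new Error(`duplicate code: ${newCode}`);
    row.code = newCode;
  }

  private findRow(code: string): LanguageRow | undefined {
    return this.rows.find((row) => row.code === code);
  }
}

const store = new LanguageStore();
store.create("en", "English");
const idBefore = store.idForCode("en");
store.changeCode("en", "en-us");
// The internal id is the same before and after the code change.
console.log(idBefore === store.idForCode("en-us"));
```

The API only ever sees `code` and `name`, so clients never depend on the numeric id.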

BethedenMinistries commented 1 year ago

If I understand the discussion above, I don't foresee us ever having to change the language code to make it more specific. Every language will begin with as much specificity as possible. So for our interlinear in English, we would have it as en-us. The Spanish one would be specified to be Spain Spanish, etc. From there we can make copies of those gloss data sets and people can localize them to other kinds of Spanish or English, etc.