Koeng101 / dnadesign

A Go package for designing DNA.

Uniprot API #20

Closed Koeng101 closed 10 months ago

Koeng101 commented 10 months ago

I would like to create a Uniprot API as a first step toward the idea outlined in #15. Uniprot is one of the most useful biological databases out there, so this will be a good exercise in building LLM-enabled biological databases.

The absolute key is going to be a reliable deployment environment, so that I can walk away and it keeps working. In that vein, I think the following are key steps:

  1. Make a SQLite schema for Uniprot. For now, I'm thinking of just an entryID column plus a JSONB column.
  2. Make a build process that produces the SQLite database.
  3. Figure out the devops needed to run the build process reliably every 8 weeks and upload the result to dnadesign.bio/downloads/uniprot_sprot.db.
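Steps 1 and 2 could be sketched roughly as follows. This is a minimal sketch, not a settled design: the table name `uniprot`, the column names, the trimmed `Entry` struct, and the helper names are all assumptions made for illustration. It also assumes a SQLite build new enough to provide the `jsonb()` function (3.45 or later) and leaves the choice of SQLite driver to the caller.

```go
package main

import (
	"database/sql"
	"encoding/json"
	"fmt"
)

// schemaSQL is one possible reading of "JSONB + entryID": the entry ID is the
// primary key and the whole record is stored as a single JSONB blob.
// Table and column names are assumptions.
const schemaSQL = `CREATE TABLE IF NOT EXISTS uniprot (
	entry_id TEXT PRIMARY KEY,
	data     BLOB NOT NULL
);`

// insertSQL uses SQLite's jsonb() function (SQLite 3.45+) to convert JSON
// text into the binary JSONB representation at insert time.
const insertSQL = `INSERT OR REPLACE INTO uniprot (entry_id, data) VALUES (?, jsonb(?));`

// Entry is a hypothetical, heavily trimmed stand-in for a parsed Uniprot record.
type Entry struct {
	Accession string `json:"accession"`
	Name      string `json:"name"`
	Sequence  string `json:"sequence"`
}

// entryRow returns the two column values (entry ID, JSON text) for one entry.
func entryRow(e Entry) (string, string, error) {
	if e.Accession == "" {
		return "", "", fmt.Errorf("entry has no accession")
	}
	b, err := json.Marshal(e)
	if err != nil {
		return "", "", err
	}
	return e.Accession, string(b), nil
}

// BuildDatabase creates the table and loads all entries in one transaction.
// The caller opens db with whichever SQLite driver it prefers.
func BuildDatabase(db *sql.DB, entries []Entry) error {
	if _, err := db.Exec(schemaSQL); err != nil {
		return err
	}
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // has no effect once Commit has succeeded

	stmt, err := tx.Prepare(insertSQL)
	if err != nil {
		return err
	}
	defer stmt.Close()

	for _, e := range entries {
		id, data, err := entryRow(e)
		if err != nil {
			return err
		}
		if _, err := stmt.Exec(id, data); err != nil {
			return err
		}
	}
	return tx.Commit()
}

func main() {
	// Placeholder values, not a real Uniprot record.
	id, data, err := entryRow(Entry{Accession: "P00000", Name: "EXAMPLE_PROT", Sequence: "MKV"})
	if err != nil {
		panic(err)
	}
	fmt.Println(id, data)
}
```

Loading everything inside a single transaction keeps the build all-or-nothing, which suits a database file that is rebuilt from scratch and uploaded whole.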

From there, the API launch process would be:

  1. Download or mount uniprot_sprot.db from dnadesign.bio/downloads/uniprot_sprot.db.
  2. Start the Go API.

To make this reliable, I think Kubernetes is going to be the right abstraction layer. I'm going to have to launch quite a few of these APIs, and this is the only way I can think of to front-load the effort of getting the services to work. I.e., these workflows cannot be pets; they need to be cattle.

I'm thinking DigitalOcean Kubernetes for now, until we want to slurp up RefSeq, which will then require a custom server. Going to think more about this.

Koeng101 commented 10 months ago

Closing now to focus on the current PRs. May reopen later.