EgregiousPackaging / Egregious

A web application to help track excess packaging/complain to companies about it
1 stars 0 forks source link

Database/barcode storage #5

Open Rthe1st opened 2 years ago

Rthe1st commented 2 years ago

Decide on how/where to store the barcodes we receive. SQLite is a possibility to avoid needing to standup/pay for a standalone db (and making it easy to migrate to another SQL DB in future)

Also decide on what data we want to store at first.

jonlinnell commented 2 years ago

we could use SQLite, where would you store the file?

We should also think about how we're going to access the data, and maybe come up with an entity relationship model or something fancy.

On the no-SQL front, I've built quite a few things with AWS DynamoDB over the last few years, and it's capable of accommodating some pretty extreme use cases. I'd vote for that at this stage tbh.

Rthe1st commented 2 years ago

Storing the file, anywhere on our VPS instance, probably out side the project folder and persist it using a volume. In terms of what the data should look like, I'm thinking very simple right now:

barcodes
----
barcode: TEXT
reported_at: TEXT -- holding dates as IS0-8601

Turns out sqlite doesn't have a proper date time, but has functions that work with ISO-8601 formatted string for working with dates/times.

https://www.sqlite.org/datatype3.html#date_and_time_datatype

I'm thinking we round reported_at to the nearest day, because it's not particularly useful to know the exact times when barcodes are reported (any pattern's we're interested on at least multi day timescales), and the less data on reporters we can get away with, the better.

I'm totally open to not-sqllite though, I've not used it or DynamoDB before so no preference. My only reservation on dynamo would be, does that mean our app need to move to AWS as well? Probably to avoid ingress/egress fees. Maybe we should of gone all in on AWS at the start, the idea of setting up accounts for us all now isn't sparking joy lol

Another option is postgres (or other DBs) on OVH https://www.ovh.co.uk/cloud-databases/

OVH's biggest drawback is cost, sqlite is free, and dynamo I assume has a free tier.

My choice would be which ever we can set up quickest.

I'll do some research on the OVH db side.

Later, I could see use wanting to store extracted information about barcodes and more data on reports. Perhaps we work towards something like this:

reports                                  barcodes

report_id INTEGER PRIMARY KEY    ┌──────►barcode_id INTEGER PRIMARY KEY
reported_at: TEXT                │       barcode    TEXT
barcode_id INTEGER ──────────────┘       company    TEXT
complaint_format                         format     TEXT
complaint_link TEXT -- for tweets etc
image BLOB
                                             ▼

(made with https://asciiflow.com/#/)

Rthe1st commented 2 years ago

We've got a sqllite db, using volumes more simple then proposed above, only one table and manufacturer info is not stored in it (still hard coded)