ncfrey / rheed-viz

Database, dimensionality reduction, and visualization dashboard for RHEED data

custom data model backend #23

Open chris-price19 opened 3 years ago

chris-price19 commented 3 years ago

Migrate from the compchem database to a data model specific to the RHEED analysis application

DB plan

table 1 = sample columns: {sample_id, composition_string, metadata}

table 2 = rheed columns: {sample_id, timeindex, path, metadata} notes: storing the path to the image file (with the image itself kept on some filesystem) is best practice vs. storing the image as a bytestring in the DB

table 3 = pymatgen columns: {composition_string}
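A minimal sketch of the three tables above, using Python's built-in `sqlite3` as a stand-in for the real database. The column names come from the plan; the column types, the primary keys, and the example file path are assumptions made for illustration only.

```python
import sqlite3


def create_schema(conn):
    """Create the three planned tables (types and keys are assumed)."""
    conn.executescript(
        """
        CREATE TABLE sample (
            sample_id          INTEGER PRIMARY KEY,
            composition_string TEXT NOT NULL,
            metadata           TEXT            -- JSON stored as text
        );
        CREATE TABLE rheed (
            sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
            timeindex INTEGER NOT NULL,
            path      TEXT NOT NULL,           -- path to image on a filesystem
            metadata  TEXT,
            PRIMARY KEY (sample_id, timeindex) -- assumed key
        );
        CREATE TABLE pymatgen (
            composition_string TEXT NOT NULL
        );
        """
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Placeholder rows; the path is hypothetical.
conn.execute("INSERT INTO sample VALUES (1, 'SrTiO3', '{}')")
conn.execute(
    "INSERT INTO rheed VALUES (1, 0, '/data/rheed/sample1/frame_0000.png', '{}')"
)

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
frame_count = conn.execute("SELECT COUNT(*) FROM rheed").fetchone()[0]
```

Only the image path is stored in `rheed`, so the rows stay small and the images can live wherever the filesystem is mounted.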

chris-price19 commented 3 years ago

metadata field is generally json / dict - however this is extremely inefficient to query on in a relational DB. any ideas on a different structure which enables time sensitive metadata?

explicit metadata table works but requires fixed schema per data stream and new tables for each data stream - reduces generalizability of queries

joining exp. data and materials project / pymatgen - composition string is probably the easiest join key. it will return many pymatgen structures, but those are easier to filter by formation energy or spacegroup after the join, since these properties are not known in experiment ahead of time
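The join-then-filter idea could look roughly like the sketch below, again with `sqlite3` standing in for the real database. The `structure_id`, `formation_energy`, and `spacegroup` columns on the pymatgen table are hypothetical (the plan only lists `composition_string`), and the inserted values are placeholders, not real Materials Project data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sample (
        sample_id          INTEGER PRIMARY KEY,
        composition_string TEXT
    );
    -- Extra columns below are assumed for illustration.
    CREATE TABLE pymatgen (
        structure_id       TEXT,
        composition_string TEXT,
        formation_energy   REAL,
        spacegroup         TEXT
    );
    """
)
conn.execute("INSERT INTO sample VALUES (1, 'SrTiO3')")
conn.executemany(
    "INSERT INTO pymatgen VALUES (?, ?, ?, ?)",
    [
        # Placeholder values only.
        ("struct-a", "SrTiO3", -1.0, "Pm-3m"),
        ("struct-b", "SrTiO3", -0.5, "I4/mcm"),
        ("struct-c", "TiO2", -2.0, "P4_2/mnm"),
    ],
)


def candidates(conn, sample_id, max_formation_energy):
    """Join a sample to pymatgen rows by composition, then filter by energy."""
    rows = conn.execute(
        """
        SELECT p.structure_id, p.spacegroup, p.formation_energy
        FROM sample AS s
        JOIN pymatgen AS p
          ON p.composition_string = s.composition_string
        WHERE s.sample_id = ?
          AND p.formation_energy <= ?
        ORDER BY p.formation_energy
        """,
        (sample_id, max_formation_energy),
    )
    return rows.fetchall()


all_matches = candidates(conn, 1, 0.0)   # every structure with that composition
filtered = candidates(conn, 1, -0.75)    # narrowed after the join
```

The join alone returns every structure sharing the composition string; the energy (or spacegroup) cut is applied afterwards, which matches the case where those properties are unknown at experiment time.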

chan-w commented 3 years ago

> any ideas on a different structure which enables time sensitive metadata?

A table structure like this could move the metadata queries out of the frontend and into postgres while allowing varying schemas for each data stream:

| Column Name | Type |
| -- | -- |
| table_2_primary_key | id |
| key | VARCHAR |
| value | VARCHAR |
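A rough sketch of how such a key/value table might be queried, with `sqlite3` standing in for postgres. The `rheed_id` surrogate key, the `rheed_metadata` table name, the index, and the `temperature_C` / `operator` keys are all hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE rheed (
        rheed_id  INTEGER PRIMARY KEY,   -- assumed surrogate key for table 2
        sample_id INTEGER,
        timeindex INTEGER,
        path      TEXT
    );
    CREATE TABLE rheed_metadata (
        rheed_id INTEGER NOT NULL REFERENCES rheed(rheed_id),
        "key"    TEXT NOT NULL,
        "value"  TEXT NOT NULL
    );
    CREATE INDEX idx_rheed_metadata_key_value
        ON rheed_metadata ("key", "value");
    """
)
conn.executemany(
    "INSERT INTO rheed VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, "/data/rheed/sample1/frame_0000.png"),  # hypothetical paths
        (2, 1, 1, "/data/rheed/sample1/frame_0001.png"),
    ],
)
conn.executemany(
    "INSERT INTO rheed_metadata VALUES (?, ?, ?)",
    [
        # Each frame can carry a different set of keys.
        (1, "temperature_C", "650"),
        (1, "operator", "placeholder"),
        (2, "temperature_C", "700"),
    ],
)


def frames_where(conn, key, min_value):
    """Return timeindex values whose metadata `key` is >= min_value."""
    rows = conn.execute(
        """
        SELECT r.timeindex
        FROM rheed AS r
        JOIN rheed_metadata AS m ON m.rheed_id = r.rheed_id
        WHERE m."key" = ?
          AND CAST(m."value" AS REAL) >= ?
        ORDER BY r.timeindex
        """,
        (key, min_value),
    )
    return [row[0] for row in rows]


hot_frames = frames_where(conn, "temperature_C", 700)
warm_frames = frames_where(conn, "temperature_C", 600)
```

Because metadata is attached per rheed row, it can change with `timeindex` without a fixed schema per data stream. One trade-off of VARCHAR values is that numeric comparisons need a cast, as in the query above.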