**Gizmotronn** opened this issue 5 months ago (status: Open)
Great idea -> the method for verifying/assigning users other classifications is that they have to classify all of the data on their planet (or they have the option to, which unlocks a "completion status" concept). Once we have more data for each anomaly type we can reduce that threshold to, say, 65%. This gives us a pipeline for having users classify things that have already been classified by other users, which increases the likelihood that a classification is valid.
...Now we just need to link this in with the feed, somehow.
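A minimal sketch of what that completion/threshold check could look like, assuming supabase-py and the `classifications` table described further down; the project URL/key, the way we fetch a planet's anomaly ids, and the 0.65 threshold are all placeholders:

```python
# Minimal sketch of the completion-status / verification check described above.
# Assumes supabase-py; URL, key and the planet_anomaly_ids input are placeholders,
# and 0.65 is the threshold floated in the comment.
from supabase import create_client

supabase = create_client("https://<project>.supabase.co", "<key>")  # placeholders


def completion_ratio(user_id: str, planet_anomaly_ids: list[int]) -> float:
    """Fraction of a planet's anomalies that this user has classified."""
    if not planet_anomaly_ids:
        return 0.0
    rows = (
        supabase.table("classifications")
        .select("anomaly")
        .eq("author", user_id)
        .in_("anomaly", planet_anomaly_ids)
        .execute()
        .data
    )
    classified = {row["anomaly"] for row in rows}
    return len(classified) / len(planet_anomaly_ids)


def unlocks_other_classifications(user_id: str, planet_anomaly_ids: list[int],
                                  threshold: float = 0.65) -> bool:
    """True once the user has classified enough of their planet's data."""
    return completion_ratio(user_id, planet_anomaly_ids) >= threshold
```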
An update:
The main logic for the cloud data already fills in data automatically, and I briefly (for about 15 seconds) thought about writing a Python snippet anyway so that we can pull the images from the `anomalies.configuration` column into the storage buckets, but I realised there's no point - it just adds extra storage, and I don't think the lag issues (in terms of having the images show up in the Camera Module) are as pronounced as they seem to be in the dev environment. So we should be fine.
I'm going to move on to the surveyor module and come up with an automated response for each anomaly...
So far, this is what I'm thinking for the surveyor module:
create table
public.anomalies (
id bigint generated by default as identity,
content text null,
"ticId" text null,
anomalytype text null,
type text null,
radius double precision null,
mass double precision null,
density double precision null,
gravity double precision null,
"temperatureEq" double precision null,
temperature double precision null,
smaxis double precision null,
orbital_period double precision null,
classification_status text null,
avatar_url text null,
created_at timestamp with time zone not null default now(),
deepnote text null,
lightkurve text null,
configuration jsonb null,
constraint baseplanets_pkey primary key (id)
) tablespace pg_default;
Here's an example of the content for the `configuration` column:
{
"mass": 2.14,
"ticId": "Kepler-69",
"radius": 1.71,
"smaxis": 0.64,
"ticId2": "KOI 172.02",
"density": 2.36,
"gravity": 0.73,
"lightkurve": "https://qwbufbmxkjfaikoloudl.supabase.co/storage/v1/object/public/planetsss/_1710155300825",
"temperature": 325.15,
"orbital_period": 242.47,
"temperature_eq": 548.15
}
Each anomaly represents a planet, which obviously has data like a temperature of 325.15 Kelvin, an orbital period, etc.
Users can make classifications for any missing values, or update existing values. Here's the schema for the `classifications` table:
create table
public.classifications (
id bigint generated by default as identity,
created_at timestamp with time zone not null default now(),
content text null,
author uuid null,
anomaly bigint null,
media json null,
constraint classifications_pkey primary key (id),
constraint classifications_anomaly_fkey foreign key (anomaly) references anomalies (id),
constraint classifications_author_fkey foreign key (author) references profiles (id)
) tablespace pg_default;
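For reference, here's a hypothetical example of what a "fill in a missing value" classification could look like. The `classificationtype` key and storing the numeric value as text in `content` are assumptions on my part, not something the schema above defines:

```python
# Hypothetical shape of a "fill in a missing value" classification.
# The classificationtype field and the value-as-text convention in content are
# assumptions, not columns/conventions defined by the schema above.
from supabase import create_client

supabase = create_client("https://<project>.supabase.co", "<key>")  # placeholders

supabase.table("classifications").insert({
    "author": "00000000-0000-0000-0000-000000000000",  # profiles.id (placeholder uuid)
    "anomaly": 1,                                       # anomalies.id
    "classificationtype": "gravity",                    # assumed field; matches a configuration key
    "content": "0.73",                                   # the suggested value, stored as text
}).execute()
```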
Next, a Python snippet/script that will pull in all classifications where `classificationtype` matches a field in the anomaly's `configuration` (in this case that would be mass, temperature, orbital_period, gravity, etc.). So we should extract the "columns" from the `configuration` field, then look through the `classifications` table for any rows where `classificationtype` is one of those fields. Then it should group all those classifications by `anomaly` & `classificationtype` and average them out (e.g. if there are two classifications mentioning "gravity" for `anomaly == 1`, it should average the two values).
I'm thinking about having some sort of way to "verify" changes, or adding some extra values to the `anomalies.configuration` value, like a "userSuggestions" key or something. But for now just averaging them out and seeing the results locally would be a good start - a rough sketch of that is below.
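A rough local sketch of that averaging pass, using supabase-py; it assumes the same hypothetical `classificationtype` column and numeric values stored as text in `content`:

```python
# Rough local pass over the averaging idea above. Assumptions (not in the schema
# yet): classifications has a classificationtype column, and the numeric value a
# user suggested is stored as text in content.
from collections import defaultdict
from statistics import mean

from supabase import create_client

supabase = create_client("https://<project>.supabase.co", "<key>")  # placeholders

# 1. Extract the candidate "columns" from each anomaly's configuration field.
anomalies = supabase.table("anomalies").select("id, configuration").execute().data
config_fields = set()
for anomaly in anomalies:
    config_fields.update((anomaly.get("configuration") or {}).keys())

# 2. Pull classifications whose classificationtype is one of those fields.
rows = (
    supabase.table("classifications")
    .select("anomaly, classificationtype, content")
    .in_("classificationtype", sorted(config_fields))
    .execute()
    .data
)

# 3. Group by (anomaly, classificationtype) and average the suggested values.
grouped = defaultdict(list)
for row in rows:
    try:
        grouped[(row["anomaly"], row["classificationtype"])].append(float(row["content"]))
    except (TypeError, ValueError):
        continue  # ignore non-numeric suggestions rather than crashing

for (anomaly_id, field), values in sorted(grouped.items()):
    print(f"anomaly {anomaly_id} | {field}: {mean(values):.2f} from {len(values)} classification(s)")
```

If the local results look sensible, the same grouping could later write into a "userSuggestions" key in `configuration` instead of touching the real columns directly.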
Looks like a bit more debugging is required...
Each cloud datapoint should point to an anomaly of `anomalytype == planet`. Right now we'll just pull in images, randomly assign them, upload them to the relevant bucket, and update the pointer of the cloud point in the `anomalies` table.

Alternatively, we could structure it so that we pull the images into the cloud part of the storage bucket/dir of the setting anomaly, and the classification points to the storage element, so we don't need another field in `anomalies`.

So we can do some exploring. There won't be hugely complex algorithms; we'll maybe add an additional field to the cloud classification process later, like "select a planet you know that looks like this cloud could be from" (in which case we'd need a field in the `anomalies` table).

Executive decision: to achieve consistency, there will be a field in `anomalies`, however the cloud datapoints will have a `parentItem` pointing to a particular setting/planet anomaly row.
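A sketch of the random-assignment route under that decision; the "clouds" bucket name, the local `cloud_images` directory, and the `parentItem` column name are all assumptions rather than the real layout:

```python
# Sketch of the random-assignment route for cloud datapoints. The "clouds"
# bucket, the cloud_images directory, and the parentItem column are assumptions;
# the real names/paths will come from the actual storage layout.
import random
from pathlib import Path

from supabase import create_client

supabase = create_client("https://<project>.supabase.co", "<key>")  # placeholders

# Every cloud should hang off an anomaly of anomalytype == planet.
planets = (
    supabase.table("anomalies")
    .select("id")
    .eq("anomalytype", "planet")
    .execute()
    .data
)

for image_path in Path("cloud_images").glob("*.png"):
    parent_id = random.choice(planets)["id"]

    # Upload the image into the parent planet's directory in the bucket...
    storage_path = f"{parent_id}/clouds/{image_path.name}"
    supabase.storage.from_("clouds").upload(storage_path, image_path.read_bytes())

    # ...and create the cloud anomaly row pointing back at its parent planet.
    supabase.table("anomalies").insert({
        "anomalytype": "cloud",
        "parentItem": parent_id,   # assumed column from the "executive decision" above
        "avatar_url": storage_path,
    }).execute()
```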